Container Performance Doesn’t Need to Suck

Recently the OpenShift team at Red Hat, working with Solarflare Engineering, rolled out new code that a third party, STAC Research, then benchmarked. The results demonstrated networking performance from within a container that was equivalent to that of a bare metal server. We’re talking 1.2 microseconds for 99% of network traffic in a 1/2RT (half round trip), meaning a TCP receive to an application coupled with a TCP send from that application.
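To make the 1/2RT and 99th percentile terms concrete, here is a minimal sketch in Python. It is not the STAC methodology: it approximates a half round trip by timing a TCP ping-pong over loopback and halving each result, and the function names (`half_round_trip_ns`, `percentile`) are made up for illustration. Expect numbers in the tens of microseconds from an interpreter on loopback, nowhere near the figures quoted above.

```python
import math
import socket
import threading
import time


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since TCP may deliver a message in pieces."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(listener: socket.socket, payload_size: int, count: int) -> None:
    """Accept one connection and echo `count` messages back to the sender."""
    conn, _ = listener.accept()
    with conn:
        # Disable Nagle's algorithm so small messages are sent immediately.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            conn.sendall(recv_exact(conn, payload_size))


def half_round_trip_ns(payload_size: int = 256, count: int = 1000) -> list:
    """Return `count` samples of (round-trip time / 2) in nanoseconds.

    Assumption: payload_size >= 1, because an empty TCP send puts nothing
    on the wire and so cannot be echoed.
    """
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks one
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=echo_server, args=(listener, payload_size, count)
    )
    server.start()

    payload = b"x" * payload_size
    samples = []
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter_ns()
            client.sendall(payload)
            recv_exact(client, payload_size)
            samples.append((time.perf_counter_ns() - start) / 2)

    server.join()
    listener.close()
    return samples


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    results = half_round_trip_ns(payload_size=256, count=1000)
    print(f"99th percentile 1/2RT: {percentile(results, 99) / 1000:.1f} us")
```

Reporting the 99th percentile rather than the mean is the point of the exercise: it says that 99% of messages completed at or under that time, so occasional slow outliers are not hidden by an average.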

Network performance like this was considered leading edge in High-Performance Computing (HPC) a little more than a decade ago, when Myricom rolled out Myrinet10G, which debuted at 2.4 microseconds back in 2006. Both networks are 10Gbps, so it’s roughly an apples-to-apples comparison. Today, this level of performance is available to containerized applications using generic network socket calls. Note that the above numbers were for zero-byte packets, a traditional HPC measurement. More realistic testing with 256-byte packets yielded a 99th-percentile 1/2RT time that was still under 1.5 microseconds, which is amazing! Both the bare metal server and the Pod configuration were tuned as far as possible to optimize performance. A graph of the complete results of that testing is shown below.

Anytime we create abstractions to simplify application execution or management, we introduce additional layers of code that can add unwanted delay, known as application latency. By running an application inside a container, then wrapping that container in a Pod, we increase the distance between what we intend to do and what is actually executed. Docker containers are fast becoming all the rage, and orchestrating them with tools like Kubernetes is extremely popular. This OpenShift blog post describes ways to cut through these layers of code for performance while still retaining the primary management benefits.