Container Performance Doesn’t Need to Suck

Recently the OpenShift team at Red Hat, working with Solarflare Engineering, rolled out new code that was benchmarked by a third party, STAC Research, demonstrating networking performance from within a container equivalent to that of a bare metal server. We're talking 1.2 microseconds for 99% of network traffic in a half round trip (1/2RT): a TCP receive into an application coupled with a TCP send from that application.

Network performance like this was considered leading edge in High-Performance Computing (HPC) a little more than a decade ago, when Myricom rolled out Myrinet10G, which debuted at 2.4 microseconds back in 2006. Both networks run at 10Gbps, so it's close to an apples-to-apples comparison. Today, this level of performance is available to containerized applications using generic network socket calls. It should be noted that the numbers above were for zero-byte packets, a traditional HPC measurement. More realistic testing with 256-byte packets yielded a 99th-percentile 1/2RT time that was still under 1.5 microseconds. That's amazing! Both the bare metal server and the Pod configuration were fully tuned to optimize performance. A graph of the complete results of that testing is shown below.
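For readers who want to see what a 1/2RT measurement looks like in practice, here is a minimal C sketch of a TCP ping-pong test that estimates the half round trip as half of a measured full round trip against an echo server. This is purely illustrative and is not the STAC harness: the host address, port, message size, and iteration count are placeholder values, and a real benchmark would pin CPUs, warm up the connection, and time far more rigorously.

```c
/* Minimal TCP ping-pong sketch (illustrative only, not the STAC harness).
 * Estimates 1/2RT as half of a full round trip against an echo server.
 * HOST, PORT, MSG_SIZE, and ITERS are placeholder values. */
#include <arpa/inet.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define HOST     "192.0.2.10"   /* placeholder echo-server address */
#define PORT     9000           /* placeholder port                */
#define MSG_SIZE 256            /* matches the 256-byte test above */
#define ITERS    100000

static double half_rt_us[ITERS];

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port   = htons(PORT) };
    inet_pton(AF_INET, HOST, &addr.sin_addr);

    int fd  = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }

    char buf[MSG_SIZE];
    memset(buf, 'x', sizeof(buf));

    for (int i = 0; i < ITERS; i++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        send(fd, buf, sizeof(buf), 0);            /* TCP send...          */
        ssize_t n;                                /* ...and the echo back */
        for (size_t got = 0; got < sizeof(buf); got += (size_t)n)
            if ((n = recv(fd, buf + got, sizeof(buf) - got, 0)) <= 0)
                return 1;                         /* peer closed or error */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double rtt_us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                        (t1.tv_nsec - t0.tv_nsec) / 1e3;
        half_rt_us[i] = rtt_us / 2.0;             /* 1/2RT estimate       */
    }
    close(fd);

    /* Nearest-rank 99th percentile, as in the results quoted above. */
    qsort(half_rt_us, ITERS, sizeof(double), cmp_double);
    printf("99th percentile 1/2RT: %.2f us\n", half_rt_us[(int)(ITERS * 0.99)]);
    return 0;
}
```

Because the loop uses nothing but generic socket calls, the same binary can run unmodified inside a container, which is precisely what makes the results above so interesting.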

Anytime we create abstractions to simplify application execution or management, we introduce additional layers of code that can add potentially unwanted delay, known as application latency. By running an application inside a container, then wrapping that container in a Pod, we increase the distance between what we intend to do and what is actually being executed. Docker containers are fast becoming all the rage, and methods for orchestrating them with tools like Kubernetes are extremely popular. If you dive into this OpenShift blog post, you'll find ways to cut through these layers of code for performance while still retaining the primary management benefits.

Onload Recovers Performance Lost to Meltdown

The recently announced microprocessor architecture vulnerability known as Meltdown centers on accessing memory that shouldn't be available to the currently running program. Meltdown exploits a condition in which the processor allows an unprivileged application to continually harvest data, unrestricted, from anywhere in system memory. The flaw that enables Meltdown stems from microprocessor performance enhancements that are more than a decade old and are now common in Intel and some ARM processors. The solution to Meltdown is kernel page-table isolation (KPTI), but it doesn't come without a performance impact, one that ranges from 5% to 30% because every application behaves differently. Since Onload places the communications stack in the application's userspace, it dramatically reduces the number of kernel calls for network operations and as such avoids most of the performance impact brought on by KPTI. Red Hat confirms this in a recent article on the topic. This means that applications leveraging Onload on KPTI-patched kernels will see an even greater performance advantage.
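The reason this works with unmodified applications is that Onload interposes on the standard BSD sockets API (typically via an LD_PRELOAD library or its launcher), so the calls an application makes don't change; only who services them does. The sketch below, with a placeholder port, marks where the kernel transition, and the KPTI page-table switch that now comes with it, would normally occur.

```c
/* An ordinary, unmodified UDP receive loop using generic socket calls.
 * On a stock kernel each recv() is a system call, and with KPTI each
 * of those kernel entries also pays a page-table switch. Launched under
 * Onload, the same recv() is serviced by a userspace stack instead,
 * so most of the KPTI overhead is avoided. Port 9000 is a placeholder. */
#include <arpa/inet.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port   = htons(9000) };
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }

    char buf[2048];
    for (;;) {
        /* Kernel entry (and the KPTI page-table switch) happens here on
         * a stock kernel; under Onload this call never leaves userspace. */
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n <= 0)
            break;
        /* ... handle the datagram, reply with an ordinary send() ... */
    }
    close(fd);
    return 0;
}
```

No recompilation is involved: accelerating a program like this is a matter of launching it under Onload rather than changing its code, which is why the recovered performance applies to existing applications.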

By contrast, Spectre tears down the isolation that exists between running applications. It allows a malicious application to trick error-free programs into leaking their secrets. It does this by scanning the process address space of those programs, and the kernel libraries on which they depend, looking for exploitable code. When that vulnerable code is executed, it acts as a covert channel, transmitting its secrets to the malicious application. This vulnerability affects a wider range of processors and requires both kernel and CPU microcode patches, and even then it hasn't been 100% eliminated. More work remains to be done to shut down Spectre.