Stratus and Solarflare for Capital Markets and Exchanges

by David Whitney, Director of Global Financial Services, Stratus

The partnership of Stratus, the global standard for fault-tolerant hardware solutions, and Solarflare, the unchallenged leader in application network acceleration for financial services, at face value seems like an odd one. Stratus ‘always on’ server technology removes all single points of failure, which eliminates the need to write and maintain costly code to ensure high availability and fast failover scenarios.  Stratus and high performance are rarely been used in the same sentence.

Let’s go back further… Throughout the 1980’s and 90’s Stratus, and their proprietary VOSS operating system, globally dominated financial services from exchanges to investment banks. In those days, the priority for trading infrastructures was uptime which was provided by resilient hardware and software architectures. With the advent of electronic trading, the needs of today’s capital markets have shifted. High-Frequency Trading (HFT) has resulted in an explosion in transactional volumes. Driven by the requirements of one of the largest stock exchanges in the world, they realized that critical applications need to not only be highly available but also extremely focused on performance (low latency) and deterministic (zero jitter) behavior.

Stratus provides a solution that guarantees availability in mission-critical trading systems, without the costly overhead associated with today’s software-based High Availability (HA) solutions as well as the need for multiple physical servers. You could conceivably cut your server footprint in half by using a single Stratus server where before you’d need at least two physical servers. Stratus is also a “drop and go” solution. No custom code needs to be written, there is no concept of Stratus FT built customer applications. This isn’t just for Linux environments, Stratus also has hardened OS solutions for Windows and VMWare as well.

Solarflare brings low latency networking to the relationship with their custom Ethernet controller ASIC and Onload Linux Operating System Bypass communications stack. Normally network traffic arrives at the server’s network interface card (NIC) and is passed to the Operating System through the host CPU. This process involves copying the network data several times and switching the CPU’s context from kernel to user mode one or more times. All of these events take both time and CPU cycles. With over a decade of R&D, Solarflare has considerably shortened this path. Under Solarflare’s control applications often receive data in about 20% of the time it would typically take. The savings are measured in microseconds (millionths of a second), typically several or more. In trading speed often speed matters most, so a dollar value can be placed on this savings. Back in 2010, one trader valued the savings at $50,000/micro-second for each day of trading.

Both Stratus and Solarflare have worked together to dramatically reduce jitter to nearly zero. Jitter is caused by those seemingly inevitable events that distract a CPU core from its primary task of electronic trading. For example, the temperature of thermal sensor somewhere in the system may exceed a predetermined level and it raises a system interrupt. A CPU core is then assigned to handle that interrupt and determine which fan needs to be turned on or sped up. While this event, known as “Jitter”, sounds trivial the distraction to processes this interrupt and return to trading often results in a delay measured in the 100’s of microseconds. Imagine you’re trading strategy normally executes in 10s of microseconds, network latency adds 1-2 microseconds, and then all the sudden the system pauses your trading algorithm for 250 microseconds while it does some system housekeeping. By the time control is returned to your algorithm it’s possible that the value of what you’re trading has changed. Both Stratus and Solarflare have worked exceedingly hard to remove Jitter from the FT platform.

Going forward, Solarflare and Stratus will be adding Precision Time Protocol support to a new version of Onload for the Stratus FT Platform.

Leave a Reply