Saturday, November 16, 2024

Clockwork’s Cloud Deluxe platform eliminates packet drops, improves cloud network performance


Clockwork today announced a new service that builds on its clock synchronization service to eliminate packet drops and help businesses improve their network performance.

A year ago, the company created a splash when it announced its clock synchronization service that helps businesses keep their server fleets in sync. Keeping clocks in sync with up to 5-nanosecond accuracy (for hardware-based timestamps) is quite an achievement, but the idea here was always to go up the stack and build tools on top of this fundamental technology. The first tool, Latency Sensei, provides users with fine-grained data about latency in their networks. Now, Clockwork is bundling this tool with other features and a new ‘sense-and-control’ dashboard for managing them all, aiming to help businesses reduce network latency and jitter and virtually eliminate packet drops between their machines, regardless of location or computing environment.

Traditionally, to reduce packet drops (and those drops and their retransmissions are a fundamental feature of TCP that makes the internet work), network switches use buffers. But as Clockwork co-founder and Stanford computer science professor Balaji Prabhakar noted during an interview at KubeCon Europe earlier this month, this comes with a lot of overhead.

One-way measurements, the Clockwork team argues, are a far more accurate indicator of congestion than packet drops. In the company’s demo, simply turning on the congestion control feature, which the company calls Packet Rocket, drops packet loss to almost zero while reducing latency and increasing bandwidth utilization. That almost sounds too good to be true, but Clockwork can back these claims up, and the company already has a number of enterprise customers that have successfully tested the overall Cloud Deluxe platform.
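To make the idea of reacting to delay rather than to drops concrete, here is a minimal sketch of a generic delay-based rate controller. It is not Clockwork's Packet Rocket algorithm, whose details the company has not described here; the function name, the queuing-delay budget, and the increase and decrease parameters are all assumptions chosen for illustration.

```python
def next_rate(
    rate_mbps: float,
    one_way_delay_us: float,
    base_delay_us: float,
    queue_budget_us: float = 50.0,   # assumed tolerance for queuing delay
    increase_mbps: float = 10.0,     # assumed additive increase step
    decrease_factor: float = 0.8,    # assumed multiplicative decrease
    max_rate_mbps: float = 10_000.0,
) -> float:
    """Pick the next sending rate from a one-way delay sample.

    The queuing delay is the measured one-way delay minus the delay seen
    on an uncongested path. If it exceeds the budget, the sender backs off
    before any buffer overflows; otherwise it probes for more bandwidth.
    """
    queuing_delay_us = one_way_delay_us - base_delay_us
    if queuing_delay_us > queue_budget_us:
        return rate_mbps * decrease_factor
    return min(rate_mbps + increase_mbps, max_rate_mbps)


# Hypothetical samples: the uncongested one-way delay is 100 microseconds.
uncongested = next_rate(1000.0, one_way_delay_us=120.0, base_delay_us=100.0)
congested = next_rate(1000.0, one_way_delay_us=200.0, base_delay_us=100.0)
```

In this sketch the sender slows down as soon as queues start to build, which is the general reason delay-based schemes can avoid drops that a loss-based scheme would have to wait for.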

“If we have accurate clocks in networks, the first thing we do is measure congestion very accurately — one way, not round-trip time divided by two. Second, if you can do the one-way measurement fast and accurately, then you can actually control congestion in a way that you couldn’t do before,” Prabhakar explained. “Because most of the problem with network congestion, if you don’t want to go towards packet drops — that’s the nuclear option — if you don’t want to go anywhere near that, people always say: here’s a buffer.” Figuring out how to measure congestion accurately is difficult, however, and even with a large buffer, it takes a number of packet drops (and the overhead associated with those) before the system reaches the necessary threshold to kick into action.
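A small sketch shows why one-way measurement differs from "round-trip time divided by two". It assumes both hosts have synchronized clocks and that each probe records four timestamps; the class and field names are hypothetical and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class ProbeTimestamps:
    """Timestamps in microseconds for one request/reply probe.

    Assumes the clocks on hosts A and B are already synchronized.
    """
    sent_by_a: float      # A sends the request
    received_by_b: float  # B receives the request
    sent_by_b: float      # B sends the reply
    received_by_a: float  # A receives the reply


def one_way_delays(p: ProbeTimestamps) -> tuple[float, float]:
    """Return the (forward, reverse) one-way delays."""
    forward = p.received_by_b - p.sent_by_a
    reverse = p.received_by_a - p.sent_by_b
    return forward, reverse


def half_rtt(p: ProbeTimestamps) -> float:
    """Estimate delay as half the round trip, excluding B's turnaround time."""
    rtt = (p.received_by_a - p.sent_by_a) - (p.sent_by_b - p.received_by_b)
    return rtt / 2


# Hypothetical probe: the forward path is congested, the reverse path is not.
probe = ProbeTimestamps(sent_by_a=0.0, received_by_b=180.0,
                        sent_by_b=190.0, received_by_a=210.0)
forward, reverse = one_way_delays(probe)
estimate = half_rtt(probe)
print(f"forward={forward} us, reverse={reverse} us, RTT/2={estimate} us")
```

With these made-up numbers the forward path takes 180 microseconds and the reverse path 20, while RTT/2 reports 100 for both, hiding which direction is congested.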


With this core technology in place, Clockwork can then also easily allocate bandwidth to a given virtual machine and/or prioritize traffic based on the needs of a given application. With the latency data in place, the company can also discover which machines are likely co-located on the same physical host, allowing businesses to move workloads to avoid the noisy siblings problem.
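One plausible way to infer co-location from latency data is to group machines whose pairwise latency falls below a threshold. The sketch below is an assumption about how such a grouping could work, not a description of Clockwork's method; the function name, the VM names, the latency values, and the 20-microsecond threshold are all hypothetical.

```python
def group_by_low_latency(
    latency_us: dict[tuple[str, str], float],
    threshold_us: float,
) -> list[set[str]]:
    """Group VMs connected by pairwise latencies below threshold_us.

    Uses a simple union-find: any two VMs with a sub-threshold latency
    end up in the same group, as do VMs linked through a chain of such pairs.
    """
    parent: dict[str, str] = {}

    def find(vm: str) -> str:
        parent.setdefault(vm, vm)
        while parent[vm] != vm:
            parent[vm] = parent[parent[vm]]  # path halving
            vm = parent[vm]
        return vm

    for (a, b), latency in latency_us.items():
        root_a, root_b = find(a), find(b)
        if latency < threshold_us and root_a != root_b:
            parent[root_a] = root_b

    groups: dict[str, set[str]] = {}
    for vm in list(parent):
        groups.setdefault(find(vm), set()).add(vm)
    return sorted(groups.values(), key=lambda group: sorted(group))


# Hypothetical one-way latencies in microseconds between four VMs.
measurements = {
    ("vm-a", "vm-b"): 8.0,
    ("vm-a", "vm-c"): 95.0,
    ("vm-b", "vm-c"): 102.0,
    ("vm-c", "vm-d"): 6.5,
}
likely_colocated = group_by_low_latency(measurements, threshold_us=20.0)
```

Here vm-a and vm-b land in one group and vm-c and vm-d in another, which would be a hint (not proof) that each pair shares a physical host.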

“Clockwork Systems is helping us gain better visibility into our complex multi-cloud environment,” said Albert Greenberg, Vice President of Platform Engineering at Uber. “Clockwork’s breakthrough technology can pinpoint congestion bottlenecks with accurate latency measurements — and fix the problem by killing packet delays and eliminating packet drops. We’re impressed by the trials so far, and we’re exploring the potential for Clockwork’s Cloud Deluxe software to help us build high-performance network infrastructure on top of generic cloud environments.”

Soon, Prabhakar told me, the company will also be able to enable better snapshotting of the network state for backups and disaster recovery. Traditionally, to get an accurate snapshot of the network state, you would have to pause the application, wait for the packets to reach their destination and then create the snapshot. But with more accurate clocks, it’s possible to simply say: all the VMs take a break for a few nanoseconds, wait a few milliseconds for everything that’s still in flight to land, take the snapshot, and resume.

All of this is now powered by Clockwork’s UniChron API, which allows users to set dynamic bounds on clock accuracy and is controlled through the company’s new interactive control panel. The company also offers programmatic access to all features through APIs.

 
