How to Scale IoT Past Distributed Data Silos with the Zenoh Protocol
September 28, 2022
To reduce latency, network utilization, and cost, many IoT deployments now store and analyze data at or near the edge node. But “distributed” can be a bad thing when it comes to data, particularly if it means information gets trapped in silos across a network.
So what happens when you inevitably need it?
Let’s start at the data source. For data in motion, technologies built around publish-subscribe tenets were designed to deal with this type of environment. In a publish-subscribe network like MQTT or DDS, data related to a given topic is broadcast by a publisher across the network, and nodes on the network subscribe to that topic for updates. This promotes decentralized data networking that maps nicely to the evolution of IoT networks, as well as to the broader network infrastructure, considering that 5G networks deploy 1.4 to 2x as many base stations as 4G to support the increase in edge workloads.
At their best, protocols like MQTT and DDS run over TCP or UDP in homogeneous environments with little-to-no packet loss and a high degree of endpoint fanout. This allows them to transmit messages from node to node at high speed with minimal overhead. But as tools for data in motion, they don’t provide a built-in, location-aware data retrieval mechanism; they were designed to push one message and move on to the next.
For data at rest, technologies like named-data networking (NDN) provide similar data centricity by allowing packets to be labeled with something other than just their destination address. Packets, which can be named anything, are cached in location-aware content stores that give users the opportunity to access them post-transmission by querying the designated label. However, NDN was designed as an Internet technology, which doesn’t fit well with the latency- and resource-constrained environments of many end applications.
This means IoT developers must support multiple connectivity stacks to distribute and retrieve data in a performance-, resource-, and latency-sensitive manner.
Unifying Data in Motion and At Rest From Edge to Cloud
Since the inception of IoT, the goal has been to unify data distribution and retrieval architectures under a single enterprise-to-edge paradigm rather than patch together heterogeneous platforms and technology stacks. ZettaScale Technology was founded earlier this year to bridge the gap, in part through a technology called Zenoh.
Zenoh is a protocol that addresses data in transit, in use, and at rest by blending publish-subscribe architectures with geographically distributed storage. It can run over common IP transports as well as Zigbee, Thread, and pretty much any other edge data link, in peer-to-peer, routed, or mesh topologies that mirror heterogeneous edge-to-cloud IoT networks. It is currently an open-source project hosted by the Eclipse Foundation.
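To give a sense of how that flexibility is exposed, here is a minimal sketch of opening a Zenoh session in peer-to-peer mode over a TCP endpoint, using the eclipse-zenoh Python bindings. The endpoint address is invented for illustration, and configuration keys and call signatures have shifted between releases, so treat this as a sketch rather than canonical API usage.
import zenoh
# Illustrative configuration only: peer-to-peer mode over a made-up TCP
# endpoint. Keys follow Zenoh's JSON5 configuration format; exact names and
# binding APIs vary between releases.
conf = zenoh.Config()
conf.insert_json5("mode", '"peer"')
conf.insert_json5("connect/endpoints", '["tcp/192.168.1.10:7447"]')
session = zenoh.open(conf)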
Here's how it works. Zenoh broadcasts data to subscribers using a “key expression,” which is essentially a string of resource identifiers. For example, the key expression that identifies temperature sensors in the Louvre in Paris would specify the floor, room number, asset, and asset type. Targeting a specific asset, say a temperature sensor in room 42 on the second floor of the museum, would be done with an expression like:
Louvre/2/42/sensor/temp
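In code, publishing to that key and subscribing to everything on the second floor might look roughly like the snippet below, again using the eclipse-zenoh Python bindings. The payload value is invented for illustration, and method signatures differ slightly between releases.
import zenoh
session = zenoh.open(zenoh.Config())
# Subscribers use key expressions too; '**' matches any suffix, so this
# callback fires for every sensor on the second floor.
sub = session.declare_subscriber('Louvre/2/**',
                                 lambda sample: print(sample.key_expr, sample.payload))
# Publish a (made-up) temperature reading for room 42 on the second floor.
pub = session.declare_publisher('Louvre/2/42/sensor/temp')
pub.put('21.7')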
Unlike normal packets, this string is something a developer can understand and potentially query from a database. This leads us to the third Zenoh abstraction besides publishers and subscribers: queryables.
Queryables serve the values associated with a given key expression, so the protocol can deposit any published data related to that expression into a data store and hand it back on request. Correspondingly, this allows the network to be queried for data related to those key expressions, and Zenoh provides a Storage Manager and other plugins for integrating filesystems, databases, and the like so queries can be run on historical data as well.
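As a rough, hypothetical sketch in the Python bindings, a queryable is just another declaration on the session: a callback registered against a key expression that replies with whatever values it holds when a query arrives. The in-memory dictionary below stands in for a real storage back end, and the exact reply signature varies between releases.
import zenoh
session = zenoh.open(zenoh.Config())
# Stand-in for a real storage back end: the latest value seen per key.
latest = {'Louvre/2/42/sensor/temp': '21.7'}
def on_query(query):
    # Reply with each stored value; a real queryable would filter on the
    # query's selector.
    for key, value in latest.items():
        query.reply(key, value)  # exact reply signature varies by release
queryable = session.declare_queryable('Louvre/2/**', on_query)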
Zenoh supports push, pull, and get commands for use with its simple yet powerful semantics. Returning to our museum example, all a developer needs to do to retrieve the temperature readings for every room on the second floor of the Louvre is issue a get command with the expression:
Louvre/2/*/sensor/temp
Rust, Python, and C APIs are available to simplify app integration.
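With the Python bindings, for example, that retrieval might look roughly like the following. How replies are collected differs between releases (older bindings require passing an explicit handler), so this too is a sketch rather than canonical usage.
import zenoh
session = zenoh.open(zenoh.Config())
# The '*' wildcard matches any room number, so a single get() fans out to
# every matching storage or queryable and collects their replies.
for reply in session.get('Louvre/2/*/sensor/temp'):
    print(reply)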
Because Zenoh blends publish-subscribe with location-aware storage, results are always retrieved from the nearest data store or compute node containing the requested information. The protocol also includes a data caching feature that allows sleeping nodes to pull whatever data they need from the nearest infrastructure node when required and then return to sleep.
The Cost of Data Scalability
But features almost always come at a cost, and when you’re adding enterprise-class querying capabilities to the edge, that cost usually comes in the form of performance, resources, or both. So how does Zenoh stack up against the pub-sub alternatives?
The protocol incurs a wire overhead of just 4 to 6 bytes, making it microcontroller-compatible, while being able to transmit up to 4 million messages per second. Compared with MQTT and DDS, Zenoh has a 75 and 64 percent smaller wire overhead, respectively. According to ZettaScale, it also exhibits 40x the throughput of MQTT and 10x that of XRCE-DDS. Benchmark transport latency with the new protocol is just 15 µs.
These performance metrics have caught the attention of autonomous vehicle developers at the Indy Autonomous Challenge and TTTech Auto, the latter of whom is working with ZettaScale on an ISO 26262-compliant version of the Zenoh protocol.
Zenoh truly has been designed from the ground up to scale vertically or horizontally, across multiple subnets, from edge to cloud.
If you want to see just how far you can scale with Zenoh, check out the comprehensive Getting Started guide or see what community members are doing with the protocol on GitHub.