Where Edge and Endpoint AI Meet the Cloud
September 22, 2021
The COVID-19 pandemic created new health and safety requirements that transformed how people interact with each other and with their immediate environments. The skyrocketing demand for touch-free experiences has in turn accelerated the move toward AI-powered systems, voice-based control, and other contactless user interfaces, pushing intelligence closer and closer to the endpoint.
One of the most important trends in the electronics industry today is the incorporation of AI into embedded devices: AI that interprets sensor data such as images, and machine learning that enables alternative user interfaces such as voice.
Embedded Artificial Intelligence of Things (AIoT) is the key to unlocking the seamless, hands-free experiences that will help keep users safe in a post-COVID environment. Consider the possibilities: smart shopping carts that let you scan goods as you drop them in and use mobile payments to bypass the checkout counter, or intelligent video conferencing systems that automatically recognize speakers and switch focus between them during meetings, giving remote teams a more 'in-person' experience.
Why is now the time for an embedded AIoT breakthrough?
AIoT is Moving Out
Initially, AI sat in the cloud, where it took advantage of levels of computational power, memory, and storage scalability that the edge and endpoint simply could not match. Increasingly, however, machine learning workloads are moving out toward the edge of the network, and the emphasis is shifting from deep learning training to deep learning inference.
Where "training" typically stays in the network core, "inference" now lives at the endpoint, where developers can access AI analytics in real time and optimize device performance on the spot rather than cycling through the device-to-cloud-to-device loop.
Today, most of the inference process runs on the CPU. However, this is shifting toward chip architectures that integrate more AI acceleration on chip. Efficient AI inference demands efficient endpoints that can infer, pre-process, and filter data in real time. Embedding AI at the chip level, integrating neural processing and hardware accelerators, and pairing embedded-AI chips with special-purpose processors designed specifically for deep learning offer developers a trifecta of the performance, bandwidth, and real-time responsiveness needed for next-generation connected systems.
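To make the endpoint role concrete, the following is a minimal sketch of what "infer, pre-process, and filter in real time" can look like on such a device. It assumes a hypothetical camera driver, accelerator runtime, and cloud messaging call (read_camera_frame, npu_run_inference, send_event_to_cloud); the names are illustrative placeholders, not any specific vendor's API.

/* Hypothetical endpoint inference loop: pre-process locally, run the model on
 * an on-chip accelerator, and transmit only meaningful events instead of raw
 * data. All functions below are illustrative placeholders, not a real SDK. */
#include <stdbool.h>
#include <stdint.h>

#define FRAME_PIXELS   (96 * 96)      /* low-resolution grayscale frame   */
#define NUM_CLASSES    4              /* classes the model distinguishes  */
#define CONFIDENCE_MIN 0.80f          /* only report confident detections */

/* Placeholder drivers / runtime calls (assumed, not a specific product API). */
bool read_camera_frame(uint8_t *buf, int len);
void normalize_frame(const uint8_t *in, float *out, int len);
int  npu_run_inference(const float *input, float *scores, int num_classes);
void send_event_to_cloud(int class_id, float score);

void endpoint_loop(void)
{
    static uint8_t raw[FRAME_PIXELS];
    static float   input[FRAME_PIXELS];
    float          scores[NUM_CLASSES];

    for (;;) {
        if (!read_camera_frame(raw, FRAME_PIXELS))
            continue;                                /* no new data yet */

        normalize_frame(raw, input, FRAME_PIXELS);   /* pre-process on device */

        int best = npu_run_inference(input, scores, NUM_CLASSES);

        /* Filter at the endpoint: only confident detections leave the device,
         * avoiding a device-to-cloud-to-device round trip for every frame. */
        if (best >= 0 && scores[best] >= CONFIDENCE_MIN)
            send_event_to_cloud(best, scores[best]);
    }
}

The design point of the sketch is that the raw sensor stream never leaves the device; only compact, already-inferred events do, which is what makes real-time responsiveness and bandwidth efficiency possible at the endpoint.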
Figure 1 (Source: Renesas Electronics)
An AIoT Future: At Home and in the Workplace
In addition, a convergence of advancements in AI accelerators, adaptive and predictive control, and hardware and software for voice and vision is opening up new user interface capabilities for a wide range of smart devices.
For example, voice activation is quickly becoming the preferred user interface for always-on connected systems in both industrial and consumer markets. We have already seen the accessibility advantages that voice-controlled systems offer users with visual or other physical disabilities, who use spoken commands to initiate and complete tasks. With the rising demand for touchless control as a health and safety countermeasure in shared spaces like kitchens, workspaces, and factory floors, voice recognition, combined with a variety of wireless connectivity options, will bring seamless, non-contact experiences into the home and the workplace.
Multimodal architectures offer another path for AIoT. Using multiple input streams improves both safety and ease of use for AI-based systems. A voice + vision combination, for example, is particularly well suited to hands-free AI-based vision systems: voice recognition activates object and facial recognition for critical vision-based tasks in applications like smart surveillance or hands-free video conferencing, and vision AI then takes over to track operator behavior, control operations, or handle error and risk detection.
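As a rough illustration of that voice + vision hand-off, here is a minimal sketch in which a recognized spoken command opens a short activity window for the vision pipeline. The functions (voice_command_detected, vision_pipeline_enable, vision_detect_person, millis) are hypothetical placeholders for the audio front end, camera and model power control, and person detection, not a specific product API.

/* Hypothetical voice + vision gating: a recognized spoken command activates
 * the vision pipeline for a short window, so the camera path stays idle (and
 * power-friendly) until voice asks for it. All functions are placeholders. */
#include <stdbool.h>
#include <stdint.h>

#define VISION_WINDOW_MS 5000u  /* keep vision active 5 s after a voice command */

bool     voice_command_detected(void);        /* e.g. "start tracking"          */
void     vision_pipeline_enable(bool on);     /* power camera + model up/down   */
bool     vision_detect_person(int *track_id); /* runs only while enabled        */
uint32_t millis(void);                        /* monotonic millisecond counter  */

void multimodal_loop(void)
{
    bool     vision_on = false;
    uint32_t vision_deadline = 0;

    for (;;) {
        /* The voice front end runs continuously; it is the low-power trigger. */
        if (voice_command_detected()) {
            if (!vision_on) {
                vision_pipeline_enable(true);
                vision_on = true;
            }
            vision_deadline = millis() + VISION_WINDOW_MS;
        }

        /* Vision AI takes over only while the voice-activated window is open. */
        if (vision_on) {
            int track_id;
            if (vision_detect_person(&track_id)) {
                /* e.g. re-frame a video conference on the active speaker */
            }
            if ((int32_t)(millis() - vision_deadline) > 0) {
                vision_pipeline_enable(false);
                vision_on = false;
            }
        }
    }
}

Gating the heavier vision model behind the always-on voice front end is one common way to get both the hands-free experience and the power budget an endpoint can actually sustain.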
On factory and warehouse floors, multimodal AI powers collaborative robots, or CoBots, as part of the grouping of technologies that serves as the five senses allowing CoBots to safely perform tasks side by side with their human counterparts. Voice + gesture recognition lets the two groups communicate in their shared workspace.
What’s on the Horizon?
According to IDC Research, there will be 55 billion connected devices worldwide generating 73 zettabytes of data by 2025. Meanwhile, edge AI chips are set to outpace cloud AI chips as deep learning inference continues to relocate to the edge and to device endpoints. This integrated AI will be the foundation that powers a complex combination of "sense" technologies, creating smart applications with more natural, "human-like" communication and interaction.
Dr. Sailesh Chittipeddi is the Executive Vice President and General Manager of the IoT and Infrastructure Business Unit at Renesas.