High-Performance Toolkit for Deep Learning Inference
April 04, 2022
Blog
In recent years, deep neural networks have made significant advances in bringing the accuracy of computer vision algorithms to the next level. The OpenVINO toolkit is one tool that enables the optimization of DNN models while delivering high inference performance.
Intel recently released the latest version (2022.1) of its OpenVINO toolkit, which simplifies deployment for developers anywhere. OpenVINO, short for “Open Visual Inference and Neural Network Optimization”, is a cross-platform deep learning toolkit that offers support for additional deep-learning models, device portability, and higher inference performance with fewer code changes. It focuses on enhancing deep neural network inference with a write-once, deploy-anywhere approach, thus streamlining the application development life cycle.
The toolkit comes in two versions: the open-source OpenVINO toolkit and the Intel Distribution of OpenVINO toolkit. The OpenVINO toolkit is mainly used for developing quick solutions to a variety of problems such as emulating human vision, speech recognition, natural language processing, and recommendation systems. It gives developers an easier path to building AI inference into their applications and to adopting and maintaining their code. OpenVINO is built on the latest generations of artificial neural networks (ANNs), such as convolutional neural networks (CNNs), along with recurrent and attention-based networks.
On Intel hardware, OpenVINO accelerates both computer vision and non-computer vision workloads. Through its range of features, it ensures maximum performance and speeds up application development. It offers pre-trained, optimized models from its own Open Model Zoo. OpenVINO provides a Model Optimizer that converts the model you supply and prepares it for inference, while the inference engine lets you tune for performance by compiling the optimized network and managing inference operations on specific devices. And since the toolkit is compatible with most frameworks, developers get maximum performance with minimal disruption.
Image Credit: Viso.ai
The above figure shows an application of the OpenVINO toolkit that uses computer vision for intrusion detection. Intel’s Distribution of the OpenVINO toolkit is built to promote and simplify the development, creation, and deployment of high-performance computer vision and deep learning inference applications across widely used Intel platforms. The applications of OpenVINO range from automation and security to agriculture, healthcare, and many more.
Features of Version 2022.1
This version provides bug fixes and capability changes over the previous version, 2021.3.
Updated, Cleaner API
This new version makes the developer's code easier to maintain. The API aligns more closely with TensorFlow conventions, minimizing the need for conversion code. The number of Model Optimizer API parameters has been reduced to minimize complexity, and model conversion performance for Open Neural Network Exchange (ONNX*) models has been significantly improved.
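For illustration, here is a minimal sketch of what the cleaner 2022.1 Python API looks like, assuming you already have an IR model saved as model.xml alongside its model.bin weights (the path is a placeholder):

```python
# Minimal sketch of the 2022.1 Python API (openvino.runtime).
# "model.xml" is a placeholder for an IR file produced by the Model Optimizer.
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")               # loads model.xml + model.bin
compiled_model = core.compile_model(model, "CPU")  # ready for inference on CPU
```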
Broader Model Support
Users can deploy applications with ease across an expanded range of deep-learning models, including natural language processing (NLP), double precision, and computer vision models. With a focus on NLP, and a new category of anomaly-detection models, the pre-trained models can be used for industrial inspection, noise reduction, question answering, translation, and text-to-speech.
Portability and Performance
This version promises a performance boost with automatic device discovery, load balancing, and dynamic inferencing parallelism across CPU, GPU, and more.
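As a hedged example of what automatic device discovery looks like in practice, the 2022.1 runtime exposes an "AUTO" virtual device that picks a suitable target at load time (the model path below is a placeholder):

```python
# Sketch: let OpenVINO discover and pick an available device automatically.
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")  # placeholder IR path

# "AUTO" delegates device selection to the runtime (CPU, GPU, ...).
compiled_model = core.compile_model(model, "AUTO")
```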
OpenVINO Toolkit Add-ons
- Computer Vision Annotation Tool
- Dataset Management Framework
- Deep Learning Streamer
- Neural Network Compression Framework
- OpenVINO Model Server
- OpenVINO Security Add-On
- Training Extensions
How OpenVINO Works
Image Credit: Viso.ai
The OpenVINO toolkit consists of a variety of development and deployment tools, including a fully configured set of pre-trained models and hardware for evaluation. The following steps describe how OpenVINO works:
Prerequisites: Setting up OpenVINO
Before getting started with the actual workflow, make sure you select your host, target platform, and models. This tool supports operating systems such as Linux, Windows, macOS, and Raspbian. As for deep learning model training frameworks, it supports TensorFlow, Caffe, MXNet, Kaldi, as well as the Open Neural Network Exchange (ONNX) model format.
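As a quick sanity check after installation (typically via the openvino-dev package on PyPI), the snippet below, a sketch assuming the Python runtime is available, lists the inference devices OpenVINO can discover on the host:

```python
# Sketch: verify the installation and see which devices are available.
from openvino.runtime import Core

core = Core()
print(core.available_devices)  # e.g. ['CPU', 'GPU']
```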
Step 1: Training your Model
The first step is to prepare and train a deep learning model. You can either find a pre-trained model from the Open Model Zoo or build your own model. OpenVINO provides validated support for public models and offers a collection of code samples and demos within the repository. You can use scripts to configure the Model Optimizer for the framework that is used to train the model.
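For example, a pre-trained model can be fetched from the Open Model Zoo with the omz_downloader tool that ships with the openvino-dev package; the sketch below drives it from Python, and the model name is only an illustrative choice:

```python
# Sketch: download a pre-trained Open Model Zoo model with omz_downloader.
# "face-detection-0200" is an illustrative model name; any Zoo model works.
import subprocess

subprocess.run(
    ["omz_downloader", "--name", "face-detection-0200", "--output_dir", "models"],
    check=True,
)
```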
Step 2: Converting and Optimizing your Model
After the model is configured, you can run the Model Optimizer to convert it to an intermediate representation (IR), which is stored as a pair of files (.xml and .bin). Alongside these files, the Model Optimizer also outputs diagnostic messages to aid in further tuning.
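As a hedged illustration, conversion is typically driven by the mo command-line entry point installed with openvino-dev; the sketch below invokes it from Python on a placeholder ONNX file:

```python
# Sketch: convert an ONNX model to OpenVINO IR (model.xml + model.bin).
# "model.onnx" is a placeholder path; the IR files are written to "ir/".
import subprocess

subprocess.run(
    ["mo", "--input_model", "model.onnx", "--output_dir", "ir"],
    check=True,
)
```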
Step 3: Tuning for Performance
In this step, the Inference Engine is used to compile the optimized model. The Inference Engine is a high-level (C, C++, or Python*) inference API implemented as dynamically loaded plug-ins for each hardware type. It delivers optimal performance on each device without the need to maintain multiple code paths.
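One common way to tune is through a performance hint passed when compiling the model; in the sketch below the IR path and the throughput hint are illustrative assumptions:

```python
# Sketch: compile the optimized model with a performance hint.
from openvino.runtime import Core

core = Core()
model = core.read_model("ir/model.xml")  # placeholder IR path

# "THROUGHPUT" favours batched, parallel inference; "LATENCY" is the
# alternative hint for latency-critical applications.
compiled_model = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "THROUGHPUT"})
```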
Step 4: Deploying Applications
The Inference Engine is used to deploy the applications. Using the deployment manager, you can create a deployment package by assembling the model, IR files, your application, and associated dependencies into a runtime package for your target device.
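To round things off, here is a sketch of the inference loop a deployed application might run; the model path and the 1x3x224x224 input shape are assumptions made for illustration:

```python
# Sketch: minimal inference loop for a deployed application.
import numpy as np
from openvino.runtime import Core

core = Core()
compiled_model = core.compile_model(core.read_model("ir/model.xml"), "CPU")

# Dummy input standing in for a preprocessed image (NCHW layout assumed).
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)

infer_request = compiled_model.create_infer_request()
infer_request.infer({0: input_data})                   # single synchronous inference
print(infer_request.get_output_tensor(0).data.shape)  # inspect the raw output
```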
To sum it up, this new version of the OpenVINO toolkit provides numerous benefits: it not only streamlines application deployment for users but also improves performance. It gives users the power to develop applications with easy deployment, support for more deep-learning models, more device portability, and higher inference performance with fewer code changes.