Real World Application of Machine Learning in Networking
November 22, 2021
Rapidly rising demand for Internet connectivity has put a strain on network infrastructure, performance, and other critical parameters. Network administrators have to deal with different types of networks, each running multiple network applications.
Each network application has its own set of features and performance parameters that may change dynamically. Because of the diversity and complexity of networks, building conventional algorithms or hard-coded techniques for such network scenarios is a challenging task.
Machine learning has proven beneficial in almost every industry, including networking. It can help solve long-standing, intractable networking problems and enable new network applications that make networking more convenient. Let's discuss the basic workflow in detail, with a few use cases, to better understand how machine learning is applied in the networking domain.
Intelligent network traffic management:
With the growing demand for Internet of Things (IoT) solutions, modern networks generate massive and heterogeneous traffic data. For such dynamic networks, traditional techniques for network traffic monitoring and data analytics, like ping monitoring, log file monitoring, or even SNMP, are not enough. They usually lack accuracy and cannot process real-time data effectively. In addition, traffic from other sources like cellular or mobile devices shows comparatively more complex behavior due to device mobility and network heterogeneity.
Machine learning facilitates analytics in big data systems as well as large-area networks by recognizing complex patterns, which helps when managing such networks. Given these opportunities, researchers in the field of networking use deep learning models for network traffic monitoring and analysis applications such as traffic classification and prediction, congestion control, etc.
1. In-band Network Telemetry
Network telemetry data provides basic metrics about network performance. This information is usually quite difficult to interpret. Yet, considering its size and the total amount of data going through the network, it holds tremendous value. If used smartly, it can drastically improve performance.
Emerging technologies like In-band Network Telemetry (INT) can help collect detailed network telemetry data in real time. On top of that, running machine learning on such datasets can help correlate phenomena across latency, paths, switches, routers, events, etc., which were difficult to pick out from the enormous amounts of real-time data using traditional methods.
Machine learning models are trained to understand correlations and patterns in the telemetry data, and they eventually gain the ability to predict future behavior from what they learned on historical data. This helps in anticipating and managing future network outages.
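As a rough illustration of learning from historical telemetry, the sketch below fits a least-squares linear trend to past latency readings and extrapolates it a few steps ahead. This is a deliberately simple stand-in for a trained model, not a production approach; the sample values, the function names, and the `LATENCY_ALERT_MS` threshold are all assumptions made for the example.

```python
from statistics import mean


def fit_linear_trend(samples):
    """Least-squares fit of y = slope * t + intercept for t = 0..n-1."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = mean(samples)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(samples))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


def forecast(samples, steps_ahead):
    """Extrapolate the fitted trend steps_ahead intervals past the last sample."""
    slope, intercept = fit_linear_trend(samples)
    t = len(samples) - 1 + steps_ahead
    return slope * t + intercept


# Hypothetical latency readings (ms) for one path, one per collection interval.
latency_ms = [10.0, 12.0, 14.0, 16.0, 18.0]
LATENCY_ALERT_MS = 20.0  # assumed alerting threshold

predicted = forecast(latency_ms, steps_ahead=3)
will_breach = predicted > LATENCY_ALERT_MS
print(f"predicted latency: {predicted:.1f} ms, breach expected: {will_breach}")
```

A real deployment would likely use richer features (path, switch, queue depth) and a model that can capture non-linear patterns, but the workflow is the same: fit on history, predict ahead, act before the outage.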
2. Resource Allocation and Congestion Control
Every network infrastructure has a predefined total throughput available. It is further split into multiple lanes of different predefined bandwidths. In such scenarios, where the bandwidth available to each end user is statically predefined, bottlenecks can arise in the parts of the network that are heavily used.
To avoid such congestion, supervised machine learning models can be trained to analyze network traffic in real time and infer a suitable bandwidth limit per user, so that the network experiences as few bottlenecks as possible.
Such models can learn from network statistics such as total active users per network node, historical network usage data for each user, time-based patterns of data usage, movement of users across multiple access points, and so on.
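The idea of inferring per-user limits from usage history can be sketched as follows. Here an exponentially weighted moving average stands in for the trained supervised model, and the link capacity is split in proportion to each user's predicted demand after a guaranteed floor. The user names, the numbers, and the `floor_mbps` parameter are hypothetical.

```python
def predict_demand(history, alpha=0.5):
    """Exponentially weighted moving average; a stand-in for a trained model."""
    estimate = history[0]
    for usage in history[1:]:
        estimate = alpha * usage + (1 - alpha) * estimate
    return estimate


def allocate_bandwidth(histories, total_mbps, floor_mbps=1.0):
    """Give every user a floor, then split the rest by predicted demand."""
    if floor_mbps * len(histories) > total_mbps:
        raise ValueError("guaranteed floors exceed total capacity")
    demand = {user: predict_demand(h) for user, h in histories.items()}
    total_demand = sum(demand.values())
    spare = total_mbps - floor_mbps * len(histories)
    limits = {}
    for user, d in demand.items():
        # Fall back to an equal split if nobody has any predicted demand.
        share = d / total_demand if total_demand > 0 else 1 / len(histories)
        limits[user] = floor_mbps + spare * share
    return limits


# Hypothetical historical usage (Mbps) per user, oldest sample first.
usage_history_mbps = {
    "user_a": [40.0, 40.0, 40.0],
    "user_b": [10.0, 10.0, 10.0],
}
limits = allocate_bandwidth(usage_history_mbps, total_mbps=102.0)
for user, limit in limits.items():
    print(f"{user}: {limit:.1f} Mbps")
```

The proportional split is only one possible policy; the point is that the limits are recomputed from observed behavior instead of being fixed in advance.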
3. Traffic Classification
In each network, there exist various kinds of traffic, like web hosting (HTTP), file transfers (FTP), secure browsing (HTTPS), HTTP Live Streaming video (HLS), terminal services (SSH), and so on. Each of these behaves differently when it comes to network bandwidth usage. For example, transferring a file over FTP uses a lot of data continuously.
In contrast, a video that is being streamed uses data in chunks and relies on buffering. When different types of traffic run in the network in an unmanaged way, temporary blockages can occur.
To avoid this, machine learning classifiers can be used to analyze and classify the type of traffic flowing through the network. Their output can then be used to set network parameters like allocated bandwidth, data caps, etc., improving network performance through better scheduling of requests and dynamic adjustment of assigned bandwidths.
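A minimal sketch of such a classifier, assuming each flow has already been reduced to two normalized features (mean packet size and burstiness, both scaled to 0..1), is a nearest-centroid model. The feature choice, the training flows, and the class names below are illustrative assumptions, not measurements.

```python
from math import dist


def train_centroids(labeled_flows):
    """Average the feature vectors of each class to get one centroid per class."""
    sums = {}
    for features, label in labeled_flows:
        total, count = sums.get(label, ([0.0] * len(features), 0))
        sums[label] = ([s + f for s, f in zip(total, features)], count + 1)
    return {
        label: [s / count for s in total]
        for label, (total, count) in sums.items()
    }


def classify(centroids, features):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training flows: (mean packet size, burstiness), both in 0..1.
training_flows = [
    ((0.90, 0.10), "bulk_transfer"),
    ((0.95, 0.15), "bulk_transfer"),
    ((0.80, 0.70), "video_stream"),
    ((0.75, 0.80), "video_stream"),
    ((0.10, 0.30), "interactive"),
    ((0.15, 0.20), "interactive"),
]
centroids = train_centroids(training_flows)

new_flow = (0.85, 0.75)
traffic_class = classify(centroids, new_flow)
print(f"flow {new_flow} classified as {traffic_class}")
```

The predicted class could then index into a policy table (for example, a bandwidth share or scheduling priority per class), which is where the dynamic adjustment described above would take place.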
Network security:
The increase in the number of cyberattacks forces organizations to constantly monitor and correlate millions of external and internal data points across the whole network infrastructure and its users. Manual management of a large volume of real-time data becomes difficult. This is where machine learning helps.
Machine learning can recognize patterns and anomalies in the network and predict threats in massive data sets, all in real time. By automating such analysis, it becomes easier for network managers to detect threats and isolate incidents rapidly with less human effort.
1. Cyber Attack Identification/Prevention
Network behavior is an important parameter in machine learning systems for anomaly detection. Machine learning engines process enormous amounts of data in real-time to identify threats, unknown malware, and policy violations.
If the network behavior is found to be within the predefined behavior, the network transaction is accepted; otherwise, an alert is triggered in the system. This can help prevent many kinds of attacks, such as DoS, DDoS, and probe attacks.
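The accept-or-alert logic can be sketched with a simple statistical baseline. The example below learns the mean and standard deviation of a metric during normal operation and flags values that deviate by more than a chosen number of standard deviations. The metric (requests per second), the sample values, and the threshold of 3.0 are assumptions for illustration; real systems typically model many features at once.

```python
from statistics import mean, stdev


def build_baseline(normal_samples):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(normal_samples), stdev(normal_samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical requests-per-second readings observed during normal operation.
normal_requests_per_sec = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = build_baseline(normal_requests_per_sec)

for observed in (105, 5000):
    if is_anomalous(observed, baseline):
        print(f"{observed} req/s: ALERT, outside expected behavior")
    else:
        print(f"{observed} req/s: accepted")
```

A sudden jump to thousands of requests per second, as in a flood-style DoS attack, falls far outside the baseline and triggers the alert, while ordinary fluctuation is accepted.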
2. Phishing Prevention
It's quite easy to trick someone into clicking a malicious link that seems legitimate, which then tries to break through a computer’s defense systems. Machine learning helps in flagging suspicious websites so that people do not connect to malicious ones.
For example, a text classifier can read URLs and identify spoofed phishing URLs before the user reaches them. This creates a much safer browsing experience for end users.
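To make the text-classifier idea concrete, here is a small sketch of a naive Bayes classifier over character trigrams of a URL, with add-one smoothing. The class name `NaiveBayesUrlClassifier` is hypothetical, and the six training URLs are made-up toy examples (using reserved example domains, the `.test` TLD, and a documentation IP range); a usable model would need a large labeled dataset.

```python
from collections import Counter
from math import log


def char_ngrams(url, n=3):
    """Split a URL into overlapping lowercase character n-grams."""
    url = url.lower()
    return [url[i:i + n] for i in range(len(url) - n + 1)]


class NaiveBayesUrlClassifier:
    """Multinomial naive Bayes over character trigrams (illustrative only)."""

    def __init__(self):
        self.ngram_counts = {}       # label -> Counter of n-grams
        self.doc_counts = Counter()  # label -> number of training URLs
        self.vocab = set()

    def fit(self, urls, labels):
        for url, label in zip(urls, labels):
            grams = char_ngrams(url)
            self.ngram_counts.setdefault(label, Counter()).update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, url):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, counts in self.ngram_counts.items():
            # Log prior plus smoothed log likelihood of each n-gram.
            score = log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for gram in char_ngrams(url):
                score += log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data; not real URLs.
train_urls = [
    "https://www.example.com/login",
    "https://docs.example.org/guide",
    "https://shop.example.net/cart",
    "http://example.com.secure-login.verify-account.test/update",
    "http://192.0.2.10/paypa1/confirm-password",
    "http://login-example.com.account-verify.test/signin",
]
train_labels = ["legitimate"] * 3 + ["phishing"] * 3

clf = NaiveBayesUrlClassifier().fit(train_urls, train_labels)
print(clf.predict("http://secure-login.verify-account.test/confirm"))
print(clf.predict("https://www.example.com/cart"))
```

Character n-grams are used here because phishing URLs often differ from legitimate ones in small lexical details (extra subdomains, digit substitutions, keywords like "verify"), which word-level tokens would miss.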
The integration of machine learning in networking is not limited to the use cases mentioned above. Further solutions can be developed for networking and network security to address still-unsolved issues, drawing on the opportunities and research from both the networking and machine learning perspectives.
Kandarp Rastey has been associated with VOLANSYS for the past 2 years, working in the embedded domain. He has contributed to product development services on different Linux-based products, in C, C++, shell scripts, and Python. He has in-depth know-how of wireless protocols as well as the IoT automation domain. He is an agile tech enthusiast, certified by Google Cloud and Stanford Online, who is keenly interested in and working on developing ML and AI concepts.