Network Data Science


Multi-task Weak Supervision

The application of machine learning (ML) to mitigate network-related problems poses significant challenges for researchers and operators alike. First, there is a general lack of labeled training data in networking, and labeling techniques popular in other domains are ill-suited because they cannot capture operators' scarce domain expertise at scale. Second, network problems are typically multi-task in nature, requiring multiple ML models (one per task) and resulting in multiplicative increases in training time as the number of tasks grows. Third, the adoption of ML by network operators hinges on the models' ability to provide basic reasoning about their decision-making procedures. To address these challenges, we propose ARISE, a multi-task weak supervision framework for network measurements. ARISE uses weak supervision-based data programming to label network data at scale and applies learning paradigms such as multi-task learning (MTL) and meta-learning to facilitate information sharing between tasks and to reduce overall training time. Using community datasets, we show that ARISE can generate MTL models with improved classification accuracy compared to multiple single-task learning (STL) models. We also report findings that show the promise of MTL models for providing a means to reason about their decision-making process, at least at the level of individual tasks.
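
To make the data-programming and MTL ideas concrete, the sketch below pairs a few heuristic labeling functions (combined by majority vote, a simple stand-in for a learned label model such as Snorkel's) with a hard-parameter-sharing multi-task network in PyTorch. The feature names, thresholds, and tasks are hypothetical illustrations, not ARISE's actual code.

```python
# A minimal sketch of weak supervision (data programming) feeding a
# hard-parameter-sharing multi-task model. Feature names, thresholds,
# and tasks are hypothetical; this is not ARISE's actual code.
import numpy as np
import torch.nn as nn

ABSTAIN, BENIGN, ANOMALOUS = -1, 0, 1

# Labeling functions encode operator heuristics as noisy, partial labels.
def lf_high_rtt(x):
    return ANOMALOUS if x["rtt_ms"] > 200 else ABSTAIN

def lf_retransmits(x):
    return ANOMALOUS if x["retx_rate"] > 0.05 else BENIGN

def lf_syn_burst(x):
    return ANOMALOUS if x["syn_rate"] > 100 else ABSTAIN

def weak_labels(samples, lfs):
    """Combine labeling-function votes by majority (a stand-in for a
    learned label model such as Snorkel's)."""
    votes = np.array([[lf(x) for lf in lfs] for x in samples])
    out = []
    for row in votes:
        valid = row[row != ABSTAIN]
        out.append(int(np.bincount(valid).argmax()) if valid.size else ABSTAIN)
    return np.array(out)

# One shared trunk, one lightweight head per task: information is shared
# through the trunk, so adding a task does not multiply training cost.
class MTLNet(nn.Module):
    def __init__(self, n_features, n_tasks):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(64, 2) for _ in range(n_tasks))

    def forward(self, x):
        z = self.trunk(x)
        return [head(z) for head in self.heads]  # one logit pair per task
```

Replacing the majority vote with a learned label model and training the task heads jointly would bring this skeleton closer to the paper's setup; the sketch only conveys the structure.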

Team

  • Jared Knofczynski (UO)
  • Prof. Ram Durairajan (UO)
  • Dr. Walter Willinger (NIKSUN Inc.)

Funding

This material is based upon work supported by the National Science Foundation (NSF) under awards CNS 1850297 and OAC 2126281. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

De-noising Delay Measurements

Understanding the delay characteristics of the Internet is one of the key goals of Internet measurement researchers, service providers, and content delivery networks. To this end, a myriad of measurement tools and techniques have been proposed by researchers in academia and industry, and datasets from these tools are curated to facilitate later analyses.

Despite the benefits of these tools for measuring the delay characteristics of the Internet and the availability of datasets from measurement efforts, what is critically lacking is a systematic framework for interpreting the results from the tools and datasets. The key hindrance to creating such a framework is measurement noise, which we define as the presence of non-representative and erroneous values in delay measurements. Noise confounds all types of end-to-end delay measurements and can lead to performance issues and unnecessary operational decisions. State-of-the-art denoising techniques are (1) time-consuming and labor-intensive: they are carried out manually because the ground-truth data required to classify and discern noise from the actual delay behaviors of the network is lacking, and (2) ineffective: they rely on simple heuristics and filters that are too naïve to be practical.

To tackle these challenges, this research will develop a systematic weak supervision-based framework, culminating in NoMoNoise, a tool for denoising Internet delay measurements in an automated and rapid fashion.
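
As a concrete illustration of the approach, the sketch below applies simple heuristic labeling functions to an RTT series and combines their votes to flag noisy samples. The heuristics and thresholds are illustrative assumptions, not NoMoNoise's actual rules.

```python
# Sketch of weak-supervision denoising for an RTT time series.
# The heuristics and thresholds are illustrative, not NoMoNoise's rules.
import numpy as np

NOISE, CLEAN, ABSTAIN = 1, 0, -1

def lf_spike(rtts, i, k=5.0):
    """Flag samples far above the series median (transient spikes)."""
    med = np.median(rtts)
    mad = np.median(np.abs(rtts - med)) + 1e-9
    return NOISE if (rtts[i] - med) / mad > k else ABSTAIN

def lf_below_floor(rtts, i, floor_ms=10.0):
    """RTTs below the path's propagation-delay floor are implausible."""
    return NOISE if rtts[i] < floor_ms else ABSTAIN

def lf_isolated(rtts, i, win=3, k=3.0):
    """Flag samples that disagree sharply with their neighborhood."""
    lo, hi = max(0, i - win), min(len(rtts), i + win + 1)
    neigh = np.delete(rtts[lo:hi], i - lo)
    dev = abs(rtts[i] - neigh.mean())
    return NOISE if dev > k * (neigh.std() + 1e-9) else CLEAN

def denoise(rtts, lfs=(lf_spike, lf_below_floor, lf_isolated)):
    """Keep a sample unless a majority of non-abstaining LFs call it noise."""
    keep = []
    for i in range(len(rtts)):
        votes = [v for v in (lf(rtts, i) for lf in lfs) if v != ABSTAIN]
        if not votes or votes.count(NOISE) <= len(votes) / 2:
            keep.append(rtts[i])
    return np.array(keep)
```

Running denoise on a measured RTT array drops the samples that a majority of the non-abstaining heuristics flag; in the actual framework, the votes would be combined by a learned label model rather than a simple majority.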

Publications

  • ARISE: A Multi-Task Weak Supervision Framework for Network Measurements
    Jared Knofczynski, Ramakrishnan Durairajan and Walter Willinger
    In IEEE JSAC Series on Machine Learning in Communications and Networks, July 2022.
    [PAPER] [CODE]
  • Challenges in Using ML for Networking Research: How to Label If You Must
    Yukhe Lavinia, Ramakrishnan Durairajan, Reza Rejaie and Walter Willinger
    In Proceedings of Workshop on Network Meets AI & ML (NetAI’20)
    co-located with ACM SIGCOMM’20, New York, USA, August 2020.
    [PAPER] [SLIDES] [VIDEO]
  • An Integrated Micro-Metrics Monitoring Framework for Tackling Distributed Heterogeneity
    Babar Khalid*, Nolan Rudolph*, Ramakrishnan Durairajan and Sudarsun Kannan
    In Proceedings of 12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage’20)
co-located with USENIX ATC’20, Massachusetts, USA, July 2020.
    [PAPER] (* co-primary authors)
  • Denoising Internet Delay Measurements using Weak Supervision
    Anirudh Muthukumar and Ramakrishnan Durairajan
    In Proceedings of IEEE ICMLA’19, Florida, USA, December 2019.
    [PAPER]
  • Can We Containerize Internet Measurements?
    Christopher Misa, Sudarsun Kannan and Ramakrishnan Durairajan
    In Proceedings of ACM/IRTF/ISOC Applied Networking Research Workshop (ANRW’19)
    co-located with IETF 105, Montreal, Canada, July 2019.
    [PAPER]

Team

  • Prof. Ram Durairajan (UO)
  • Prof. Reza Rejaie (UO)
  • Jared Knofczynski (UO)
  • Chris Misa (UO)
  • Yukhe Lavinia (UO)
  • Nolan Rudolph (UO)
  • Anirudh Muthukumar (UO)
  • Prof. Sudarsun Kannan (Rutgers University)
  • Babar Khalid (Rutgers University)
  • Dr. Walter Willinger (NIKSUN Inc.)

Funding

This material is based upon work supported by the National Science Foundation (NSF) under award CNS 1850297. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

Forecasting Network Streams

This study examines the accuracy and overhead of using LSTM and SARIMA models for short-term forecasting of network data streams in telemetry systems.

Today’s data plane network telemetry systems enable network operators to capture fine-grained data streams of many different network traffic features (e.g., loss or flow arrival rate) at line rate. This capability facilitates data-driven approaches to network management and motivates leveraging statistical or machine learning models (e.g., for forecasting network data streams) to automate various network management tasks. However, current studies on network automation-related problems are generally not concerned with issues that arise when deploying these models in practice (e.g., (re)training overhead).

This study examines various training-related aspects that affect the accuracy and overhead (and thus feasibility) of both LSTM and SARIMA, two popular types of models for forecasting real-world network data streams in telemetry systems. In particular, we study the impact of the size, choice, and recency of the training data on accuracy and overhead, and explore using separate models for different segments of a data stream (e.g., per-hour models). Using two real-world data streams, we show that (i) per-hour LSTM models exhibit high accuracy after training with only 24 hours of data, (ii) the accuracy of LSTM models does not depend on the recency of the training data (i.e., no frequent (re)training is required), (iii) SARIMA models can have comparable or lower accuracy than LSTM models, and (iv) certain segments of the data streams are inherently more challenging to forecast than others. While the specific findings reported in this study depend on the data streams and models considered, we argue that, irrespective of the data streams at hand, a similar examination of training-related aspects is needed before deploying any statistical or machine learning model in practice.
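
For a sense of what such an examination involves, the sketch below fits both model families to a synthetic hourly stream: a SARIMA model via statsmodels and a sliding-window LSTM in PyTorch. The stream, model orders, and hyperparameters are illustrative assumptions, not the study's actual configuration.

```python
# Sketch: SARIMA vs. LSTM one-day-ahead forecasting on a synthetic
# hourly network data stream. Orders and hyperparameters are illustrative.
import numpy as np
import torch
import torch.nn as nn
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic stand-in for an hourly stream (e.g., flow arrival rate).
rng = np.random.default_rng(0)
t = np.arange(14 * 24)  # 14 days of hourly values
y = 100 + 30 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 5, t.size)
train, test = y[:-24], y[-24:]

# --- SARIMA: classical statistical baseline with a daily season ---
sarima = SARIMAX(train, order=(1, 0, 1), seasonal_order=(1, 0, 1, 24)).fit(disp=False)
sarima_fc = sarima.forecast(steps=24)  # forecast the held-out day

# --- LSTM: one-step-ahead forecaster over sliding windows ---
def windows(series, w=24):
    """Turn a 1-D series into (window, next-value) training pairs."""
    X = np.stack([series[i:i + w] for i in range(len(series) - w)])
    return (torch.tensor(X[:, :, None], dtype=torch.float32),
            torch.tensor(series[w:], dtype=torch.float32))

class Forecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):
        h, _ = self.lstm(x)                     # h: (batch, w, hidden)
        return self.out(h[:, -1]).squeeze(-1)   # predict the next value

mu, sd = train.mean(), train.std()
X, y_next = windows((train - mu) / sd)
model = Forecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):  # brief training loop for illustration
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y_next)
    loss.backward()
    opt.step()
```

Timing the two fitting steps on the same stream gives a first-order read on the accuracy/overhead trade-off the study quantifies; per-hour models correspond to running this pipeline once per hourly segment.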

Publications

  • On the Practicality of Learning Models for Network Telemetry
    Soheil Jamshidi, Zayd Hammoudeh, Daniel Lowd, Ramakrishnan Durairajan, Reza Rejaie and Walter Willinger
    In Proceedings of TMA’20, Berlin, Germany, June 2020. [Acceptance rate 33%]
    [PAPER]

Team

  • Soheil Jamshidi (UO)
  • Zayd Hammoudeh (UC Santa Cruz)
  • Prof. Daniel Lowd (UO)
  • Prof. Ram Durairajan (UO)
  • Prof. Reza Rejaie (UO)
  • Dr. Walter Willinger (NIKSUN Inc.)

Funding

This material is based upon work supported by the National Science Foundation (NSF) under award CNS 1850297. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

Application Behavior Modeling

This study shows how network, system, and application features can be used to train and apply unsupervised models for detecting four types of attacks.

In this study, we consider four types of attacks: three slow HTTP attacks (Shekyan, 2020) and a session flooding attack (using the BoNeSi DDoS tool) against the Apache web server, chosen for its popularity relative to other web servers (datanyze.com, 2020). Because these attacks differ in mechanism, each leaves a different footprint on the traffic attributes. We utilize three types of features (network, operating system, and application) to detect attacks in an unsupervised manner and develop a practical and accurate system for anomaly detection. We further analyze the contribution of each of these feature sources through different model-training strategies and model-interpretation techniques. Our analyses show that additional attributes from the application and operating system improve the accuracy of our model. The unsupervised neural network model is capable of differentiating between anomalies and the normal behavior of the application. Furthermore, we illustrate how interpretation methods can help network administrators examine anomalies.
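
The study describes its detector only as an unsupervised neural network; one common instantiation of that idea is an autoencoder trained on attack-free traffic that flags samples with large reconstruction error. The sketch below shows that instantiation; the feature layout, architecture, and thresholding rule are assumptions, not the study's actual model.

```python
# Sketch: unsupervised anomaly detection over concatenated network, OS,
# and application features via an autoencoder. The architecture and
# threshold rule are assumptions, not the study's actual model.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(),
                                 nn.Linear(16, 4))
        self.dec = nn.Sequential(nn.Linear(4, 16), nn.ReLU(),
                                 nn.Linear(16, n_features))

    def forward(self, x):
        return self.dec(self.enc(x))

def fit(model, X, epochs=100, lr=1e-3):
    """Train on normal (attack-free) traffic only."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), X)
        loss.backward()
        opt.step()
    return model

def flag_anomalies(model, X, X_normal, q=0.99):
    """Flag samples whose reconstruction error exceeds the q-quantile
    of the error observed on normal traffic."""
    with torch.no_grad():
        err = ((model(X) - X) ** 2).mean(dim=1)
        base = ((model(X_normal) - X_normal) ** 2).mean(dim=1)
    return err > torch.quantile(base, q)
```

Because slow HTTP and session flooding attacks distort different feature sources, per-feature reconstruction errors also offer a simple interpretation hook: inspecting which features drive the error indicates whether the anomaly is network-, OS-, or application-level.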

Team

  • Soheil Jamshidi (UO)
  • Prof. Ram Durairajan (UO)
  • Prof. Reza Rejaie (UO)
  • Dr. Walter Willinger (NIKSUN Inc.)

Funding

This material is based upon work supported by the National Science Foundation (NSF) under award CNS 1850297 and by UO VPRI. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.