Distributed Systems

Bachelor Projects

Auto-Tuning of Management Configuration Parameters

Supervisor: Mahmoud Alasmar.
Status: available.
Date: 1/03/2024.
Management algorithms used by orchestration frameworks such as Kubernetes rely on a number of configuration parameters. Setting these parameters manually, based on prior experience and post-deployment monitoring, results in suboptimal states, which in turn affect the throughput and latency of the system. With growing system scale, workload diversity, and numbers of configuration parameters, more robust tuning techniques have become essential. Metis [1] and SelfTune [2] are two recent proposals aimed at auto-tuning system parameters. The two solutions are based on different optimization algorithms: the former uses Bayesian optimization to find an optimal configuration, while the latter uses reinforcement learning. In this project, you will explore and evaluate both solutions, highlighting their pros and cons by comparing the complexity, optimality, and scalability of each method and the cases where each is more applicable.
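
To make the tuning problem concrete, the sketch below runs a random-search baseline over two hypothetical parameters of a toy latency model (the parameter names and the latency surface are made up); Metis and SelfTune replace this kind of brute-force loop with sample-efficient Bayesian optimization and reinforcement learning, respectively.

```python
import random

def system_latency(batch_size, concurrency):
    """Toy stand-in for a measured tail latency; in practice this signal
    would come from post-deployment monitoring of the orchestrator."""
    return (batch_size - 32) ** 2 + (concurrency - 8) ** 2 + 5

def random_search(trials=200, seed=0):
    """Baseline tuner: sample configurations uniformly at random and
    keep the best one observed."""
    rng = random.Random(seed)
    best_cfg, best_lat = None, float("inf")
    for _ in range(trials):
        cfg = (rng.randint(1, 64), rng.randint(1, 32))
        lat = system_latency(*cfg)
        if lat < best_lat:
            best_cfg, best_lat = cfg, lat
    return best_cfg, best_lat

best_cfg, best_lat = random_search()
```

Comparing how many "system measurements" each method needs to reach a near-optimal configuration is exactly the kind of sample-efficiency question the project should answer.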

References:

  1. Zhao Lucis Li, Chieh-Jan Mike Liang, Wenjia He, Lianjie Zhu, Wenjun Dai, Jin Jiang, and Guangzhong Sun, "Metis: Robustly Tuning Tail Latencies of Cloud Systems," 2018 USENIX Annual Technical Conference, pp. 981–992, https://www.usenix.org/conference/atc18/presentation/li-zhao
  2. Ajaykrishna Karthikeyan, Nagarajan Natarajan, Gagan Somashekar, Lei Zhao, Ranjita Bhagwan, Rodrigo Fonseca, Tatiana Racheva, and Yogesh Bansal, "SelfTune: Tuning Cluster Managers," 20th USENIX Symposium on Networked Systems Design and Implementation, pp. 1097–1114, https://www.usenix.org/conference/nsdi23/presentation/karthikeyan

Mining sales data to identify customer profiles, and predict sales (with industrial partner)

Supervisor: Dilek Dustegor.
Status: available.
Date: 1/02/2024.

Mining sensors data for anomaly detection (with industrial partner)

Supervisor: Dilek Dustegor.
Status: available.
Date: 1/02/2024.

Machine learning model to optimize the trajectory of a robotic arm (with industrial partner)

Supervisor: Dilek Dustegor.
Status: available.
Date: 1/02/2024.

Data Driven Leakage Detection in Water Network

Supervisor: Dilek Dustegor.
Status: multiple available spots.
Date: 15/01/2024.
Leaks in water distribution networks (WDNs) are one of the main causes of water loss during transportation. Given water scarcity and a growing world population, minimizing water losses is an urgent humanitarian need. Recently, attempts have been made to use data-driven and machine learning techniques for leakage localization, but the capabilities and limitations of these methods are not yet well understood. In this project, the student will develop, optimize, and compare several machine learning models for leak detection in a water network.
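
As a minimal, hypothetical baseline (not one of the referenced methods), the sketch below flags leak-like pressure drops with a rolling z-score on synthetic readings; the project would pit learned models against this kind of simple detector on real WDN data.

```python
from statistics import mean, stdev

def detect_leaks(pressure, window=24, k=3.0):
    """Flag time steps whose pressure deviates more than k standard
    deviations from the preceding window's mean. A toy baseline
    detector, not a learned model."""
    alarms = []
    for t in range(window, len(pressure)):
        hist = pressure[t - window:t]
        mu, sigma = mean(hist), stdev(hist)
        if sigma > 0 and abs(pressure[t] - mu) > k * sigma:
            alarms.append(t)
    return alarms

# Synthetic feed: steady pressure with a small periodic wobble,
# followed by a sudden drop that mimics a leak.
readings = [50 + 0.1 * ((i * 7) % 5) for i in range(48)] + [42.0]
alarms = detect_leaks(readings)
```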

References:

  1. Marcos Quiñones-Grueiro, Marlon Ares Milián, Maibeth Sánchez Rivero, Antônio J. Silva Neto, Orestes Llanes-Santiago, "Robust leak localization in water distribution networks using computational intelligence," Neurocomputing 438 (2021) 195–208, https://doi.org/10.1016/j.neucom.2020.04.159
  2. Chan-Wook Lee and Do-Guen Yoo, "Development of Leakage Detection Model and Its Application for Water Distribution Networks Using RNN-LSTM," Sustainability 2021, 13, 9262, https://doi.org/10.3390/su13169262
  3. Jie Zhang, Xiaoping Yang and Juan Li, "Leak localization of water supply network based on temporal convolutional network," Meas. Sci. Technol. 33 (2022) 125302 (8pp), https://doi.org/10.1088/1361-6501/ac8ca5
  4. Zahra Fereidooni, Hooman Tahayori, Ali Bahadori‑Jahromi, "A hybrid model‑based method for leak detection in large scale water distribution networks," Journal of Ambient Intelligence and Humanized Computing (2021) 12:1613–1629, https://doi.org/10.1007/s12652-020-02233-2

In-network Computing supporting Real-Time Complex Event Processing.

Supervisor: Bochra Boughzala.
Status: available.
Date: 15/01/2024.
Complex Event Processing [1, 2] is a powerful paradigm for event pattern recognition, allowing the detection of complex events (e.g., fire) from a set of simple events (e.g., high temperature and a high level of smoke). Timely and efficient detection is crucial when complex event processing must be performed in real time. In this context, in-network computing [3] is a promising approach for accelerating real-time complex event processing [4]. With high-performance, programmable data planes, we can reduce latency and improve system throughput. However, in-network programming models, e.g., the P4 programming language, present their own challenges [5]. In this project, we focus on how in-network computing can support real-time complex event processing of a real-world telemetry dataset [6], which is based on fine-grained telemetry data from over 200k hard drives in a data center deployment. We aim to identify the most meaningful set of attributes for hard drive failure detection using in-network computing models.
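
A minimal sketch of the fire example in plain Python, assuming a made-up event format with `type` and `value` fields; an in-network implementation would express the same window-and-match logic in P4 on the data plane.

```python
from collections import deque

def make_detector(window_size=5, temp_threshold=50.0, smoke_threshold=0.7):
    """Hypothetical CEP operator: raise the complex event 'fire' when a
    high-temperature and a high-smoke simple event co-occur within a
    sliding window of recent events."""
    window = deque(maxlen=window_size)

    def feed(event):
        window.append(event)
        hot = any(e["type"] == "temperature" and e["value"] > temp_threshold
                  for e in window)
        smoky = any(e["type"] == "smoke" and e["value"] > smoke_threshold
                    for e in window)
        return "fire" if hot and smoky else None

    return feed

feed = make_detector()
out = [feed({"type": "temperature", "value": 22.0}),
       feed({"type": "temperature", "value": 61.0}),
       feed({"type": "smoke", "value": 0.9})]
```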

References:

  1. Luckham, D.C. and Frasca, B., 1998. Complex event processing in distributed systems. Computer Systems Laboratory Technical Report CSL-TR-98-754. Stanford University, Stanford, 28, p.16.
  2. Buchmann, A. and Koldehofe, B., 2009. Complex event processing.
  3. Tokusashi, Y., Dang, H.T., Pedone, F., Soulé, R. and Zilberman, N., 2019, March. The case for in-network computing on demand. In Proceedings of the Fourteenth EuroSys Conference 2019 (pp. 1-16).
  4. Boughzala, B. and Koldehofe, B., 2021, June. Accelerating the performance of data analytics using network-centric processing. In Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems (pp. 192-195).
  5. Gebara, N., Lerner, A., Yang, M., Yu, M., Costa, P. and Ghobadi, M., 2020, November. Challenging the stateless quo of programmable switches. In Proceedings of the 19th ACM Workshop on Hot Topics in Networks (pp. 153-159).
  6. Hard drive dataset.

Pre-training Graph Neural Networks on solving classical graph theory algorithms.

Supervisor: Andrés Tello.
Status: available.
Date: 08/12/2023.
Graph Neural Networks (GNNs) are a proven approach for solving various predictive problems on graph-structured data. Previous research has shown that GNNs trained to solve classical graph theory algorithms (e.g., shortest paths) can generalize to graphs larger than those in the training set. The aim of this work is to evaluate whether pre-training GNNs to solve such algorithms equips them with generalization capabilities for unrelated downstream predictive tasks in the water management domain. The student needs to (1) train a GNN model using different strategies that involve structural features of the graphs, e.g., shortest paths, minimum spanning trees, diameter, betweenness centrality, etc., and (2) use the pre-trained model to fine-tune a GNN for pressure/flow reconstruction in water distribution networks.
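
The kind of algorithmic target used for pre-training can be made concrete with Bellman-Ford, written here as synchronous rounds of min-aggregation message passing, the computational pattern a GNN learns to imitate in neural algorithm execution [2]. The graph below is made up.

```python
INF = float("inf")

def shortest_paths(adj, source):
    """Bellman-Ford as rounds of message passing: in each round every
    node aggregates (min) the distance messages from its neighbors."""
    n = len(adj)
    dist = [INF] * n
    dist[source] = 0.0
    for _ in range(n - 1):          # n-1 synchronous rounds suffice
        new_dist = dist[:]
        for u in range(n):
            for v, w in adj[u]:     # message u -> v along an edge of weight w
                if dist[u] + w < new_dist[v]:
                    new_dist[v] = dist[u] + w
        dist = new_dist
    return dist

# Tiny example graph as adjacency lists of (neighbor, weight) pairs.
adj = [[(1, 1.0), (2, 4.0)], [(2, 1.0), (3, 5.0)], [(3, 1.0)], []]
dist = shortest_paths(adj, 0)
```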

References:

  1. Xu, K., Zhang, M., Li, J., Du, S. S., Kawarabayashi, K. I., & Jegelka, S. (2020, October). How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks. In International Conference on Learning Representations.
  2. Veličković, P., Ying, R., Padovano, M., Hadsell, R., & Blundell, C. (2019, September). Neural Execution of Graph Algorithms. In International Conference on Learning Representations.
  3. Corso, G., Cavalleri, L., Beaini, D., Liò, P., & Veličković, P. (2020). Principal neighbourhood aggregation for graph nets. Advances in Neural Information Processing Systems, 33, 13260-13271.
  4. Xu, K., Li, J., Zhang, M., Du, S. S., Kawarabayashi, K. I., & Jegelka, S. (2019). What can neural networks reason about?. arXiv preprint arXiv:1905.13211.

Positional encodings in Graph Neural Networks for Geo-Located data.

Supervisor: Andrés Tello.
Status: available.
Date: 08/12/2023.
Positional encodings have been shown to be an effective technique in natural language processing for learning vector representations of words based on their positions in a sentence and their contexts. Their success has attracted researchers in the field of Graph Neural Networks (GNNs) to learn vector representations of the nodes in a graph that encode not only their features but also their locations or relative positions with respect to other nodes. In water distribution networks, the geo-location of each node is available. The aim of this project is therefore to explore different techniques for learning positional encodings that leverage the geo-coordinates of the nodes. The student will propose a GNN-based model that learns positional encodings for the nodes and combines them with vector representations of their features. The model will be evaluated on state estimation tasks in water distribution networks.
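
One simple starting point, sketched below with made-up coordinates: a fixed sinusoidal encoding of each node's (x, y) position at several frequencies, loosely in the spirit of the multi-scale grid-cell encoders of [3]. A learned encoder, as the project intends, would replace these fixed frequencies with trainable parameters.

```python
import math

def positional_encoding(x, y, dims=8, base=10000.0):
    """Sinusoidal encoding of a 2D coordinate: each frequency captures
    spatial structure at a different scale. Returns a list of length
    `dims` (half for x, half for y)."""
    enc = []
    for coord in (x, y):
        for i in range(dims // 4):
            freq = 1.0 / (base ** (4 * i / dims))
            enc.append(math.sin(coord * freq))
            enc.append(math.cos(coord * freq))
    return enc

vec = positional_encoding(52.1, 6.1)   # e.g., lat/lon-like coordinates
```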

References:

  1. Klemmer, K., Safir, N. S., & Neill, D. B. (2023, April). Positional encoder graph neural networks for geographic data. In International Conference on Artificial Intelligence and Statistics (pp. 1379-1389). PMLR.
  2. Fuchs, F., Worrall, D., Fischer, V., & Welling, M. (2020). Se (3)-transformers: 3d roto-translation equivariant attention networks. Advances in neural information processing systems, 33, 1970-1981.
  3. Mai, G., Janowicz, K., Yan, B., Zhu, R., Cai, L., & Lao, N. (2020). Multi-scale representation learning for spatial feature distributions using grid cells. arXiv preprint arXiv:2003.00824.
  4. You, J., Ying, R., & Leskovec, J. (2019, May). Position-aware graph neural networks. In International conference on machine learning (pp. 7134-7143). PMLR.

Graph Neural Networks for Metamodeling in Water Distribution Systems

Supervisor: Huy Truong.
Status: available.
Date: 8/12/2023.
Physics-based simulation is an essential tool for monitoring drinking water distribution systems (WDS). It takes a set of hydraulic parameters as input and outputs useful measurements such as pressure, head, and demand. However, simulation tools come with considerable issues, such as inefficient performance, the need for periodic calibration, and inflexibility in unseen scenarios. Data-driven surrogate models have emerged as an alternative that can alleviate these problems; approximating the simulator with such a surrogate is known as metamodeling. In this study, we aim to (i) construct a large dataset collected from numerous public benchmark WDSs and (ii) use it to train a Graph Neural Network, one such data-driven approach, to solve the metamodeling task for water distribution systems. The prerequisites are a machine learning background and experience with a deep learning framework (PyTorch, TensorFlow, or JAX).
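
As a baseline intuition for what a surrogate does, the sketch below fits a linear metamodel to synthetic demand-pressure pairs standing in for simulator runs (the data and coefficients are made up); the project replaces this with a GNN trained on outputs of a real solver such as WNTR/EPANET [2], exploiting the network topology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for simulator runs: demands at 3 junctions as
# inputs, and the pressure a hydraulic solver would return as output.
X = rng.uniform(0.0, 1.0, size=(200, 3))
true_w = np.array([-2.0, -1.0, -0.5])
y = 60.0 + X @ true_w + rng.normal(0.0, 0.01, size=200)

# The simplest possible metamodel: a linear map fitted by least squares.
A = np.hstack([X, np.ones((200, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

pred = A @ w
mse = float(np.mean((pred - y) ** 2))
```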

References:

  1. Kerimov, Bulat, et al. "Assessing the performances and transferability of graph neural network metamodels for water distribution systems." Journal of Hydroinformatics (2023): jh2023031.
  2. Klise, Katherine, Hart, David, Bynum, Michael, Hogge, Joseph, Haxton, Terranna, Murray, Regan, & Burkhardt, Jonathan. Water Network Tool for Resilience (WNTR). User Manual, Version 0.2.3. United States. https://doi.org/10.2172/1660790
  3. Kipf, T. N., & Welling, M. (2017). Semi-Supervised Classification with Graph Convolutional Networks. International Conference on Learning Representations.

Node masking in Graph Neural Networks

Supervisor: Huy Truong.
Status: available.
Date: 8/12/2023.
Working with real-world data often involves missing information, which can degrade the performance of deep learning models. Handled properly, however, missingness can boost the expressiveness of Graph Neural Network (GNN) models in node representation learning through a technique known as node masking: arbitrary nodal features in a graph are hidden, and the GNN is trained to recover the missing parts. The student can explore diverse masking strategies, such as zero masking, random node replacement, mean-neighbor substitution, shared learnable embeddings, and nodal permutation. These options should be compared and evaluated on a graph reconstruction task applied to a water distribution network. This study focuses on finding a generative technique that effectively enhances the performance of GNN models in semi-supervised transductive learning. Students interested in joining this project should have a machine learning background and experience with a deep learning framework.
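
The first three masking strategies can be sketched in a few lines of NumPy; this is a toy illustration on a made-up 3-node graph, not a training pipeline.

```python
import numpy as np

def mask_nodes(X, adj, idx, strategy="zero", rng=None):
    """Apply one of the masking strategies from the project description
    to node features X (n x d) at node indices idx; adj is a boolean
    adjacency matrix. Returns a masked copy of X."""
    rng = rng or np.random.default_rng(0)
    X = X.copy()
    if strategy == "zero":
        X[idx] = 0.0                                  # zero masking
    elif strategy == "random":
        donors = rng.integers(0, len(X), size=len(idx))
        X[idx] = X[donors]                            # random node replacement
    elif strategy == "mean":
        for i in idx:                                 # mean-neighbor substitution
            nbrs = np.where(adj[i])[0]
            X[i] = X[nbrs].mean(axis=0) if len(nbrs) else 0.0
    return X

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=bool)
masked = mask_nodes(X, adj, [0], strategy="mean")
```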

References:

  1. Hou, Zhenyu, Xiao Liu, Yuxiao Dong, Chunjie Wang, and Jie Tang. "GraphMAE: Self-Supervised Masked Graph Autoencoders." arXiv preprint arXiv:2205.10803 (2022).
  2. Abboud, Ralph, Ismail Ilkan Ceylan, Martin Grohe, and Thomas Lukasiewicz. "The surprising power of graph neural networks with random node initialization." arXiv preprint arXiv:2010.01179 (2020).
  3. Hajgató, Gergely, Bálint Gyires-Tóth, and György Paál. "Reconstructing nodal pressures in water distribution systems with graph neural networks." arXiv preprint arXiv:2104.13619 (2021).
  4. He, Kaiming, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. "Masked autoencoders are scalable vision learners." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000-16009. 2022.

Improving Performance of Network Traffic Flows

Supervisor: Saad Saleh.
Status: available.
Date: 7/12/2023.
With its growing user base, the Internet carries an ever wider range of traffic categories requiring different Quality of Service (QoS) guarantees in terms of delay, throughput, and jitter for end applications. Consequently, the underlying network transport mechanisms, such as congestion control and flow control, face challenges in providing satisfactory QoS. In this project, we aim to develop novel techniques for reducing network congestion. The project focuses on understanding current active queue management (AQM) algorithms and developing novel mechanisms to improve their performance.
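
The classic Random Early Detection (RED) algorithm, one natural starting point among the AQM techniques surveyed in [1], can be sketched in a few lines; the threshold values below are arbitrary.

```python
class RedQueue:
    """Minimal sketch of RED: an exponentially weighted moving average
    (EWMA) of the queue length drives a drop probability that rises
    linearly between a minimum and a maximum threshold."""

    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.1, weight=0.2):
        self.min_th, self.max_th = min_th, max_th
        self.max_p, self.weight = max_p, weight
        self.avg = 0.0

    def drop_probability(self, queue_len):
        # Update the EWMA of the instantaneous queue length.
        self.avg += self.weight * (queue_len - self.avg)
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

red = RedQueue()
probs = [red.drop_probability(q) for q in (0, 2, 20, 20, 20, 20)]
```

Because the EWMA responds slowly, sustained congestion raises the drop probability gradually instead of reacting to transient bursts, which is the behavior later schemes like CoDel and PIE refine.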

  1. R. Adams, "Active Queue Management: A Survey," in IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1425-1476, Third Quarter 2013, doi: 10.1109/SURV.2012.082212.00018.
  2. I. Järvinen and M. Kojo, "Evaluating CoDel, PIE, and HRED AQM techniques with load transients," 39th Annual IEEE Conference on Local Computer Networks, Edmonton, AB, Canada, 2014, pp. 159-167, doi: 10.1109/LCN.2014.6925768.
  3. N. Khademi, D. Ros and M. Welzl, "The new AQM kids on the block: An experimental evaluation of CoDel and PIE," 2014 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada, 2014, pp. 85-90, doi: 10.1109/INFCOMW.2014.6849173.

Towards AI-enabled Network Functions

Supervisor: Saad Saleh.
Status: available.
Date: 7/12/2023.
The Internet relies on network functions, such as queue management and load balancing, to provide satisfactory quality of service (QoS) to end users. However, current network functions rely on fixed, programmed rules, which limits their ability to handle anomalous activity. In this project, deep learning-based approaches to these network functions will be implemented and analyzed. The project focuses on programming a neural network and integrating it with a network simulator for performance analysis. Active queue management (AQM) algorithms, including RED, CoDel, PIE, CAKE, and BLUE, will be studied as targets for the deep learning-based approaches.

  1. Saad Saleh and Boris Koldehofe. "The Future is Analog: Energy-Efficient Cognitive Network Functions over Memristor-Based Analog Computations." HotNets 2023: Twenty-Second ACM Workshop on Hot Topics in Networks. ACM New York, NY, USA, 2023. doi: 10.1145/3626111.3628192
  2. Minsu Kim, Muhammad Jaseemuddin, and Alagan Anpalagan. "Deep reinforcement learning based active queue management for iot networks." Journal of Network and Systems Management 29, no. 3 (2021): 34. doi: 10.1007/s10922-021-09603-x
  3. Y. Xu, W. Xu, Z. Wang, J. Lin and S. Cui, "Load Balancing for Ultradense Networks: A Deep Reinforcement Learning-Based Approach," in IEEE Internet of Things Journal, vol. 6, no. 6, pp. 9399-9412, Dec. 2019, doi: 10.1109/JIOT.2019.2935010.

How Complex Event Processing Can Benefit from Federated Learning

Supervisor: Majid Lotfian Delouee.
Status: available.
Date: 01/12/2023.
Complex event processing (CEP) is a paradigm for analyzing input streams (e.g., IoT sensory data) to generate high-level information in real time. To achieve higher-quality results, a CEP middleware needs as much data as possible, while data owners may be unwilling to share it for privacy reasons. Federated Learning (FL) is a machine learning approach that allows data owners to train learning models locally and send the models instead of the raw sensed data, preserving privacy while improving the quality of results. In this research project, students are asked to elaborate on the main components and the pros and cons of both CEP and FL. Finally, students are expected to discuss and propose novel ideas showing how the FL paradigm can improve the performance of CEP systems.
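
The core of FL, federated averaging, fits in a few lines; the weights and client sizes below are made up for illustration.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Federated averaging: the server combines locally trained model
    weights, weighted by each client's data size, without ever seeing
    the raw (privacy-sensitive) sensor streams."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two hypothetical data owners train locally and share only weights.
w_a = np.array([1.0, 2.0])    # from a client with 100 samples
w_b = np.array([3.0, 4.0])    # from a client with 300 samples
global_w = fed_avg([w_a, w_b], [100, 300])
```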

  1. Banabilah, S., Aloqaily, M., Alsayed, E., Malik, N. and Jararweh, Y., 2022. Federated learning review: Fundamentals, enabling technologies, and future applications. Information Processing & Management, 59(6), p.103061.
  2. Dayarathna, M. and Perera, S., 2018. Recent advancements in event processing. ACM Computing Surveys (CSUR), 51(2), pp.1-36.
  3. Roldán, J., Boubeta-Puig, J., Martínez, J.L. and Ortiz, G., 2020. Integrating complex event processing and machine learning: An intelligent architecture for detecting IoT security attacks. Expert Systems with Applications, 149, p.113251.

Federated Rule Mining in Complex Event Processing

Supervisor: Majid Lotfian Delouee.
Status: Taken (unavailable).
Date: 01/12/2023.
Complex event processing (CEP) is a crucial paradigm for the real-time analysis of dynamic input streams. Typically, during the design phase, domain experts define CEP rules that enable the detection of relevant situations. The challenge arises from the highly dynamic nature of the environment: parameters like thresholds or window sizes for complex event detection often require real-time adjustment, and the diversity of data sources necessitates continuously defining new rules to leverage various data streams for more confident situation detection. Federated Learning (FL), on the other hand, allows data owners to train learning models locally, transmitting only the models instead of the raw data, which preserves privacy while enhancing result quality. In this research project, students are tasked with a comprehensive exploration of the main components and the pros and cons of both CEP and FL. The focal point lies in developing innovative ideas that showcase the potential of adaptively generating new rules while updating existing ones.

  1. Simsek, Mehmet Ulvi, Feyza Yildirim Okay, and Suat Ozdemir. "A deep learning-based CEP rule extraction framework for IoT data." The Journal of Supercomputing 77 (2021): 8563-8592.
  2. Lv, Jiayao, Bihui Yu, and Huajun Sun. "CEP rule extraction framework based on evolutionary algorithm." In 2022 11th International Conference of Information and Communication Technology (ICTech), pp. 245-249. IEEE, 2022.
  3. Roldán-Gómez, José, Jesús Martínez del Rincon, Juan Boubeta-Puig, and José Luis Martínez. "An automatic unsupervised complex event processing rules generation architecture for real-time IoT attacks detection." Wireless Networks (2023): 1-18.

Transfer Learning for Short Term Load Forecasting

Supervisor: Dilek Dustegor.
Status: available.
Date: 24/01/2023.
Accurate short-term load forecasting is essential in modern energy systems. It has become feasible in recent years through the application of machine learning and deep learning models, provided that significant training data is available. New buildings, however, have no historical data from which to predict their energy consumption. To counteract this lack of historical data, the literature reports Transfer Learning (TL) as a potential solution: models are developed on an existing source domain (e.g., a different building), and the pre-trained models are then used on the target domain (i.e., the new building). In this project, the student will develop, optimize, and compare deep learning-based transfer learning models for energy consumption forecasting, using historical consumption data from the EnerNOC dataset.
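
A deliberately simple sketch of the transfer idea with linear models on synthetic temperature-to-load data (the project would use deep networks and the actual dataset): the slope learned on the data-rich source building is kept, and only the baseline offset is re-estimated from the target building's few observations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Source building: plenty of history (temperature -> load).
Xs = rng.uniform(-5, 30, size=(500, 1))
ys = 120.0 + 3.0 * Xs[:, 0] + rng.normal(0, 1.0, size=500)

# Fit the source model by least squares.
As = np.hstack([Xs, np.ones((500, 1))])
w_src, *_ = np.linalg.lstsq(As, ys, rcond=None)

# Target (new) building: only 10 observations, same temperature
# sensitivity but a different baseline load. Transfer: keep the learned
# slope, refit only the offset.
Xt = rng.uniform(-5, 30, size=(10, 1))
yt = 80.0 + 3.0 * Xt[:, 0] + rng.normal(0, 1.0, size=10)
offset = float(np.mean(yt - Xt[:, 0] * w_src[0]))

pred = Xt[:, 0] * w_src[0] + offset
mae = float(np.mean(np.abs(pred - yt)))
```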

  1. Yassine Himeur, Mariam Elnour, Fodil Fadli, Nader Meskin, Ioan Petri, Yacine Rezgui, Faycal Bensaali, Abbes Amira, "Next-generation energy systems for sustainable smart cities: Roles of transfer learning," Sustainable Cities and Society 85 (2022) 104059, https://doi.org/10.1016/j.scs.2022.104059
  2. Y.-K. Juan, P. Gao, and J. Wang, "A hybrid decision support system for sustainable office building renovation and energy performance improvement," Energy and Buildings 42-3 (2010), https://doi.org/10.1016/j.enbuild.2009.09.006
  3. Y. Ahn and B. S. Kim, "Prediction of building power consumption using transfer learning based reference building and simulation dataset," Energy and Buildings 258 (2022), https://doi.org/10.1016/j.enbuild.2021.111717

Modeling and analysis of process models using Colored Petri nets

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
Petri nets are mathematical models that can be used to model distributed systems. In our research, we use XML-based Place/Transition nets to obtain verifiable models of business processes. To increase expressivity, we are interested in supporting Colored Petri nets. In this project, the student will investigate Colored Petri nets, design and implement them using the Petri Net Markup Language, and implement their semantics.
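
A toy sketch of Colored Petri net semantics in Python, assuming a minimal marking/guard representation (the project would instead implement these semantics on top of PNML models): places hold multisets of colored tokens, and a transition fires only if its input arcs can be satisfied and its guard accepts the consumed colors.

```python
from collections import Counter

class ColoredNet:
    """Toy Colored Petri net: a marking maps each place to a multiset
    (Counter) of token colors."""

    def __init__(self):
        self.marking = {}     # place -> Counter of colors

    def add_tokens(self, place, *colors):
        self.marking.setdefault(place, Counter()).update(colors)

    def fire(self, inputs, outputs, guard=lambda colors: True):
        """inputs/outputs: lists of (place, color) arcs. Returns True
        if the transition was enabled and fired."""
        need = Counter()
        for place, color in inputs:
            need[(place, color)] += 1
        enabled = all(self.marking.get(p, Counter())[c] >= n
                      for (p, c), n in need.items())
        if not enabled or not guard([c for _, c in inputs]):
            return False
        for place, color in inputs:
            self.marking[place][color] -= 1
        for place, color in outputs:
            self.add_tokens(place, color)
        return True

net = ColoredNet()
net.add_tokens("p1", "red", "blue")
fired = net.fire(inputs=[("p1", "red")], outputs=[("p2", "red")])
```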

  1. Kurt Jensen and Lars M. Kristensen. 2015. Colored Petri nets: a graphical language for formal modeling and validation of concurrent systems. Commun. ACM 58, 6 (June 2015), 61–70.
  2. Hillah, Lom M., et al. "A primer on the Petri Net Markup Language and ISO/IEC 15909-2." Petri Net Newsletter 76 (2009): 9-28.
  3. BPMPetriNet package

Verification of Security and Privacy concepts in BPMN Choreography diagrams

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
Where process models define the flow of activities of participants, choreographies describe the interactions between participants. Within such interactions, the security- and privacy-related concepts of separation of duties and division of knowledge are important. The former specifies that no one person has the privileges to misuse the system, whether by error or fraudulent behavior, while the latter requires the absence of total knowledge within a single person, such that the knowledge cannot be abused. The problem is: how do we specify such concepts, and what kind of model is required to verify them? In this project, we ask the student to devise an approach to formally specify and verify these concepts given a BPMN Choreography Diagram.
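
As a concrete illustration, a separation-of-duties check over a hypothetical execution log of (participant, task) pairs might look like the sketch below; the project targets formal specification and verification over choreography models, not log scanning.

```python
def violates_separation_of_duties(log, conflicting_tasks):
    """Check a log of (participant, task) pairs against a
    separation-of-duties constraint: no single participant may perform
    all tasks in the conflicting set."""
    for person in {p for p, _ in log}:
        done = {t for p, t in log if p == person}
        if conflicting_tasks <= done:   # person performed every conflicting task
            return True
    return False

log = [("alice", "submit_invoice"), ("alice", "approve_invoice"),
       ("bob", "pay_invoice")]
violation = violates_separation_of_duties(
    log, {"submit_invoice", "approve_invoice"})
```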

  1. OMG. Business process model and notation (BPMN) version 2.0, 2011.
  2. Pullonen, Pille & Matulevičius, Raimundas & Bogdanov, Dan. (2017). PE-BPMN: Privacy-Enhanced Business Process Model and Notation. 40-56.
  3. BPMVerification package

Obtaining Alignments from Transition Graphs

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
The practice of checking conformance of business process models has revolutionized the industry through the insight it creates into the process flows of businesses. Conformance checking entails matching an event log (which details events of past executions) against a business process model (which details the prescribed process flow) through a so-called alignment. Any deviation from the prescribed process flow is detected and reported. Generally, alignments are obtained by matching the so-called token replay of process models (e.g., Petri nets) against events in logs. Our Transition Graphs are also obtained from token replays, but offer more insight into parallel executions than regular Reachability Graphs. As a result, we are interested in the applicability of obtaining alignments using Transition Graphs, especially when matched against event logs that include lifecycle events and thus offer parallel execution data. In this project, we ask the student to implement and evaluate the applicability of such an approach.
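
The core idea of an alignment can be illustrated with a plain edit-distance computation between an observed trace and a single model run; this is a simplification, since real alignment algorithms [2] search over all runs of the model rather than one fixed run.

```python
def align(trace, model_run):
    """Cost of an optimal alignment between a trace and one model run:
    matched moves are free, a 'log move' (event with no model step) or
    'model move' (model step with no event) costs 1."""
    n, m = len(trace), len(model_run)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0:
                cost[i][j] = j          # only model moves remain
            elif j == 0:
                cost[i][j] = i          # only log moves remain
            else:
                match = 0 if trace[i - 1] == model_run[j - 1] else 2
                cost[i][j] = min(cost[i - 1][j - 1] + match,
                                 cost[i - 1][j] + 1,     # log move
                                 cost[i][j - 1] + 1)     # model move
    return cost[n][m]

# The observed trace skipped the prescribed "approve" step.
deviation_cost = align(["register", "check", "pay"],
                       ["register", "check", "approve", "pay"])
```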

  1. H. Groefsema, N.R.T.P. van Beest, and M. Aiello (2016) A Formal Model for Compliance Verification of Service Compositions. IEEE Transactions on Service Computing.
  2. Carmona, Josep, et al. "Conformance Checking." Springer, 2018.
  3. BPMVerification package

Compliance and conformance checking logs in Zeebe

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
The practice of checking conformance of business process models has revolutionized the industry through the insight it creates into the process flows of businesses. Conformance checking entails matching an event log (which details events of past executions) against a business process model (which details the prescribed process flow). As a result, logging has become extremely important for any business process execution engine, such as Zeebe, the business process execution engine of the open-source Camunda framework. For our research, we are interested in detailed event logs of process executions that include information such as the lifecycle state of tasks and that output to common event log formats. In this project, we ask the student to assess current event logs, evaluate the logging capabilities of Zeebe, and implement a package featuring detailed, extended, and customizable logging as well as live event hooks.

  1. Zeebe: Distributed Workflow Engine for Microservices Orchestration
  2. "IEEE Standard for eXtensible Event Stream (XES) for Achieving Interoperability in Event Logs and Event Streams," in IEEE Std 1849-2016 , vol., no., pp.1-50, 11 Nov. 2016
  3. Hompes, B. F. A. "Artifact Lifecycle Extension."

Have your own project suggestions?

We are available to supervise projects on various aspects of distributed systems, in particular involving

  • Service-Oriented and Cloud Computing
  • Pervasive Computing and Smart Environments
  • Network Centric Real-time Analytics
  • Energy Distribution Infrastructures
  • Adaptive Communication Middleware

If you have an idea of a specific project or would like to work generally in a specific area, please let us know about it and we can then narrow the project down.

Please feel free to contact us to discuss specific topics and options.