Distributed Systems

Bachelor Projects

Auto-Tuning of Management Configuration Parameters

Supervisor: Mahmoud Alasmar.
Status: available.
Date: 1/03/2024.
Management algorithms used by orchestration frameworks such as Kubernetes rely on a number of configuration parameters. Setting these parameters manually, based on prior experience and post-deployment monitoring, results in suboptimal states, which in turn affect the throughput and latency of the system. With growing system scale, workload diversity, and numbers of configuration parameters, more robust tuning techniques have become essential. Metis [1] and SelfTune [2] are two recent proposals aimed at auto-tuning system parameters. The two solutions are based on different optimization algorithms: the former uses Bayesian optimization to find an optimal configuration, while the latter uses reinforcement learning. In this project, you will explore and evaluate both solutions, highlighting their pros and cons by comparing the complexity, optimality, and scalability of each method and the cases where each is more applicable.
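
To make the tuning problem concrete, the sketch below runs a random-search baseline over two hypothetical parameters of a toy latency model (the parameter names and the latency surface are made up); Metis and SelfTune replace this kind of brute-force loop with sample-efficient Bayesian optimization and reinforcement learning, respectively.

```python
import random

def system_latency(batch_size, concurrency):
    """Toy stand-in for a measured tail latency; in practice this signal
    would come from post-deployment monitoring of the orchestrator."""
    return (batch_size - 32) ** 2 + (concurrency - 8) ** 2 + 5

def random_search(trials=200, seed=0):
    """Baseline tuner: sample configurations uniformly at random and
    keep the best one observed."""
    rng = random.Random(seed)
    best_cfg, best_lat = None, float("inf")
    for _ in range(trials):
        cfg = (rng.randint(1, 64), rng.randint(1, 32))
        lat = system_latency(*cfg)
        if lat < best_lat:
            best_cfg, best_lat = cfg, lat
    return best_cfg, best_lat

best_cfg, best_lat = random_search()
```

Comparing how many "system measurements" each method needs to reach a near-optimal configuration is exactly the kind of sample-efficiency question the project should answer.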

References:

  1. Zhao Lucis Li, Chieh-Jan Mike Liang, Wenjia He, Lianjie Zhu, Wenjun Dai, Jin Jiang, and Guangzhong Sun, "Metis: Robustly Tuning Tail Latencies of Cloud Systems," 2018 USENIX Annual Technical Conference, pp. 981–992, https://www.usenix.org/conference/atc18/presentation/li-zhao
  2. Ajaykrishna Karthikeyan, Nagarajan Natarajan, Gagan Somashekar, Lei Zhao, Ranjita Bhagwan, Rodrigo Fonseca, Tatiana Racheva, and Yogesh Bansal, "SelfTune: Tuning Cluster Managers," 20th USENIX Symposium on Networked Systems Design and Implementation, pp. 1097–1114, https://www.usenix.org/conference/nsdi23/presentation/karthikeyan

Mining sales data to identify customer profiles, and predict sales (with industrial partner)

Supervisor: Dilek Dustegor.
Status: available.
Date: 1/02/2024.

Mining sensors data for anomaly detection (with industrial partner)

Supervisor: Dilek Dustegor.
Status: available.
Date: 1/02/2024.

Machine learning model to optimize the trajectory of a robotic arm (with industrial partner)

Supervisor: Dilek Dustegor.
Status: available.
Date: 1/02/2024.

Data Driven Leakage Detection in Water Network

Supervisor: Dilek Dustegor.
Status: multiple available spots.
Date: 15/01/2024.
Leaks in water distribution networks (WDNs) are one of the main causes of water loss during transportation. Given water scarcity and a growing world population, minimizing water losses is an urgent humanitarian need. Recently, attempts have been made to use data-driven and machine learning techniques for leakage localization, but the capabilities and limitations of these methods are not yet well understood. In this project, the student will develop, optimize, and compare several machine learning models for leak detection in a water network.
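
As a minimal, hypothetical baseline (not one of the referenced methods), the sketch below flags leak-like pressure drops with a rolling z-score on synthetic readings; the project would pit learned models against this kind of simple detector on real WDN data.

```python
from statistics import mean, stdev

def detect_leaks(pressure, window=24, k=3.0):
    """Flag time steps whose pressure deviates more than k standard
    deviations from the preceding window's mean. A toy baseline
    detector, not a learned model."""
    alarms = []
    for t in range(window, len(pressure)):
        hist = pressure[t - window:t]
        mu, sigma = mean(hist), stdev(hist)
        if sigma > 0 and abs(pressure[t] - mu) > k * sigma:
            alarms.append(t)
    return alarms

# Synthetic feed: steady pressure with a small periodic wobble,
# followed by a sudden drop that mimics a leak.
readings = [50 + 0.1 * ((i * 7) % 5) for i in range(48)] + [42.0]
alarms = detect_leaks(readings)
```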

References:

  1. Marcos Quiñones-Grueiro, Marlon Ares Milián, Maibeth Sánchez Rivero, Antônio J. Silva Neto, Orestes Llanes-Santiago, "Robust leak localization in water distribution networks using computational intelligence," Neurocomputing 438 (2021) 195–208, https://doi.org/10.1016/j.neucom.2020.04.159
  2. Chan-Wook Lee and Do-Guen Yoo, "Development of Leakage Detection Model and Its Application for Water Distribution Networks Using RNN-LSTM," Sustainability 2021, 13, 9262, https://doi.org/10.3390/su13169262
  3. Jie Zhang, Xiaoping Yang and Juan Li, "Leak localization of water supply network based on temporal convolutional network," Meas. Sci. Technol. 33 (2022) 125302 (8pp), https://doi.org/10.1088/1361-6501/ac8ca5
  4. Zahra Fereidooni, Hooman Tahayori, Ali Bahadori‑Jahromi, "A hybrid model‑based method for leak detection in large scale water distribution networks," Journal of Ambient Intelligence and Humanized Computing (2021) 12:1613–1629, https://doi.org/10.1007/s12652-020-02233-2

In-network Computing supporting Real-Time Complex Event Processing.

Supervisor: Bochra Boughzala.
Status: available.
Date: 15/01/2024.
Complex Event Processing [1, 2] is a powerful paradigm for event pattern recognition, allowing the detection of complex events (e.g., fire) from a set of simple events (e.g., high temperature and a high level of smoke). Timely and efficient detection is crucial when complex event processing must be performed in real time. In this context, in-network computing [3] is a promising approach for accelerating real-time complex event processing [4]. With high-performance, programmable data planes, we can reduce latency and improve system throughput. However, in-network programming models, e.g., the P4 programming language, present their own challenges [5]. In this project, we focus on how in-network computing can support real-time complex event processing of a real-world telemetry dataset [6], which is based on fine-grained telemetry data from over 200k hard drives in a data center deployment. We aim to identify the most meaningful set of attributes for hard drive failure detection using in-network computing models.
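
A minimal sketch of the fire example in plain Python, assuming a made-up event format with `type` and `value` fields; an in-network implementation would express the same window-and-match logic in P4 on the data plane.

```python
from collections import deque

def make_detector(window_size=5, temp_threshold=50.0, smoke_threshold=0.7):
    """Hypothetical CEP operator: raise the complex event 'fire' when a
    high-temperature and a high-smoke simple event co-occur within a
    sliding window of recent events."""
    window = deque(maxlen=window_size)

    def feed(event):
        window.append(event)
        hot = any(e["type"] == "temperature" and e["value"] > temp_threshold
                  for e in window)
        smoky = any(e["type"] == "smoke" and e["value"] > smoke_threshold
                    for e in window)
        return "fire" if hot and smoky else None

    return feed

feed = make_detector()
out = [feed({"type": "temperature", "value": 22.0}),
       feed({"type": "temperature", "value": 61.0}),
       feed({"type": "smoke", "value": 0.9})]
```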

References:

  1. Luckham, D.C. and Frasca, B., 1998. Complex event processing in distributed systems. Computer Systems Laboratory Technical Report CSL-TR-98-754. Stanford University, Stanford, 28, p.16.
  2. Buchmann, A. and Koldehofe, B., 2009. Complex event processing.
  3. Tokusashi, Y., Dang, H.T., Pedone, F., Soulé, R. and Zilberman, N., 2019, March. The case for in-network computing on demand. In Proceedings of the Fourteenth EuroSys Conference 2019 (pp. 1-16).
  4. Boughzala, B. and Koldehofe, B., 2021, June. Accelerating the performance of data analytics using network-centric processing. In Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems (pp. 192-195).
  5. Gebara, N., Lerner, A., Yang, M., Yu, M., Costa, P. and Ghobadi, M., 2020, November. Challenging the stateless quo of programmable switches. In Proceedings of the 19th ACM Workshop on Hot Topics in Networks (pp. 153-159).
  6. Hard drive dataset.

Pre-training Graph Neural Networks on solving classical graph theory algorithms.

Supervisor: Andrés Tello.
Status: available.
Date: 08/12/2023.
Graph Neural Networks (GNNs) are a proven approach for solving various predictive problems on graph-structured data. Previous research has shown that GNNs trained to solve classical graph theory algorithms (e.g., shortest paths) can generalize to graphs larger than those in the training set. The aim of this work is to evaluate whether pre-training GNNs to solve such algorithms equips them with generalization capabilities for unrelated downstream predictive tasks in the water management domain. The student needs to (1) train a GNN model using different strategies that involve structural features of the graphs, e.g., shortest paths, minimum spanning trees, diameter, betweenness centrality, etc., and (2) use the pre-trained model to fine-tune a GNN for pressure/flow reconstruction in water distribution networks.
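
The kind of algorithmic target used for pre-training can be made concrete with Bellman-Ford, written here as synchronous rounds of min-aggregation message passing, the computational pattern a GNN learns to imitate in neural algorithm execution [2]. The graph below is made up.

```python
INF = float("inf")

def shortest_paths(adj, source):
    """Bellman-Ford as rounds of message passing: in each round every
    node aggregates (min) the distance messages from its neighbors."""
    n = len(adj)
    dist = [INF] * n
    dist[source] = 0.0
    for _ in range(n - 1):          # n-1 synchronous rounds suffice
        new_dist = dist[:]
        for u in range(n):
            for v, w in adj[u]:     # message u -> v along an edge of weight w
                if dist[u] + w < new_dist[v]:
                    new_dist[v] = dist[u] + w
        dist = new_dist
    return dist

# Tiny example graph as adjacency lists of (neighbor, weight) pairs.
adj = [[(1, 1.0), (2, 4.0)], [(2, 1.0), (3, 5.0)], [(3, 1.0)], []]
dist = shortest_paths(adj, 0)
```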

References:

  1. Xu, K., Zhang, M., Li, J., Du, S. S., Kawarabayashi, K. I., & Jegelka, S. (2020, October). How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks. In International Conference on Learning Representations.
  2. Veličković, P., Ying, R., Padovano, M., Hadsell, R., & Blundell, C. (2019, September). Neural Execution of Graph Algorithms. In International Conference on Learning Representations.
  3. Corso, G., Cavalleri, L., Beaini, D., Liò, P., & Veličković, P. (2020). Principal neighbourhood aggregation for graph nets. Advances in Neural Information Processing Systems, 33, 13260-13271.
  4. Xu, K., Li, J., Zhang, M., Du, S. S., Kawarabayashi, K. I., & Jegelka, S. (2019). What can neural networks reason about?. arXiv preprint arXiv:1905.13211.

Positional encodings in Graph Neural Networks for Geo-Located data.

Supervisor: Andrés Tello.
Status: available.
Date: 08/12/2023.
Positional encodings have been shown to be an effective technique in natural language processing for learning vector representations of words based on their positions in a sentence and their contexts. Their success has attracted researchers in the field of Graph Neural Networks (GNNs) to learn vector representations of the nodes in a graph that encode not only their features but also their locations or relative positions with respect to other nodes. In water distribution networks, the geo-location of each node is available. The aim of this project is therefore to explore different techniques for learning positional encodings that leverage the geo-coordinates of the nodes. The student will propose a GNN-based model that learns positional encodings for the nodes and combines them with vector representations of their features. The model will be evaluated on state estimation tasks in water distribution networks.
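
One simple starting point, sketched below with made-up coordinates: a fixed sinusoidal encoding of each node's (x, y) position at several frequencies, loosely in the spirit of the multi-scale grid-cell encoders of [3]. A learned encoder, as the project intends, would replace these fixed frequencies with trainable parameters.

```python
import math

def positional_encoding(x, y, dims=8, base=10000.0):
    """Sinusoidal encoding of a 2D coordinate: each frequency captures
    spatial structure at a different scale. Returns a list of length
    `dims` (half for x, half for y)."""
    enc = []
    for coord in (x, y):
        for i in range(dims // 4):
            freq = 1.0 / (base ** (4 * i / dims))
            enc.append(math.sin(coord * freq))
            enc.append(math.cos(coord * freq))
    return enc

vec = positional_encoding(52.1, 6.1)   # e.g., lat/lon-like coordinates
```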

References:

  1. Klemmer, K., Safir, N. S., & Neill, D. B. (2023, April). Positional encoder graph neural networks for geographic data. In International Conference on Artificial Intelligence and Statistics (pp. 1379-1389). PMLR.
  2. Fuchs, F., Worrall, D., Fischer, V., & Welling, M. (2020). Se (3)-transformers: 3d roto-translation equivariant attention networks. Advances in neural information processing systems, 33, 1970-1981.
  3. Mai, G., Janowicz, K., Yan, B., Zhu, R., Cai, L., & Lao, N. (2020). Multi-scale representation learning for spatial feature distributions using grid cells. arXiv preprint arXiv:2003.00824.
  4. You, J., Ying, R., & Leskovec, J. (2019, May). Position-aware graph neural networks. In International conference on machine learning (pp. 7134-7143). PMLR.

Graph Neural Networks for Metamodeling in Water Distribution Systems

Supervisor: Huy Truong.
Status: available.
Date: 8/12/2023.
Physics-based simulation is an essential tool for monitoring drinking water distribution systems (WDS). It takes a set of hydraulic parameters as input and outputs useful measurements such as pressure, head, and demand. However, simulation tools come with considerable issues, such as inefficient performance, the need for periodic calibration, and inflexibility in unseen scenarios. Data-driven surrogate models have emerged as an alternative that can alleviate these problems; approximating the simulator with such a surrogate is known as metamodeling. In this study, we aim to (i) construct a large dataset collected from numerous public benchmark WDSs and (ii) use it to train a Graph Neural Network, one such data-driven approach, to solve the metamodeling task for water distribution systems. The prerequisites are a machine learning background and experience with a deep learning framework (PyTorch, TensorFlow, or JAX).
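
As a baseline intuition for what a surrogate does, the sketch below fits a linear metamodel to synthetic demand-pressure pairs standing in for simulator runs (the data and coefficients are made up); the project replaces this with a GNN trained on outputs of a real solver such as WNTR/EPANET [2], exploiting the network topology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for simulator runs: demands at 3 junctions as
# inputs, and the pressure a hydraulic solver would return as output.
X = rng.uniform(0.0, 1.0, size=(200, 3))
true_w = np.array([-2.0, -1.0, -0.5])
y = 60.0 + X @ true_w + rng.normal(0.0, 0.01, size=200)

# The simplest possible metamodel: a linear map fitted by least squares.
A = np.hstack([X, np.ones((200, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

pred = A @ w
mse = float(np.mean((pred - y) ** 2))
```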

References:

  1. Kerimov, Bulat, et al. "Assessing the performances and transferability of graph neural network metamodels for water distribution systems." Journal of Hydroinformatics (2023): jh2023031.
  2. Klise, Katherine, Hart, David, Bynum, Michael, Hogge, Joseph, Haxton, Terranna, Murray, Regan, & Burkhardt, Jonathan. Water Network Tool for Resilience (WNTR). User Manual, Version 0.2.3. United States. https://doi.org/10.2172/1660790
  3. Kipf, T. N., & Welling, M. (2017). Semi-Supervised Classification with Graph Convolutional Networks. International Conference on Learning Representations.

Node masking in Graph Neural Networks

Supervisor: Huy Truong.
Status: available.
Date: 8/12/2023.
Working with real-world data often involves missing information, which can degrade the performance of deep learning models. Handled properly, however, missingness can boost the expressiveness of Graph Neural Network (GNN) models in node representation learning through a technique known as node masking: arbitrary nodal features in a graph are hidden, and the GNN is trained to recover the missing parts. The student can explore diverse masking strategies, such as zero masking, random node replacement, mean-neighbor substitution, shared learnable embeddings, and nodal permutation. These options should be compared and evaluated on a graph reconstruction task applied to a water distribution network. This study focuses on finding a generative technique that effectively enhances the performance of GNN models in semi-supervised transductive learning. Students interested in joining this project should have a machine learning background and experience with a deep learning framework.
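
The first three masking strategies can be sketched in a few lines of NumPy; this is a toy illustration on a made-up 3-node graph, not a training pipeline.

```python
import numpy as np

def mask_nodes(X, adj, idx, strategy="zero", rng=None):
    """Apply one of the masking strategies from the project description
    to node features X (n x d) at node indices idx; adj is a boolean
    adjacency matrix. Returns a masked copy of X."""
    rng = rng or np.random.default_rng(0)
    X = X.copy()
    if strategy == "zero":
        X[idx] = 0.0                                  # zero masking
    elif strategy == "random":
        donors = rng.integers(0, len(X), size=len(idx))
        X[idx] = X[donors]                            # random node replacement
    elif strategy == "mean":
        for i in idx:                                 # mean-neighbor substitution
            nbrs = np.where(adj[i])[0]
            X[i] = X[nbrs].mean(axis=0) if len(nbrs) else 0.0
    return X

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=bool)
masked = mask_nodes(X, adj, [0], strategy="mean")
```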

References:

  1. Hou, Zhenyu, Xiao Liu, Yuxiao Dong, Chunjie Wang, and Jie Tang. "GraphMAE: Self-Supervised Masked Graph Autoencoders." arXiv preprint arXiv:2205.10803 (2022).
  2. Abboud, Ralph, Ismail Ilkan Ceylan, Martin Grohe, and Thomas Lukasiewicz. "The surprising power of graph neural networks with random node initialization." arXiv preprint arXiv:2010.01179 (2020).
  3. Hajgató, Gergely, Bálint Gyires-Tóth, and György Paál. "Reconstructing nodal pressures in water distribution systems with graph neural networks." arXiv preprint arXiv:2104.13619 (2021).
  4. He, Kaiming, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. "Masked autoencoders are scalable vision learners." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000-16009. 2022.

Improving Performance of Network Traffic Flows

Supervisor: Saad Saleh.
Status: available.
Date: 7/12/2023.
With its growing user base, the Internet carries an ever wider range of traffic categories requiring different Quality of Service (QoS) guarantees in terms of delay, throughput, and jitter for end applications. Consequently, the underlying network transport mechanisms, such as congestion control and flow control, face challenges in providing satisfactory QoS. In this project, we aim to develop novel techniques for reducing network congestion. The project focuses on understanding current active queue management (AQM) algorithms and developing novel mechanisms to improve their performance.
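
The classic Random Early Detection (RED) algorithm, one natural starting point among the AQM techniques surveyed in [1], can be sketched in a few lines; the threshold values below are arbitrary.

```python
class RedQueue:
    """Minimal sketch of RED: an exponentially weighted moving average
    (EWMA) of the queue length drives a drop probability that rises
    linearly between a minimum and a maximum threshold."""

    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.1, weight=0.2):
        self.min_th, self.max_th = min_th, max_th
        self.max_p, self.weight = max_p, weight
        self.avg = 0.0

    def drop_probability(self, queue_len):
        # Update the EWMA of the instantaneous queue length.
        self.avg += self.weight * (queue_len - self.avg)
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

red = RedQueue()
probs = [red.drop_probability(q) for q in (0, 2, 20, 20, 20, 20)]
```

Because the EWMA responds slowly, sustained congestion raises the drop probability gradually instead of reacting to transient bursts, which is the behavior later schemes like CoDel and PIE refine.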

  1. R. Adams, "Active Queue Management: A Survey," in IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1425-1476, Third Quarter 2013, doi: 10.1109/SURV.2012.082212.00018.
  2. I. Järvinen and M. Kojo, "Evaluating CoDel, PIE, and HRED AQM techniques with load transients," 39th Annual IEEE Conference on Local Computer Networks, Edmonton, AB, Canada, 2014, pp. 159-167, doi: 10.1109/LCN.2014.6925768.
  3. N. Khademi, D. Ros and M. Welzl, "The new AQM kids on the block: An experimental evaluation of CoDel and PIE," 2014 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada, 2014, pp. 85-90, doi: 10.1109/INFCOMW.2014.6849173.

Towards AI-enabled Network Functions

Supervisor: Saad Saleh.
Status: available.
Date: 7/12/2023.
The Internet relies on network functions, such as queue management and load balancing, to provide satisfactory quality of service (QoS) to end users. However, current network functions rely on fixed, programmed rules, which limits their ability to handle anomalous activity. In this project, deep learning-based approaches to these network functions will be implemented and analyzed. The project focuses on programming a neural network and integrating it with a network simulator for performance analysis. Active queue management (AQM) algorithms, including RED, CoDel, PIE, CAKE, and BLUE, will be studied as targets for the deep learning-based approaches.

  1. Saad Saleh and Boris Koldehofe. "The Future is Analog: Energy-Efficient Cognitive Network Functions over Memristor-Based Analog Computations." HotNets 2023: Twenty-Second ACM Workshop on Hot Topics in Networks. ACM New York, NY, USA, 2023. doi: 10.1145/3626111.3628192
  2. Minsu Kim, Muhammad Jaseemuddin, and Alagan Anpalagan. "Deep reinforcement learning based active queue management for iot networks." Journal of Network and Systems Management 29, no. 3 (2021): 34. doi: 10.1007/s10922-021-09603-x
  3. Y. Xu, W. Xu, Z. Wang, J. Lin and S. Cui, "Load Balancing for Ultradense Networks: A Deep Reinforcement Learning-Based Approach," in IEEE Internet of Things Journal, vol. 6, no. 6, pp. 9399-9412, Dec. 2019, doi: 10.1109/JIOT.2019.2935010.

How Complex Event Processing Can Benefit from Federated Learning

Supervisor: Majid Lotfian Delouee.
Status: available.
Date: 01/12/2023.
Complex event processing (CEP) is a paradigm for analyzing input streams (e.g., IoT sensory data) to generate high-level information in real time. To achieve higher-quality results, a CEP middleware needs as much data as possible, while data owners may be unwilling to share it for privacy reasons. Federated Learning (FL) is a machine learning approach that allows data owners to train learning models locally and send the models instead of the raw sensed data, preserving privacy while improving the quality of results. In this research project, students are asked to elaborate on the main components and the pros and cons of both CEP and FL. Finally, students are expected to discuss and propose novel ideas showing how the FL paradigm can improve the performance of CEP systems.
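
The core of FL, federated averaging, fits in a few lines; the weights and client sizes below are made up for illustration.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Federated averaging: the server combines locally trained model
    weights, weighted by each client's data size, without ever seeing
    the raw (privacy-sensitive) sensor streams."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two hypothetical data owners train locally and share only weights.
w_a = np.array([1.0, 2.0])    # from a client with 100 samples
w_b = np.array([3.0, 4.0])    # from a client with 300 samples
global_w = fed_avg([w_a, w_b], [100, 300])
```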

  1. Banabilah, S., Aloqaily, M., Alsayed, E., Malik, N. and Jararweh, Y., 2022. Federated learning review: Fundamentals, enabling technologies, and future applications. Information Processing & Management, 59(6), p.103061.
  2. Dayarathna, M. and Perera, S., 2018. Recent advancements in event processing. ACM Computing Surveys (CSUR), 51(2), pp.1-36.
  3. Roldán, J., Boubeta-Puig, J., Martínez, J.L. and Ortiz, G., 2020. Integrating complex event processing and machine learning: An intelligent architecture for detecting IoT security attacks. Expert Systems with Applications, 149, p.113251.

Federated Rule Mining in Complex Event Processing

Supervisor: Majid Lotfian Delouee.
Status: Taken (unavailable).
Date: 01/12/2023.
Complex event processing (CEP) is a crucial paradigm for the real-time analysis of dynamic input streams. Typically, during the design phase, domain experts define CEP rules that enable the detection of relevant situations. The challenge arises from the highly dynamic nature of the environment: parameters like thresholds or window sizes for complex event detection often require real-time adjustment, and the diversity of data sources necessitates continuously defining new rules to leverage various data streams for more confident situation detection. Federated Learning (FL), on the other hand, allows data owners to train learning models locally, transmitting only the models instead of the raw data, which preserves privacy while enhancing result quality. In this research project, students are tasked with a comprehensive exploration of the main components and the pros and cons of both CEP and FL. The focal point lies in developing innovative ideas that showcase the potential of adaptively generating new rules while updating existing ones.

  1. Simsek, Mehmet Ulvi, Feyza Yildirim Okay, and Suat Ozdemir. "A deep learning-based CEP rule extraction framework for IoT data." The Journal of Supercomputing 77 (2021): 8563-8592.
  2. Lv, Jiayao, Bihui Yu, and Huajun Sun. "CEP rule extraction framework based on evolutionary algorithm." In 2022 11th International Conference of Information and Communication Technology (ICTech), pp. 245-249. IEEE, 2022.
  3. Roldán-Gómez, José, Jesús Martínez del Rincon, Juan Boubeta-Puig, and José Luis Martínez. "An automatic unsupervised complex event processing rules generation architecture for real-time IoT attacks detection." Wireless Networks (2023): 1-18.

Transfer Learning for Short Term Load Forecasting

Supervisor: Dilek Dustegor.
Status: available.
Date: 24/01/2023.
Accurate short-term load forecasting is essential in modern energy systems. It has become feasible in recent years through the application of machine learning and deep learning models, provided that significant training data is available. New buildings, however, have no historical data from which to predict their energy consumption. To counteract this lack of historical data, the literature reports Transfer Learning (TL) as a potential solution: models are developed on an existing source domain (e.g., a different building), and the pre-trained models are then used on the target domain (i.e., the new building). In this project, the student will develop, optimize, and compare deep learning-based transfer learning models for energy consumption forecasting, using historical consumption data from the EnerNOC dataset.
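
A deliberately simple sketch of the transfer idea with linear models on synthetic temperature-to-load data (the project would use deep networks and the actual dataset): the slope learned on the data-rich source building is kept, and only the baseline offset is re-estimated from the target building's few observations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Source building: plenty of history (temperature -> load).
Xs = rng.uniform(-5, 30, size=(500, 1))
ys = 120.0 + 3.0 * Xs[:, 0] + rng.normal(0, 1.0, size=500)

# Fit the source model by least squares.
As = np.hstack([Xs, np.ones((500, 1))])
w_src, *_ = np.linalg.lstsq(As, ys, rcond=None)

# Target (new) building: only 10 observations, same temperature
# sensitivity but a different baseline load. Transfer: keep the learned
# slope, refit only the offset.
Xt = rng.uniform(-5, 30, size=(10, 1))
yt = 80.0 + 3.0 * Xt[:, 0] + rng.normal(0, 1.0, size=10)
offset = float(np.mean(yt - Xt[:, 0] * w_src[0]))

pred = Xt[:, 0] * w_src[0] + offset
mae = float(np.mean(np.abs(pred - yt)))
```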

  1. Yassine Himeur, Mariam Elnour, Fodil Fadli, Nader Meskin, Ioan Petri, Yacine Rezgui, Faycal Bensaali, Abbes Amira, "Next-generation energy systems for sustainable smart cities: Roles of transfer learning," Sustainable Cities and Society 85 (2022) 104059, https://doi.org/10.1016/j.scs.2022.104059
  2. Y.-K. Juan, P. Gao, and J. Wang, "A hybrid decision support system for sustainable office building renovation and energy performance improvement," Energy and Buildings 42-3 (2010), https://doi.org/10.1016/j.enbuild.2009.09.006
  3. Y. Ahn and B. S. Kim, "Prediction of building power consumption using transfer learning based reference building and simulation dataset," Energy and Buildings 258 (2022), https://doi.org/10.1016/j.enbuild.2021.111717

Modeling and analysis of process models using Colored Petri nets

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
Petri nets are mathematical models that can be used to model distributed systems. In our research, we use XML-based Place/Transition nets to obtain verifiable models of business processes. To increase expressivity, we are interested in supporting Colored Petri nets. In this project, the student will investigate Colored Petri nets, design and implement them using the Petri Net Markup Language, and implement their semantics.
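
A toy sketch of Colored Petri net semantics in Python, assuming a minimal marking/guard representation (the project would instead implement these semantics on top of PNML models): places hold multisets of colored tokens, and a transition fires only if its input arcs can be satisfied and its guard accepts the consumed colors.

```python
from collections import Counter

class ColoredNet:
    """Toy Colored Petri net: a marking maps each place to a multiset
    (Counter) of token colors."""

    def __init__(self):
        self.marking = {}     # place -> Counter of colors

    def add_tokens(self, place, *colors):
        self.marking.setdefault(place, Counter()).update(colors)

    def fire(self, inputs, outputs, guard=lambda colors: True):
        """inputs/outputs: lists of (place, color) arcs. Returns True
        if the transition was enabled and fired."""
        need = Counter()
        for place, color in inputs:
            need[(place, color)] += 1
        enabled = all(self.marking.get(p, Counter())[c] >= n
                      for (p, c), n in need.items())
        if not enabled or not guard([c for _, c in inputs]):
            return False
        for place, color in inputs:
            self.marking[place][color] -= 1
        for place, color in outputs:
            self.add_tokens(place, color)
        return True

net = ColoredNet()
net.add_tokens("p1", "red", "blue")
fired = net.fire(inputs=[("p1", "red")], outputs=[("p2", "red")])
```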

  1. Kurt Jensen and Lars M. Kristensen. 2015. Colored Petri nets: a graphical language for formal modeling and validation of concurrent systems. Commun. ACM 58, 6 (June 2015), 61–70.
  2. Hillah, Lom M., et al. "A primer on the Petri Net Markup Language and ISO/IEC 15909-2." Petri Net Newsletter 76 (2009): 9-28.
  3. BPMPetriNet package

Verification of Security and Privacy concepts in BPMN Choreography diagrams

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
Where process models define the flow of activities of participants, choreographies describe the interactions between participants. Within such interactions, the security- and privacy-related concepts of separation of duties and division of knowledge are important. The former specifies that no one person has the privileges to misuse the system, whether by error or fraudulent behavior, while the latter requires the absence of total knowledge within a single person, such that the knowledge cannot be abused. The problem is: how do we specify such concepts, and what kind of model is required to verify them? In this project, we ask the student to devise an approach to formally specify and verify these concepts given a BPMN Choreography Diagram.
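
As a concrete illustration, a separation-of-duties check over a hypothetical execution log of (participant, task) pairs might look like the sketch below; the project targets formal specification and verification over choreography models, not log scanning.

```python
def violates_separation_of_duties(log, conflicting_tasks):
    """Check a log of (participant, task) pairs against a
    separation-of-duties constraint: no single participant may perform
    all tasks in the conflicting set."""
    for person in {p for p, _ in log}:
        done = {t for p, t in log if p == person}
        if conflicting_tasks <= done:   # person performed every conflicting task
            return True
    return False

log = [("alice", "submit_invoice"), ("alice", "approve_invoice"),
       ("bob", "pay_invoice")]
violation = violates_separation_of_duties(
    log, {"submit_invoice", "approve_invoice"})
```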

  1. OMG. Business process model and notation (BPMN) version 2.0, 2011.
  2. Pullonen, Pille & Matulevičius, Raimundas & Bogdanov, Dan. (2017). PE-BPMN: Privacy-Enhanced Business Process Model and Notation. 40-56.
  3. BPMVerification package

Obtaining Alignments from Transition Graphs

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
The practice of checking conformance of business process models has revolutionized the industry through the insight it creates into the process flows of businesses. Conformance checking entails matching an event log (which details events of past executions) against a business process model (which details the prescribed process flow) through a so-called alignment. Any deviation from the prescribed process flow is detected and reported. Generally, alignments are obtained by matching the so-called token replay of process models (e.g., Petri nets) against events in logs. Our Transition Graphs are also obtained from token replays, but offer more insight into parallel executions than regular Reachability Graphs. As a result, we are interested in the applicability of obtaining alignments using Transition Graphs, especially when matched against event logs that include lifecycle events and thus offer parallel execution data. In this project, we ask the student to implement and evaluate the applicability of such an approach.
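
The core idea of an alignment can be illustrated with a plain edit-distance computation between an observed trace and a single model run; this is a simplification, since real alignment algorithms [2] search over all runs of the model rather than one fixed run.

```python
def align(trace, model_run):
    """Cost of an optimal alignment between a trace and one model run:
    matched moves are free, a 'log move' (event with no model step) or
    'model move' (model step with no event) costs 1."""
    n, m = len(trace), len(model_run)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0:
                cost[i][j] = j          # only model moves remain
            elif j == 0:
                cost[i][j] = i          # only log moves remain
            else:
                match = 0 if trace[i - 1] == model_run[j - 1] else 2
                cost[i][j] = min(cost[i - 1][j - 1] + match,
                                 cost[i - 1][j] + 1,     # log move
                                 cost[i][j - 1] + 1)     # model move
    return cost[n][m]

# The observed trace skipped the prescribed "approve" step.
deviation_cost = align(["register", "check", "pay"],
                       ["register", "check", "approve", "pay"])
```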

  1. H. Groefsema, N.R.T.P. van Beest, and M. Aiello (2016) A Formal Model for Compliance Verification of Service Compositions. IEEE Transactions on Service Computing.
  2. Carmona, Josep, et al. "Conformance Checking." Springer, 2018.
  3. BPMVerification package

Compliance and conformance checking logs in Zeebe

Supervisor: Heerko Groefsema.
Status: available.
Date: 31/10/2022.
The practice of checking conformance of business process models has revolutionized the industry through the insight it creates into the process flows of businesses. Conformance checking entails matching an event log (which details events of past executions) against a business process model (which details the prescribed process flow). As a result, logging has become extremely important for any business process execution engine, such as Zeebe, the business process execution engine of the open-source Camunda framework. For our research, we are interested in detailed event logs of process executions that include information such as the lifecycle state of tasks and that output to common event log formats. In this project, we ask the student to assess current event logs, evaluate the logging capabilities of Zeebe, and implement a package featuring detailed, extended, and customizable logging as well as live event hooks.

  1. Zeebe: Distributed Workflow Engine for Microservices Orchestration
  2. "IEEE Standard for eXtensible Event Stream (XES) for Achieving Interoperability in Event Logs and Event Streams," in IEEE Std 1849-2016 , vol., no., pp.1-50, 11 Nov. 2016
  3. Hompes, B. F. A. "Artifact Lifecycle Extension."

Have your own project suggestions?

We are available to supervise projects on various aspects of distributed systems, in particular involving

  • Service-Oriented and Cloud Computing
  • Pervasive Computing and Smart Environments
  • Network Centric Real-time Analytics
  • Energy Distribution Infrastructures
  • Adaptive Communication Middleware

If you have an idea of a specific project or would like to work generally in a specific area, please let us know about it and we can then narrow the project down.

Please feel free to contact us to discuss specific topics and options.