Available Projects

Many projects can be adjusted so that they fit the constraints of a Master's term, a Bachelor project, or an internship. Although some projects may have been available for some time, this does not mean they have become less relevant.

Internal Projects

  • [TAKEN] Reinforcement learning, planning, and Serverless architectures. Serverless computing is a new trend in which tiny applications are spawned whenever they are needed. Such architectures tend to scale well, and potentially scale to zero running instances, whenever a good scheduling service is available. With a large number of services available, composing higher level applications (consisting of serverless functions) can be a challenge. In this project the student will look into the use of reinforcement learning for creating planned pipelines in serverless architectures. The project consists of a literature study and an implementation component. Contact: Frank Blaauw.
  • [TAKEN] Data science in water management. The current water infrastructure in the northern part of the Netherlands generates a large amount of data. In this project, the student is asked to work on one (or more) data science projects related to water management. This includes, but is not limited to: leakage detection, usage prediction, anomaly detection, repair of missing data, and GIS analysis. Findings from this research might find their way back into the production systems of the water management company we work with. Please contact Frank if you would like to hear more about the available use cases. Contact: Frank Blaauw.
  • [TAKEN] Containers for analysis. Containerization is currently a hot topic in software engineering. Many companies rewrite their traditional applications to make use of this new and promising technology. In data science, however, not much is currently being done with containerization. In this project the student will compare different container types (e.g., Docker, RKT, Singularity, etc.) and will analyse which one works best for the purpose of data analysis. Furthermore, in an ongoing project in which containers will be used for performing data science, the student can help with implementing the optimal container platform. Contact: Frank Blaauw.
  • Verification of Service Compositions. Originally designed to support rigid repetitive units of work, business processes currently are required to support flexible and variable processes implemented as service compositions. These flexible compositions, however, must remain true to their initial process requirements and business rules. We developed a Java package that uses a model checking approach to verify the compliance of compositions against sets of formal specifications. The package takes a Petri net and a set of specifications as input, internally converts and optimizes the composition to a verifiable model, verifies the set of specifications against the model, and returns the results of verification. The package is included in the Apromore process analytics platform as a plugin. In this context, the following assignments are available:
    1. Implement BPMN file format to Petri net PNML file format conversion (BSc).
    2. Implement WS-BPEL file format to Petri net PNML file format conversion (BSc).
    3. Implement UML activity diagram to Petri net PNML file format conversion (BSc).
    4. Implement EPC file format to Petri net PNML file format conversion (BSc).
    5. Develop direct support for BPMN file format verification (BSc/int).
    6. Develop direct support for WS-BPEL file format verification (BSc/int).
    7. Develop direct support for UML activity diagram verification (BSc/int).
    8. Develop direct support for EPC file format verification (BSc/int).
    9. Develop an internal verification algorithm (MSc).
    10. Develop a visual specification design plugin for Apromore (MSc).
    Contact: Heerko.
  • [TAKEN] OnlineSuperLearner - large scale machine learning. The Distributed Systems group (University of Groningen), Developmental Psychology (University of Groningen), and MAP5 (Université Paris Descartes) are collaborating on a new, scalable implementation of the so-called Online SuperLearner. With this meta machine learning algorithm we aim to improve clinical decision making and psychopathology research. The Master's student involved in this project will help set up a scalable architecture, capable of dealing with tremendous amounts of data. Technologies of interest are, for example, Spark, Scala, Kafka, Hadoop, and many others. More information: doc. Contact: Frank Blaauw.
  • [TAKEN] Highly distributed in-browser computing. Large scale distributed computing is a necessity for many companies nowadays. The huge data centers of these companies have to process large amounts of data, but also need to deal with constantly failing nodes. Fortunately, dealing with failing and joining nodes is a problem solved ages ago. However, what if you would push this flexible architectural paradigm of joining and leaving to the limit? In this project the student is asked to create a JavaScript-based distributed computing system that runs in the browser. The separate browser (and NodeJS) clients should be managed through one master node, or a series of master nodes, and should be able to perform computations for them in a reliable and secure way. The student will implement this system and demonstrate how it works with a simple task that is distributed over a network of failing and joining nodes. Contact: Frank Blaauw.
  • Less-intrusive Context-Aware System. In this project, we are interested in collecting precise and fine-grained human context in a building. The context could be the users' location, occupancy, or activity. Such information is required by, for example, smart or intelligent buildings as a foundation for deciding which action to take to achieve predetermined goals, such as maximizing energy savings while compromising user comfort as little as possible. Off-the-shelf sensors that are less intrusive are preferred. Available topics include (but are not limited to): 1) the exploration of sensor features; 2) sensor fusion; 3) user behavior or device utilization pattern recognition. Furthermore, how to make the proposed system scalable and portable is also worth exploring. Contact: Azkario or Alexander.
  • Full-stack developer position for the development of a mental health platform. We are looking for a full-stack developer to start up a four-year research project at the department of developmental psychology (RUG). This research project will examine the dynamics of well-being and psychological distress in children and adolescents using both cross-sectional and diary assessments. The first phase of this project will focus on the perspectives of the parents while the second phase will focus on the perspectives of the children and adolescents themselves. The master student involved in this project will be responsible for creating a user-friendly, visually attractive, and scalable Web platform used in the first phase of the project (i.e., focusing on the parents). Via this platform we wish to (1) present the relevant questionnaires to parents and (2) provide personalized feedback. The technologies we use include (but are not limited to): Ruby on Rails, ReactJS, and Service Oriented Architectures. The master student will collaborate with four researchers from developmental psychology and computer science. The project will start at the end of January/beginning of February. Contact: Frank or Ando.
  • Energy future decision tool. The Netherlands is committed to reducing its greenhouse gas emissions by at least 80% by 2050, relative to 1990 levels. This demands a transition to more sustainable ways of generating and using energy. There are many different pathways to achieve this, and Dutch society will have to decide which to follow in the coming years. To give people a better understanding of the required decisions and to help us learn about how people decide on the energy future and which information they take into consideration, you will develop a tool (similar to the 2050 calculator tool) which simulates the Dutch energy transition and exposes people to the technology choices and the necessary tradeoffs. Contact: Alexander or Nadja Contzen.
  • Energy-efficient Data Centers Models. Decreasing energy consumption in data centers is a very important topic nowadays. This MSc project will focus on translating key aspects of data center operation to workable data center models. The project features a collaboration with Target Holding/CIT, who manage the university data center. In this project you will talk with data center operators to identify operational processes and key parameters, and then translate those into tools that can be used for predicting and modeling data center behavior. As such this is a unique opportunity to get a look behind the scenes of data center operation. For this project you will cooperate with the SMS-ENTEG group. More info: (pdf). Contact: Alexander or Tobias van Damme.
  • Sustainable Data Centers. In the context of a regional project in collaboration with KPN and an international project with Cognizant India, the research aims at studying techniques to save energy in modern data centers. Internet of things and machine learning are central to the approach. In particular, the project will involve one or more of the following items: 1) an environmental model of a data center for steering/controlling energy consumption (preferably generalisable); 2) an energy consumption model of a data center and its components; 3) a report containing recommendations for reducing the carbon-dioxide footprint of a data center; 4) adaptive planning and scheduling techniques to save energy in data centers. Contact: Alexander or Wico Mulder.
  • Optimization of integrated energy flow. The objective of the MERGE project is to study and develop an energy management system to promote the integration of different energy systems, mainly electricity and natural gas, at the level of the distribution network. The physical properties of energy carriers and the complexity of different infrastructures have to be taken into account while investigating the dynamic behaviour of the integrated grid. The proposed MSc project is about developing a program to optimize the flow of gas and electricity over an integrated distribution network, described by non-linear functions. The stages of the project include modeling the gas-electricity flow as a non-linear system, presenting an overview and comparison of available toolboxes and libraries (for example, MATLAB toolboxes, C++ or Java libraries), and developing a program to optimize electricity and gas flows on scalable networks. Contact: Laura.
  • Distributed Discrete Optimisation. Constraint satisfaction problems are a type of search problem with a broad range of applications, including planning, scheduling and resource allocation. Solving these problems with respect to a certain objective function allows optimisation of that particular problem, for example, optimising the energy consumption of a building. Unfortunately, this problem is NP-hard, meaning that all known algorithms that are guaranteed to find the optimal solution to a constraint satisfaction problem require exponential time in the worst case. Consequently, the problem size that algorithms can handle is limited (e.g. constructing a CSP to model an entire building would be impossible). When dealing with dynamic environments, the problem also has to be solved continuously and possibly in real time, requiring a solution to be available within a limited amount of time. Constraint networks of real-world problems are often sparse, however, and if the problem domain exhibits inherent locality, large-scale problems can be solved more efficiently by exploiting these structures (e.g. processes within a building are often mostly localised within a single room or area). This relative independence also facilitates parallelism in the search process, allowing a distributed cluster of machines to solve the problem faster and enabling scaling with respect to the problem size. Many projects related to this topic are available, such as realising a more efficient distributed search algorithm, dealing with dynamicity within the environment by continuously solving the problem, increasing the level of parallelism of the algorithm and more. Contact: Michel Medema.
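
The locality argument in the Distributed Discrete Optimisation project can be illustrated with a minimal sketch. It is not the group's algorithm: exhaustive search stands in for a real solver, the cost is assumed to be a sum of per-variable costs, and all names (the two rooms, `heat_a`, `vent_a`, the cost tables) are hypothetical. The sketch splits the constraint graph into connected components and solves each one on its own, which is also the point where the work could be handed to separate machines.

```python
from itertools import product


def connected_components(variables, constraints):
    """Group variables that are linked, directly or indirectly, by a constraint."""
    neighbours = {v: set() for v in variables}
    for scope, _ in constraints:
        for v in scope:
            neighbours[v].update(scope)
    seen, components = set(), []
    for v in variables:
        if v in seen:
            continue
        seen.add(v)
        stack, component = [v], []
        while stack:  # depth-first walk over the constraint graph
            current = stack.pop()
            component.append(current)
            for n in neighbours[current]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        components.append(sorted(component))
    return components


def solve_exhaustively(variables, domains, constraints, cost):
    """Try every combination of values; exponential in the number of variables."""
    best_cost, best = None, None
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        feasible = all(
            check(*(assignment[v] for v in scope)) for scope, check in constraints
        )
        if feasible:
            total = sum(cost[v][assignment[v]] for v in variables)
            if best_cost is None or total < best_cost:
                best_cost, best = total, assignment
    return best_cost, best


def solve_by_component(variables, domains, constraints, cost):
    """Solve each independent component separately and merge the results."""
    total, merged = 0, {}
    for component in connected_components(variables, constraints):
        members = set(component)
        local = [(s, c) for s, c in constraints if members.issuperset(s)]
        local_cost, local_best = solve_exhaustively(component, domains, local, cost)
        if local_best is None:  # one infeasible component makes the whole problem infeasible
            return None, None
        total += local_cost
        merged.update(local_best)
    return total, merged


# Hypothetical two-room building: each room needs enough combined heating and
# ventilation, and no constraint links the two rooms.
variables = ["heat_a", "vent_a", "heat_b", "vent_b"]
domains = {v: [0, 1, 2] for v in variables}
constraints = [
    (("heat_a", "vent_a"), lambda heat, vent: heat + vent >= 2),
    (("heat_b", "vent_b"), lambda heat, vent: heat + vent >= 3),
]
heating_cost = {0: 0, 1: 3, 2: 6}      # assumed energy cost per heating level
ventilation_cost = {0: 0, 1: 1, 2: 2}  # assumed energy cost per ventilation level
cost = {
    "heat_a": heating_cost, "vent_a": ventilation_cost,
    "heat_b": heating_cost, "vent_b": ventilation_cost,
}

if __name__ == "__main__":
    print(solve_exhaustively(variables, domains, constraints, cost))
    print(solve_by_component(variables, domains, constraints, cost))
```

Both calls find the same optimum, but the monolithic search enumerates 3^4 = 81 assignments while the decomposed search enumerates 9 per room. With more rooms the gap grows exponentially, which is why sparse, localised constraint networks are attractive for distributed solving.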

External Projects

  • [TAKEN/IN-PROGRESS] Machine learning for conversation classification. Messor is a company in Drachten specialized in sales support in a broad sense. They do lead generation, training, coaching and database enrichment. Messor creates and provides marketing campaigns for their clients who act in the domains of sales and marketing. These are mostly B2B, but increasingly also B2C. Clients of Messor are Samsung, Kaercher, Ricoh, Prescan, Tiktak / Segafredo, De Friesland Health insurer, SC Heerenveen, Friesland Lease etcetera. Messor is currently developing in-house tooling that can qualify and classify (telephone) conversations. This includes the use of Artificial Intelligence. Messor is looking for students who can play a role in this software development and/or in testing (parts of) the software and/or its implementation. Contact: Frank Blaauw.
  • Model development when the data may not be shared (MSc). Big Data and AI are becoming a bigger and bigger influence on our daily life. People and companies become increasingly aware of the potential of their data and the impact of losing control over who is using their data. Therefore, companies are no longer willing to share their (private, business critical) data. Traditionally, a company with data would send their data to another company that is developing an analysis model (for example, a Machine Learning model). TNO is investigating the possibilities of developing models in an environment where data is not allowed to be freely transported. One of the solutions is to no longer bring the data to the analysis model (D2A), but to bring the analysis model to the data (A2D). This master student assignment is about investigating and building a prototype of an approach to be able to develop analysis models in an A2D manner. Contact: Elena Lazovik or Toon Albers.
  • Dynamic on-the-fly switching between data sources for distributed Big Data analysis (MSc). Big Data and Data Science (AI & ML) are increasingly popular topics because of the advantages they can bring to companies. The data analysis is often done in long-running processes or even with an always-online streaming process. In both of these cases the data is retrieved from one or more data sources and analyzed or transformed, which results in output to a data "sink", such as another database or a message queue. At TNO we are looking into ways to update such long-running analysis processes at runtime, and part of that is updating the data sources: the longer a data analysis process is running, the more likely it is that new sources of data are introduced (think of user behavior data from a newly created part of a website, or sensor data from a new data provider) or that outdated data sources must be switched to newly created sources (think of switching from SQL to NoSQL). Your challenge is to develop a technical library that would support the switching of both streaming and historical data sources for distributed analysis platforms (for example Apache Spark) at runtime. Knowledge of distributed systems (through the Distributed Systems, Scalable Computing and Web & Cloud Computing courses) is key, and we are looking for people who enjoy both research and the actual development of software. TNO provides a physical cluster to run your experiments on. Internship compensation is also provided. Contact: Elena Lazovik or Toon Albers.
  • Runtime validation of software against constraints from context (MSc). Big Data and Data Science (AI & ML) are increasingly popular topics because of the advantages they can bring to companies. The data analysis is often done in long-running processes or even with an always-online streaming process. This data analysis is almost always done within different types of limitations: from users, from a business perspective, from hardware, and from the platforms on which the data analysis is running. At TNO we are looking into ways of verifying whether a running distributed analysis meets these limitations and requirements. We have some experience in working with constraints for IT systems. Your challenge would be to investigate and experiment on capturing the different kinds of constraints that can be defined on a system, and to develop a solution that can validate a running data analysis against these constraints. The validation against given constraints should happen at runtime when it is needed (for example, when new constraints are added). Knowledge of distributed systems (through the Scalable Computing course) and a good understanding of mathematics/logic are key, and we are looking for people who enjoy both research and the actual development of software. TNO provides a physical cluster to run your experiments on. Internship compensation is also provided. Contact: Elena Lazovik or Toon Albers.
  • Measures of Social Behaviour. Questionnaires are sensitive to several sources of noise. Above all, the moment-by-moment quantification of behaviour is impossible while using questionnaires. To manoeuvre away from these deficiencies we have developed a passive monitoring system that is based on the ubiquity of smartphone technology. Due to the advances in technology, the World Economic Forum announced in February 2016 that the world is entering its Fourth Industrial Revolution based on hyper-connectivity, data-driven solutions and artificial intelligence (World Economic Forum, 2016). Hyper-connectivity is characterised by a state of being constantly connected to individuals and machines through devices such as smartphones. Hyper-connectivity and large-scale data collection through smartphones are the fundamental elements of new technological initiatives in healthcare and biomedical research. These smartphone-based technological initiatives are largely due to the fact that the number of sensors embedded in smartphones has exploded over the past few years. Nowadays the majority of smartphones are equipped with sensors such as GPS, accelerometer, gyroscope, Wi-Fi, Bluetooth, camera and microphone. These smartphones aggregate a large amount of user-related data, which remain largely untouched in the context of research. Our ambition is to develop several objective measures of social behaviour by using the data collected through our passive monitoring application. The objective quantification of social behaviour is important since the majority of psychiatric disorders affect social behaviour. In the context of a master thesis, we would like a master student with good knowledge of R to develop several of these measures that are related to social behaviour and test these measures on data of psychiatric patients. Contact: Niels Jongs.
  • Passive Behavioural Monitoring (MSc). Advances in low power communication technologies and large scale data processing continue to give rise to the concept of mobile healthcare systems as an integral part of clinical care/research processes. This project will focus on the data that is collected by a passive behavioural monitoring system in which personal mobile devices are used as a measuring instrument. The data mainly consists of sensor and activity data which might allow us to differentiate between healthy and non-healthy individuals. In this project, our aim is to establish behavioural profiles which are related to neuropsychiatric disorders by using advanced data analysis and data mining techniques. These behavioural profiles are derived from the sensor and activity data collected from a passive behavioural monitoring system and are used to predict the onset or relapse of neuropsychiatric disorders. Additionally, our aim is to translate these behavioural profiles to animal behavioural models of which the data is collected in a controlled lab environment. Contact: Martrien Kas.
  • Flexible computing infrastructures (proposed by TNO Groningen). More information: pdf. Contact: Alexander or TNO directly (contact details in the PDF).
  • Privacy-friendly context-aware services (proposed by TNO Groningen). More information: pdf. Contact: Alexander or TNO directly (contact details in the PDF).
  • Interaction with devices in a household for the purpose of enabling smart grid services (proposed by TNO Groningen). More information: pdf. Contact: Alexander or TNO directly (contact details in the PDF).