RESOURCES LIBRARY

SCIENTIFIC PUBLICATIONS

Find all the scientific publications produced by NebulOuS partners, presenting the latest scientific findings of the project.

Authors
Verginadis, Yiannis; Sarros, Christos-Alexandros; Reyes de Los Mozos, Mario; Veloudis, Simeon; Piliszek, Radosław; Kourtellis, Nicolas; Horn, Geir

Abstract
Cloud Continuum is the paradigm that unifies and exploits resources from the far edge to public and private cloud offerings, as well as processing nodes with significant capacity in between. Nowadays, the combination of all these resources for augmenting modern hyper-distributed applications has become a necessity, especially considering the vast volumes of data, their velocity, and their variety, which constitute well-known challenges of Big Data processing. In this paper, we address the main research question of how a Cloud Continuum management platform should be structured to cope with the constantly increasing challenges and opportunities of the domain. We introduce the NebulOuS architecture vision towards accomplishing substantial research contributions in the realm of Cloud Continuum brokerage. We propose an advanced architecture that enables secure and optimal application provisioning, as well as reconfiguration over the Cloud Continuum. NebulOuS introduces a novel Meta-Operating System and platform, which is currently being developed, for enabling transient Cloud Continuum brokerage ecosystems that seamlessly exploit edge and fog nodes, in conjunction with multi-cloud resources, to cope with the requirements posed by low-latency applications.

Download here

Authors
Marta Różańska; Geir Horn

Abstract
Autonomic decisions are necessary for persistent Cloud application adaptation due to the dynamic execution context and changing workload. However, it is often difficult to accurately model the utility representing the application owner’s preferences as a mathematical function linking the utility with the monitored information from the application. We propose a systematic approach to utility function modelling for adaptive applications in the Cloud continuum. This method exploits a set of utility function templates, automated quality checks, and the impact of measurement and performance-indicator values on the utility value range. The method is evaluated on an illustrative example of a Cloud application, and the resulting utility function is compared to an existing, manually modelled one.
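
The paper’s actual templates and quality checks are not reproduced on this page; the Python sketch below is only a rough illustration of the idea, with assumed metric names (response_time_ms, cpu_load), bounds, and weights. It shows a template-style utility mapped onto [0, 1] together with a simple automated range check.

```python
# Minimal sketch of a template-based utility function for an adaptive Cloud
# application. Metric names, bounds, and weights are illustrative assumptions,
# not taken from the paper.

def normalise(value, worst, best):
    """Map a raw measurement onto [0, 1], where 1 is the preferred end."""
    span = best - worst
    score = (value - worst) / span
    return max(0.0, min(1.0, score))

def utility(metrics, weights=None):
    """Weighted sum of normalised performance indicators."""
    weights = weights or {"response_time_ms": 0.7, "cpu_load": 0.3}
    scores = {
        # Lower response time is better: 1000 ms is taken as the worst case.
        "response_time_ms": normalise(metrics["response_time_ms"], 1000.0, 50.0),
        # Lower CPU load is better: fully loaded nodes score 0.
        "cpu_load": normalise(metrics["cpu_load"], 1.0, 0.2),
    }
    return sum(weights[name] * scores[name] for name in weights)

def check_range(fn, samples):
    """Automated quality check: the utility must stay inside [0, 1]."""
    return all(0.0 <= fn(sample) <= 1.0 for sample in samples)

if __name__ == "__main__":
    observed = {"response_time_ms": 220.0, "cpu_load": 0.55}
    print(f"utility = {utility(observed):.3f}")
    print("range check passed:", check_range(utility, [observed]))
```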

Download here

Authors
Andreas Tsagkaropoulos; Yiannis Verginadis; Gregoris Mentzas

Abstract
Cloud services and applications are becoming ever more important for enterprises, which profit from the scalability, flexibility and pay-as-you-go model offered by Cloud service vendors. One of the most well-known standards in the domain, developed about ten years ago, is the TOSCA cloud application specification. TOSCA allows the definition of the structure and operation of cloud applications. Although considerable work has been done on the specification of monitoring and elasticity – of which a thorough analysis is provided – its quality and its integration in TOSCA can be significantly improved. In this work we suggest specific extensions covering the monitoring of processing components and the elasticity policies associated with them. Indicative TOSCA examples are provided to aid comprehension.

Download here

Authors
Moritz von Stietencron; Amir Azimian; Jan-Frederik Uhlenkamp; Johannes Gust; Karl Hribernik

Consult here 

Authors
Andreas Tsagkaropoulos; Yiannis Verginadis; Gregoris Mentzas

Abstract
As cloud computing continuum services become ever more important, the need for platforms that facilitate and manage their proper operation is unquestionable. An integral part of these platforms is a series of tools which complement day-one and day-two operations in the context of the application software lifecycle. This work introduces an approach towards a Service Level Objective (SLO) Violation Detection system, based on the perceived Severity of an imminent or predicted violation. This system leverages insights to stay operational and triggers appropriate reconfiguration actions by continuously considering the required conditions of good operation. The detailed architecture of the system, an overview of its operation, and the required interactions with other components of a cloud platform’s adaptation ecosystem are provided. Finally, potential future improvements are discussed.
Keywords: application reconfiguration, modelling, elasticity, cloud computing, service adaptation
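
The severity metric itself is defined in the paper; the sketch below only illustrates the general shape of such a detector, with an assumed severity formula (relative overshoot of an SLO threshold, scaled by prediction confidence) and a hypothetical trigger level.

```python
# Illustrative sketch of severity-based SLO violation detection. The severity
# formula, metrics, and trigger level are assumptions for demonstration only.

from dataclasses import dataclass

@dataclass
class SLO:
    metric: str
    threshold: float          # violation when the predicted value exceeds this

def severity(predicted_value, slo, confidence):
    """Relative overshoot of the threshold, weighted by prediction confidence."""
    overshoot = max(0.0, predicted_value - slo.threshold) / slo.threshold
    return min(1.0, overshoot * confidence)

def evaluate(predictions, slos, trigger_level=0.3):
    """Return the reconfiguration triggers implied by the predicted metrics."""
    actions = []
    for slo in slos:
        value, confidence = predictions[slo.metric]
        s = severity(value, slo, confidence)
        if s >= trigger_level:
            actions.append((slo.metric, round(s, 3)))
    return actions

if __name__ == "__main__":
    slos = [SLO("response_time_ms", 300.0), SLO("queue_length", 50.0)]
    predictions = {"response_time_ms": (420.0, 0.9), "queue_length": (30.0, 0.8)}
    print("triggered reconfigurations:", evaluate(predictions, slos))
```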

Download here

Authors
Gregory Koronakos; Dimitris Apostolou; Yiannis Verginadis; Andreas Tsagkaropoulos; Ioannis Patiniotakis; Gregoris Mentzas

Abstract
The proliferation of Internet of Things (IoT) and advancements in cloud and fog computing have catalyzed the development of the Cloud Continuum, an enhanced cloud system integrating diverse computational resources and services. This paper introduces the Cloud Fog Service Broker (CFSB) designed to facilitate the evaluation and selection of these resources within the cloud continuum ecosystem. The CFSB employs Multiple-Criteria Decision Making (MCDM) methods that rely on Mathematical Programming to assess resources based on multiple factors. This integration allows for a comprehensive assessment, ranking resources by synthesizing these diverse criteria and incorporating user preferences through weight restrictions in optimization models. The architecture, criteria, evaluation process and an illustrative example of the CFSB’s application highlight its role in decision-making for cloud resource selection.
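
The CFSB’s actual criteria and optimisation models are given in the paper; as a rough illustration of ranking resources with weight restrictions, the sketch below scores hypothetical candidates by solving, per candidate, a small linear programme (via scipy) that picks its most favourable admissible weights. The data, criteria, and bounds are assumptions.

```python
# Sketch of an MCDM-style ranking of candidate cloud/fog resources under
# user-preference weight restrictions. Not the CFSB's actual model.

import numpy as np
from scipy.optimize import linprog

# Rows: candidate resources; columns: benefit criteria normalised to [0, 1]
# (e.g. CPU capacity, memory, proximity, inverse cost).
scores = np.array([
    [0.9, 0.6, 0.4, 0.7],   # cloud VM, large
    [0.5, 0.5, 0.9, 0.8],   # fog node, close to the edge
    [0.3, 0.4, 1.0, 0.9],   # edge device
])

# Assumed weight restrictions encoding "proximity matters most".
bounds = [(0.1, 0.4), (0.1, 0.4), (0.2, 0.5), (0.1, 0.4)]

def best_score(i):
    """Most favourable weighted score of candidate i, keeping every
    candidate's score at or below 1 under the same weights."""
    result = linprog(
        c=-scores[i],                 # maximise scores[i] @ w
        A_ub=scores, b_ub=np.ones(len(scores)),
        bounds=bounds, method="highs",
    )
    return -result.fun

if __name__ == "__main__":
    ranking = sorted(range(len(scores)), key=best_score, reverse=True)
    for rank, i in enumerate(ranking, start=1):
        print(f"rank {rank}: candidate {i} (score {best_score(i):.3f})")
```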

Download here

Authors
Vasilis-Angelos Stefanidis; Yiannis Verginadis; Gregoris Mentzas

Abstract
Cloud computing has long been the ordinary and widely used method for processing, in a centralized manner, data that is produced at the edge. Nowadays, there is consensus on the value of altering this traditional approach and offloading processing tasks as close to the edge as possible. For this reason, the distributed machine learning method known as Federated Learning (FL) is introduced, with emphasis on Multi Cloud Edge environments. These kinds of distributed architectures are valuable assets in terms of efficient resource management, while predictions can further enhance this efficiency by introducing proactive adaptation capabilities for distributed applications. In this paper, we focus on deep learning local loss functions in multi-cloud environments and apply a federated learning distributed algorithm that enhances forecasting accuracy. This enhancement is introduced by applying inference in resource-constrained edge environments with Tiny Machine Learning methods. Additionally, two methods are applied during the distributed training process that enhance the FL approach and increase the prediction accuracy for application and resource monitoring metrics. We evaluate our approach by presenting experimental results using various datasets concerning infrastructure resource consumption. These results demonstrate, in most cases, a significant increase in the prediction accuracy of resource consumption values in time-series data, while reducing the resource requirements (i.e., disk space and execution time) of the machine learning training process. All of the above confirms the benefits of the proposed approach.
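
The paper’s FL algorithm, loss functions, and TinyML inference pipeline are not reproduced here; the sketch below only shows the general federated-averaging pattern for learning a forecasting model across several monitoring sites, using a plain linear model and synthetic data as assumptions.

```python
# Minimal federated-averaging sketch for forecasting resource-consumption
# metrics across several edge/cloud sites. Model, data, and aggregation rule
# are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, features, targets, lr=0.3, epochs=25):
    """A few local gradient steps of linear regression on one site's data."""
    w = weights.copy()
    for _ in range(epochs):
        predictions = features @ w
        gradient = features.T @ (predictions - targets) / len(targets)
        w -= lr * gradient
    return w

def federated_average(local_weights, sample_counts):
    """Aggregate local models, weighting each site by its sample count."""
    total = sum(sample_counts)
    return sum(w * (n / total) for w, n in zip(local_weights, sample_counts))

if __name__ == "__main__":
    true_w = np.array([0.7, 0.2, 0.1])        # hidden relation to be learned
    sites = []
    for _ in range(3):                         # three monitoring sites
        X = rng.uniform(size=(200, 3))
        y = X @ true_w + rng.normal(scale=0.01, size=200)
        sites.append((X, y))

    global_w = np.zeros(3)
    for _ in range(20):                        # federated training rounds
        locals_ = [local_update(global_w, X, y) for X, y in sites]
        global_w = federated_average(locals_, [len(y) for _, y in sites])

    print("learned weights:", np.round(global_w, 3))
```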

Download here

Authors
Geir Horn; Marta Różańska; Rudolf Schlatte

Abstract
Distributed applications running in the computing continuum operate in volatile environments where the execution context may change considerably over time. It takes significant effort to manually manage and maintain good performance of such applications. Autonomic computing allows the application to optimize whenever the execution context changes, which means maximizing its utility over and over again in response to monitoring information. The utility is maximized by changing the resource requirement parameters of the application’s components, and finding the optimal assignments of discrete variables takes time. This paper proposes a more efficient way to optimize the application’s utility by using the resources provided by the possible sites to host the application’s components directly in the utility maximization. The approach is evaluated positively for a realistic application, and it is currently being implemented in the NebulOuS platform acting as a meta operating system supporting applications deployed in the continuum.
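
As a rough illustration only (not the NebulOuS formulation), the sketch below evaluates the utility directly over a hypothetical catalogue of candidate sites and their offered resources, rather than searching over abstract resource-requirement parameters. The candidate list, utility shape, and weights are assumptions.

```python
# Rough sketch of choosing deployment sites by evaluating the utility directly
# on the candidate offerings. All numbers below are illustrative assumptions.

from itertools import product

# Hypothetical candidate sites with the resources they actually offer.
CANDIDATES = [
    {"name": "cloud-large", "cpu": 16, "memory_gb": 64, "cost": 1.20},
    {"name": "cloud-small", "cpu": 4,  "memory_gb": 16, "cost": 0.30},
    {"name": "fog-node",    "cpu": 8,  "memory_gb": 32, "cost": 0.55},
    {"name": "edge-box",    "cpu": 2,  "memory_gb": 8,  "cost": 0.10},
]

COMPONENTS = ["frontend", "worker"]

def utility(assignment):
    """Trade off provisioned capacity against hourly cost (assumed weights)."""
    capacity = sum(site["cpu"] / 16 + site["memory_gb"] / 64 for site in assignment)
    cost = sum(site["cost"] for site in assignment)
    return 0.6 * capacity / len(assignment) - 0.4 * cost

def best_deployment():
    """Exhaustively evaluate every component-to-site assignment."""
    best = max(product(CANDIDATES, repeat=len(COMPONENTS)), key=utility)
    return dict(zip(COMPONENTS, (site["name"] for site in best)))

if __name__ == "__main__":
    print("best assignment:", best_deployment())
```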

Download here

Authors
Paula Cecilia Fritzsche; Aleix Vila Cano; Guillermo Raya Garcia; Ashneet Khandpur Singh; Mario Reyes de los Mozos

Abstract
This study introduces a novel lightweight machine learning (ML) approach for detecting resource utilization anomalies within Kubernetes (K8s) clusters, developed as part of the NebulOuS meta-operating system and platform. The proposed solution integrates Netdata for data collection and combines a traditional algorithm with an immunological algorithm for real-time monitoring. This enables the detection of malicious or intrusive behaviors, adversarial procedures, and resource-related anomalies such as CPU or RAM overuse. The solution adheres to the principles of the cloud-edge continuum, facilitating seamless interaction between cloud and edge resources to enhance computational efficiency and response times. By deploying a detection container, it ensures the security and efficiency of applications running on K8s clusters at the edge. Given NebulOuS’ dynamic resource allocation and transient networking of interconnected, interoperable, and heterogeneous cloud infrastructures, incorporating AI-driven security measures is critical for strengthening K8s environments against evolving threats. This integration marks a significant advancement toward the vision of utility computing, where resource provisioning and anomaly detection are automated, adaptive, and resilient.
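
The paper’s detection container and Netdata integration are not reproduced here; the toy sketch below only illustrates the general flavour of combining a traditional statistical check with a simple negative-selection (“immunological”) detector over CPU/RAM samples. Thresholds, detector radius, and data are assumptions.

```python
# Toy sketch: statistical check plus negative-selection detectors over
# (cpu, ram) utilisation samples in [0, 1]. Illustrative assumptions only.

import random

random.seed(1)

def z_score_anomaly(value, history, threshold=3.0):
    """Traditional check: flag values far from the recent mean."""
    mean = sum(history) / len(history)
    variance = sum((v - mean) ** 2 for v in history) / len(history)
    std = variance ** 0.5 or 1e-9
    return abs(value - mean) / std > threshold

def train_detectors(normal_samples, n_detectors=200, radius=0.08):
    """Negative selection: keep random points that match no normal sample."""
    detectors = []
    while len(detectors) < n_detectors:
        candidate = (random.random(), random.random())     # (cpu, ram)
        if all(max(abs(candidate[0] - c), abs(candidate[1] - r)) > radius
               for c, r in normal_samples):
            detectors.append(candidate)
    return detectors

def immune_anomaly(sample, detectors, radius=0.08):
    """A sample is anomalous if any detector lies within the radius."""
    return any(max(abs(sample[0] - c), abs(sample[1] - r)) <= radius
               for c, r in detectors)

if __name__ == "__main__":
    normal = [(random.uniform(0.2, 0.4), random.uniform(0.3, 0.5)) for _ in range(300)]
    detectors = train_detectors(normal)
    cpu_history = [c for c, _ in normal]

    probes = [(0.31, 0.42), (0.95, 0.97)]     # one normal-looking, one overuse
    for cpu, ram in probes:
        flagged = z_score_anomaly(cpu, cpu_history) or immune_anomaly((cpu, ram), detectors)
        print(f"cpu={cpu:.2f} ram={ram:.2f} -> anomaly: {flagged}")
```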

Download here

Authors
Geir Horn; Yiannis Verginadis; Giannis Ledakis; Nikos Papageorgopoulos; Simeon Veloudis

Abstract
The massive proliferation of Internet of Things (IoT) devices in recent years has created demand for distributed computing architectures capable of processing data and making decisions closer to the network edge. This paper proposes a meta-operating system architecture that manages distributed applications and heterogeneous resources in a unified manner, akin to how traditional operating systems manage local resources. The proposed architecture emphasises efficient allocation and dynamic scaling of computation capacity across the Cloud Continuum, as well as minimization of data transfer to meet application requirements, whilst adhering to privacy and security constraints. Key architectural building blocks, quality assurance, and application control loops are explored. This work advances the design of distributed platforms to address the scientific challenges of computing, caching, and communication in the Cloud Continuum.

Download here

Authors
Francisco Alvarez; Ferran Diego; Marta Różańska; Robert Sanfeliu; Yiannis Verginadis; Geir Horn

Abstract
The management of workflows in distributed computing environments, such as cloud and edge infrastructures, presents unique challenges due to the dynamic nature of workloads and resource availability. This paper introduces RemWoW, a system designed to efficiently manage repeated workflows by optimizing resource allocation and task scheduling. RemWoW leverages a sophisticated architecture that integrates edge and fog nodes with multi-cloud resources to meet the demands of low-latency workloads. By focusing on workflow allocation and resource provisioning, RemWoW aims to minimize both execution time and resource costs, ensuring that workflows are executed efficiently in response to external events. The system’s architecture and its application in managing repeated workflows are demonstrated, highlighting its potential for enhancing workflow management in modern distributed computing environments.
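
RemWoW’s actual scheduling model is described in the paper; as a loose illustration of the trade-off it targets, the sketch below greedily places the tasks of one workflow run on edge, fog, or cloud nodes by weighing estimated completion time against monetary cost. The node catalogue, task sizes, and weights are assumptions.

```python
# Illustrative sketch of per-task placement for a repeated workflow.
# All numbers are assumed for demonstration; not RemWoW's scheduler.

NODES = [
    {"name": "edge",  "speed": 1.0, "cost_per_s": 0.01, "transfer_s": 0.0},
    {"name": "fog",   "speed": 2.5, "cost_per_s": 0.04, "transfer_s": 1.0},
    {"name": "cloud", "speed": 6.0, "cost_per_s": 0.10, "transfer_s": 5.0},
]

# Work units per task of one workflow instance (assumed values).
WORKFLOW = {"ingest": 4.0, "transform": 30.0, "publish": 0.5}

def score(task_work, node, time_weight=0.7, cost_weight=0.3):
    """Lower is better: weighted sum of estimated completion time and cost."""
    compute_time = task_work / node["speed"]
    completion = node["transfer_s"] + compute_time
    cost = compute_time * node["cost_per_s"]
    return time_weight * completion + cost_weight * cost

def allocate(workflow):
    """Greedy per-task placement on the cheapest time/cost trade-off."""
    return {task: min(NODES, key=lambda node: score(work, node))["name"]
            for task, work in workflow.items()}

if __name__ == "__main__":
    print("allocation for one workflow run:", allocate(WORKFLOW))
```

With these assumed numbers, the light publish task stays on the edge node, the heavy transform goes to the cloud, and ingest lands on the fog node.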

Download here

JOURNAL PAPERS

Find all the journal papers produced by NebulOuS partners

Authors
Marta Różańska; Geir Horn

Abstract
Cloud applications are built from a set of components, often deployed as containers, which can be deployed individually on separate Virtual Machines (VMs) or grouped on a smaller set of VMs. Additionally, the application owner may have inhibition constraints regarding the co-location of components. Finding the best way to deploy an application means finding the best groups of components and the best VMs, and it is not trivial because of the complexity arising from the number of possible options. The problem can be mapped onto many known combinatorial problems, such as bin-packing and knapsack formulations. However, these approaches often assume homogeneous resources and fail to incorporate the inhibition constraints. The main contributions of this paper are, firstly, a novel formulation of the grouping problem as a constrained Coalition Structure Generation (CSG) problem, including the specification of a value function which fulfills the criteria of a Characteristic Function Game (CFG). The CSG problem aims to determine stable and disjoint groups of players collaborating to optimize the joint outcome of the game, and a CFG is a common representation of a CSG, where each group is assigned a value and where the value of the game is the sum of the groups’ contributions. Secondly, the Integer-Partition (IP) CSG algorithm has been modified and extended to handle constraints. The proposed approach is evaluated with the extended IP algorithm and a novel exhaustive search algorithm establishing the optimum grouping for comparison. The evaluation shows that our approach with the modified algorithm evaluates, on average, significantly fewer combinations than the state-of-the-art CSG algorithm. The proposed approach is promising for optimized, constrained Cloud application management, as the modified IP algorithm can optimally solve constrained grouping problems of attainable sizes.
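
The modified IP algorithm and the paper’s value function are not reproduced here; the sketch below only shows the baseline idea of an exhaustive search over component groupings that respects inhibition (anti-co-location) constraints. The components, demands, constraint pairs, and value function are assumptions.

```python
# Exhaustive sketch of constrained component grouping, in the spirit of the
# coalition-structure view of the deployment problem. Illustrative only.

from itertools import combinations

COMPONENTS = {"api": 2, "worker": 4, "cache": 1, "db": 3}   # CPU demands
INHIBITED = {frozenset({"worker", "db"})}                    # must not co-locate
VM_CAPACITY = 6                                              # CPUs per VM

def partitions(items):
    """Yield every partition of a list of items as a list of groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for size in range(len(rest) + 1):
        for others in combinations(rest, size):
            group = [first, *others]
            remaining = [x for x in rest if x not in others]
            for tail in partitions(remaining):
                yield [group, *tail]

def feasible(group):
    """Reject groups that violate an inhibition (anti-co-location) constraint."""
    return not any(pair <= set(group) for pair in INHIBITED)

def group_value(group):
    """Assumed value: reward consolidation, penalise exceeding one VM."""
    demand = sum(COMPONENTS[c] for c in group)
    penalty = 10 * max(0, demand - VM_CAPACITY)
    return len(group) ** 2 - penalty

def best_grouping():
    candidates = (p for p in partitions(list(COMPONENTS)) if all(map(feasible, p)))
    return max(candidates, key=lambda p: sum(group_value(g) for g in p))

if __name__ == "__main__":
    print("best feasible grouping:", best_grouping())
```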

Download here