IPCEI-CIS Reference Architecture

The digital future of Europe requires a cohesive and interoperable infrastructure, one that spans cloud and edge environments, integrates AI and data services, and ensures security, sustainability, and sovereignty across borders. To support this, the IPCEI-CIS Reference Architecture (ICRA) defines a common framework for designing, deploying, and operating cloud-edge systems in a federated, multi-provider landscape, representing a strategic instrument within the 8ra Initiative. 

Why a Reference Architecture?

As cloud and edge technologies rapidly expand, so do the risks of fragmentation. Across sectors and regions, systems are being developed in isolation, using different frameworks, proprietary solutions, and incompatible standards. This leads to duplicated effort, integration challenges, and barriers to scalability. The IPCEI-CIS Reference Architecture provides a structured response to this complexity. It offers:

  • A shared language - enabling technology providers, developers, operators, and governance actors to align how they describe, integrate, and manage systems.
  • A modular structure - breaking down cloud-edge environments into clear, interoperable components that can evolve independently while remaining compatible.
  • A flexible foundation - designed to support innovation, adapt to sector-specific needs, and remain open to emerging technologies and use cases.

By providing this common framework, the Reference Architecture supports scalable, secure, and federated digital infrastructure across Europe.

What the Architecture Covers

The RA is structured into eight core layers, each representing a level of abstraction in the cloud-edge continuum, from physical infrastructure up to applications and services. These are complemented by four cross-cutting domains that ensure cohesion across all layers. A central principle of the IPCEI-CIS RA is federation: enabling multiple, independent providers to work together while preserving autonomy. Through standardized interfaces and governance, services can be composed across organizational and geographic boundaries, without lock-in. The RA does not prescribe a single implementation. Instead, it:

  • defines the key building blocks and their interactions,
  • encourages the use of open standards and interoperable interfaces, and
  • supports blueprints - adapted subsets of the RA - for specific industries, use cases, or regulatory needs.

This ensures flexibility and scalability, whether applied in a local deployment or across a continent-wide continuum.

IPCEI-CIS Reference Architecture

A comprehensive overview of its layers and components.

Management
  • Logging
  • Monitoring & Alerting
  • Accounting & Charging
  • Performance Management
  • Fault Management
  • Catalog/ Repository
  • Ops automation (XCps)
Security and Compliance
  • Identity & Access Management
  • Identity Management
  • Key Management Service
  • Audit log
  • Account Management
  • Service Compliance Verification
Sustainability
  • Benchmark, Metrics and Monitoring
  • Energy Consumption and Carbon Emission Optimization
  • Sustainable workload placement and scheduling
  • Renewable Energy Management
  • Cooling and Heat Management
Federation
  • Federation Manager
  • Other domain components...

Layers (top to bottom):
  • Application
  • Data
  • AI
  • Service Orchestration
  • Cloud Edge Platform
  • Virtualization
  • Network Systems, SDN controllers
  • Physical Cloud Edge Resources
  • Physical Network Resources

Application Layer

Components: App Designer, App Packager, API Gateway, App Monitoring, App Accounting & Charging, App Catalog, other domain components.

The Application layer provides the functionality required to design and develop applications and functions and to expose them for use. It includes end-user applications and function catalogs.

Application Designer

This component enables developers to design, create and customize applications using intuitive interfaces or predefined templates. It facilitates rapid development, integration and delivery of tailored applications using automated CI/CD practices (DevOps). The Application Designer facilitates the description of the application in terms of:

  • the set of application components it is made of and how they are connected (service function chain),
  • the runtime environment each application component will require, including the set of functions/services to support its execution, and
  • the attributes that may allow the selection of the computing node to host it (hardware requirements, latency, privacy, etc.).
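As an illustration, such a descriptor can be sketched in code. The field names and the example application below are assumptions for illustration, not a normative schema:

```python
from dataclasses import dataclass, field

@dataclass
class ComponentSpec:
    """One application component and its runtime needs (illustrative fields)."""
    name: str
    runtime: str                                    # e.g. "container", "vm", "serverless"
    services: list = field(default_factory=list)    # supporting functions/services
    placement: dict = field(default_factory=dict)   # hardware, latency, privacy hints

@dataclass
class ApplicationDescriptor:
    """An application as a set of components plus their service function chain."""
    name: str
    components: list
    chain: list                                     # ordered (source, target) connections

app = ApplicationDescriptor(
    name="video-analytics",
    components=[
        ComponentSpec("camera-ingest", "container",
                      placement={"latency_ms": 10, "location": "edge"}),
        ComponentSpec("ai-inference", "container",
                      services=["gpu-runtime"], placement={"hardware": "gpu"}),
        ComponentSpec("dashboard", "container", placement={"location": "cloud"}),
    ],
    chain=[("camera-ingest", "ai-inference"), ("ai-inference", "dashboard")],
)
```

The orchestration layers below can consume such a descriptor to select hosting nodes and wire the chain.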

Application Packager

The Application Packager supports the packaging of applications for their deployment in the Cloud-Edge continuum. It facilitates the automation of application deployment and updates (DevOps, both traditional and AI-assisted), providing an integrated toolkit that enables quick, secure and innovative ways to deploy cloud-aware applications. It also provides tools for automatic verification and validation (CV/CT) of the application and its supply chain before its final packaging.

API Gateway

This component provides the interface to invoke and use the applications contained in the catalog. It verifies the identity and authenticates the user, and checks their authorization to use the application before providing access to it.
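A minimal sketch of this check, assuming a simple token store and a per-user permission map as stand-ins for the real Identity & Access Management integration:

```python
# Illustrative stand-ins for IAM integration (not part of the RA itself).
TOKENS = {"token-abc": "alice"}                  # token -> authenticated identity
PERMISSIONS = {"alice": {"video-analytics"}}     # identity -> allowed applications

def gateway_access(token: str, application: str) -> bool:
    """Authenticate the caller, then authorize access to the requested application."""
    user = TOKENS.get(token)                     # authentication: who is calling?
    if user is None:
        return False
    return application in PERMISSIONS.get(user, set())   # authorization
```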

Application Monitoring

It tracks application usage and execution, monitors the performance and identifies abnormal behavior and suboptimal use of resources.

Application Catalog

It implements a directory of applications and functions that providers have made available. Each entry describes the characteristics of the application and the environment it requires for its execution (runtime, services, hardware characteristics).

Application Accounting and Charging

This component implements the accounting of application usage and provides online charging information for the customer to track application expenditure in real-time.

Data Layer

Components: Data Pipelines, Data Modelling, Data Exposure, Data Federation, Data Catalog, Data Policy Control, other domain components.

This layer offers simple, scalable, sustainable and trustable mechanisms for collection, processing and exchange of data distributed over the cloud-edge continuum.

Data Pipelines

This component provides the functionality for data collection, including the connectors to integrate with the data sources and the capabilities for data curation and pre-processing that ensure its quality and readiness for analytics, insight generation, training, modelling or inferencing phases.
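The flow can be sketched as a toy pipeline; the sensor records and the Fahrenheit-to-Celsius normalization step are illustrative assumptions:

```python
def connector(source):
    """Stand-in for a data-source connector (e.g. a sensor feed or DB export)."""
    yield from source

def curate(records):
    """Drop incomplete records and normalize units (Fahrenheit to Celsius)."""
    for r in records:
        if r.get("value") is None:      # quality check: skip incomplete data
            continue
        yield {"sensor": r["sensor"],
               "value_c": round((r["value"] - 32) * 5 / 9, 2)}

raw = [{"sensor": "s1", "value": 212.0},
       {"sensor": "s2", "value": None}]
clean = list(curate(connector(raw)))    # ready for analytics or training
```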

Data Modelling

This component provides data cataloguing, enabling exposure and discovery at scale so that data can easily be searched, found and browsed over a distributed environment.

Data Exposure

The Data Exposure component provides customers with standard mechanisms and interfaces for safe and controlled access to data. It includes capabilities for making data offers and contracting data acquisition, as well as identity checking and data-access authentication and authorization.

Data Policy Control

Data Policy Control sets the required policies for data sharing, providing a safe, controlled and regulation-compliant environment for data exchange. It allows the data owner to manage the permissions to access its data: who can access it, under which conditions and for which purposes.
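A minimal sketch of such a policy check; the dataset, consumer, purpose and condition names are illustrative assumptions:

```python
# Owner-defined sharing policy: allowed consumers, purposes and conditions.
POLICY = {
    "dataset-traffic": {
        "allowed_consumers": {"city-planner"},
        "allowed_purposes": {"research", "planning"},
        "conditions": {"anonymized": True},
    }
}

def may_access(dataset: str, consumer: str, purpose: str, anonymized: bool) -> bool:
    """Grant access only if consumer, purpose and conditions all match the policy."""
    policy = POLICY.get(dataset)
    if policy is None:
        return False
    return (consumer in policy["allowed_consumers"]
            and purpose in policy["allowed_purposes"]
            and anonymized == policy["conditions"]["anonymized"])
```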

Data Catalog

Data Catalog provides efficient storage and indexing of data to facilitate browsing, searching and finding data over a distributed environment.

Data Federation

Data Federation enables standard mechanisms and interfaces (connectors) for partnering in the provision of datasets, providing a unified view of data catalogs and databases from multiple data providers. This component enables real-time data exchange across companies using data mesh principles, connecting distributed and heterogeneous actors over the cloud-edge continuum, keeping data owners in full control of their data. In order to create and maintain a coherent federated multi-provider Cloud Edge Continuum, Data Federation capabilities should be designed consistently with the other federation capabilities described in this document.

AI Layer

Components: Cloud-Edge Training, Cloud-Edge Inference, Cloud-Edge Agent Mgt, Federated Learning, AI Model Catalog, AI Explainability, other domain components.

The AI layer provides a set of advanced functionalities to apply Artificial Intelligence (AI) on distributed sets of data and processing resources complementing the centralized cloud-based solutions. This layer facilitates the seamless integration of AI model lifecycle management into the cloud-edge continuum, enabling efficient operations across distributed environments. Its capabilities are tailored to the specific characteristics of the cloud-edge continuum, ensuring optimal performance and resource utilization. It provides:

  • end-to-end solutions for the AI lifecycle, including preprocessing, model training, and evaluation,
  • inference with deployment, monitoring, operations across cloud and edge, and
  • agents with tools and workflows for coordinated AI execution

Cloud-Edge Training

This component facilitates the dynamic and adjustable training of AI models across cloud and edge environments, ensuring scalability, reduced latency, and optimized resource utilization.

Cloud-Edge Inference

The Inference components facilitate real-time deployment and execution of trained AI models on edge devices with efficient synchronization with the cloud for updates, monitoring, and enhancements.

Cloud-Edge Agent Manager

The Cloud-Edge Agent Manager enables the deployment and management of agents and agentic workflows on edge and hybrid edge-cloud deployments, creating an agentic mesh.

AI Model Catalog

This component contains trained foundation models: LLMs, SLMs and multimodal LLMs, available in multiple languages and covering multiple data types: text, images, video, code, etc. These models provide support for Natural Language Processing (NLP), Machine Translation (MT), speech processing, text analysis, information extraction, summarization, and text and speech generation. They can be fine-tuned and adapted to specific use cases using techniques like RAG, model quantization, pruning or distillation. The catalog contains multilingual and multimodal LLMs tailored to diverse EU languages, capable of understanding and processing diverse data types, including text, images, and multimedia. These models address the scarcity of generative AI solutions in non-English languages, ensuring semantic precision, completeness, and compliance with the AI Act.

Federated Learning

AI workloads can be split across multiple nodes with central orchestration for scalability and efficiency (Distributed AI). AI Federation enables autonomous nodes to collaborate securely, ensuring privacy and sovereignty. Together, they balance task-sharing efficiency with autonomy.

In distributed AI training, the AI model is generated at a central point by combining the models produced by different training agents distributed across an ecosystem of federated AI service providers or owners. The distributed training agents work locally on local datasets, reducing the need to transfer data to a central location for training.

This component makes it possible to use and orchestrate AI resources across multiple providers to collaboratively perform a specific machine learning training task. It leverages a federated network of AI capabilities geographically distributed across the multi-provider Cloud Edge Continuum, enabling seamless resource sharing and scaling while maintaining sovereignty and compliance. It ensures efficient distribution of AI computational workloads, minimizes data movement, and facilitates parallel model training without requiring centralized data aggregation, thus preserving data privacy and autonomy while enhancing overall system performance. In order to create and maintain a coherent federated multi-provider Cloud Edge Continuum, Federated Learning capabilities should be designed consistently with the other federation capabilities described in this document.
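The core idea can be sketched as a toy federated-averaging round, in which each provider trains on its local data and only model parameters, never raw data, are combined by the coordinator. The 1-D least-squares model and learning rate are illustrative assumptions:

```python
def local_update(w: float, data: list, lr: float = 0.1) -> float:
    """One gradient step of a 1-D least-squares fit (y ≈ w * x) on local data."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def federated_round(global_w: float, providers: list) -> float:
    """Average locally trained weights, weighted by local dataset size."""
    total = sum(len(d) for d in providers)
    return sum(local_update(global_w, d) * len(d) for d in providers) / total

# Two providers hold disjoint local datasets, both consistent with y = 2x;
# only weights are exchanged, never the data itself.
providers = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(50):
    w = federated_round(w, providers)   # w converges toward 2.0
```

Real deployments exchange full model updates and add secure aggregation, but the data-stays-local property is the same.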

AI Explainability

This Explainable AI component ensures transparency by providing interpretable insights into AI decision-making processes. It supports compliance, accountability, and trust by enabling users and regulators to understand, audit, and validate AI models while respecting privacy and data sovereignty.

Service Orchestration

Components: Service Orchestrator, Service Federation, App Performance Management, App Repository, other domain components.

Service Orchestration refers to the integration and management of multiple cloud and edge services and applications in a unified and automated way across diverse multi-provider environments. In the context of multi-cloud and edge ecosystems, service orchestration is essential for handling the complexities of deploying and managing applications at scale, especially when dealing with distributed resources across cloud and edge infrastructures.

Service Orchestrator

The Service Orchestrator ensures efficient task execution, load balancing and real-time operations. For example, it could communicate with the Multi-Cloud Orchestrator, which manages the virtualized infrastructure layer and offers a single unified environment for application development and monitoring. This allows applications and services to be deployed seamlessly across multiple platforms, optimizing resource allocation and reducing operational complexity. Alternatively, the Service Orchestrator may directly or indirectly interact with the underlying capabilities of the cloud platform or virtualization management layer to orchestrate workload execution.

The Service Orchestrator automates application and tenant deployment, and lifecycle management processes. By automating workflows (or service function chains), orchestration ensures that services communicate efficiently across the cloud-edge continuum.
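The desired-state automation can be sketched as a reconciliation step that compares declared and observed service replicas and emits corrective actions; the action names are illustrative:

```python
def reconcile(desired: dict, observed: dict) -> list:
    """Compute the actions needed to move observed replicas to the desired state."""
    actions = []
    for svc, want in desired.items():
        have = observed.get(svc, 0)
        if have < want:
            actions.append(("scale_up", svc, want - have))
        elif have > want:
            actions.append(("scale_down", svc, have - want))
    for svc, have in observed.items():
        if svc not in desired:              # deployed but no longer declared
            actions.append(("remove", svc, have))
    return actions
```

Running this loop continuously is what keeps services converged to their declared state across the continuum.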

Application Performance Management

It monitors the performance and resource consumption of the application or service and reports deviations from set thresholds or SLAs to the Service Orchestrator, so that it can take action to restore a state that meets the application requirements. It provides a unified view of states, including logging, monitoring, and alerting, for effective real-time application management and validation at runtime.

Application Repository

This component tracks the applications and services that have been deployed and their configuration, the locations where the application and service components are installed and the resources they are consuming.

Service Federation

This component interconnects the Service Orchestrator with those of other federated providers, enabling the deployment and execution of applications (service function chains) across multiple providers in a seamless way, interacting with a single provider. In order to create and maintain a coherent federated multi-provider Cloud Edge Continuum, Service Federation capabilities should be designed consistently with the other federation capabilities described in this document.

Cloud Edge Platform

Components: MultiCloud Orchestrator (PaaS), Physical Infrastructure Manager, Virtual Infrastructure Platform, MultiCluster Manager, Workload Deploy Manager, Serverless Orchestrator, Cloud Edge Connectivity Mgr, Cloud Edge Federation, Cloud Edge Access Control, Cloud Edge Resource Repository, Workload Inventory, other domain components.

The Cloud Edge Platform layer hosts components that manage and orchestrate runtime environments (virtual machines, containers, serverless) and platform resources (libraries, functions, services, security, databases, tools). The platform services (PaaS) it provides support the deployment of applications and the corresponding runtime environments and platform resources.

Multi-Cloud Orchestrator (PaaS)

The Multi-Cloud Orchestrator (MCO) delivers a Platform as a Service (PaaS). A PaaS provides a complete application development and deployment environment in the cloud: customers can build, test, deploy, manage, and update applications quickly and efficiently, without worrying about the underlying infrastructure.

The MCO receives from the Service Orchestrator a request to deploy (or manage the lifecycle of) a certain application, together with a descriptor (resource model) that defines the state the application needs for its execution (including runtime environment, services, data, application image and other attributes like area of service or performance). The MCO processes the state and takes actions to set it up and preserve it, by updating, upgrading or removing workloads and services, or by rescaling or releasing resources.

The MCO works in close relationship with other components (PIM, VIP, MCM, Serverless Orchestrator) to provide the virtual runtime environment defined for the application: the specific combination of bare metal, virtual machines, containers and serverless mechanisms it has been developed to run on, using the technologies over which it has been tested and certified. The MCO also deploys and manages the lifecycle of essential tools and services such as middleware, development frameworks, databases, and business analytics, enabling organizations to streamline application development and drive innovation. A PaaS managed by the MCO offers scalability, high availability, and reduced time-to-market, allowing developers to focus on coding and application functionality while the MCO handles infrastructure, security, and operational aspects.

Based on certain attributes, like area of service and performance, the MCO may select the location(s) where to deploy the workload and the resources (physical and virtual) required at those locations to meet the desired state. This placement decision can also follow sustainability and privacy requirements. The MCO deploys the workload once the necessary resources are available, using the Workload Deployment Manager, and it updates and removes workloads, rescaling or releasing the corresponding resources. This description decomposes the functionality of a cloud-edge continuum workload management solution that may be implemented in many ways, combining or excluding some of its components to fit specific sector needs.
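The placement decision can be sketched as a filter-then-score step over candidate sites, enforcing latency and hardware constraints and preferring low carbon intensity. The site records, attribute names and preference rule are illustrative assumptions:

```python
# Candidate sites as a resource repository might describe them (illustrative).
SITES = [
    {"name": "edge-mad",  "latency_ms": 8,  "gpu": True,  "carbon_g_kwh": 120},
    {"name": "cloud-fra", "latency_ms": 40, "gpu": True,  "carbon_g_kwh": 300},
    {"name": "edge-lis",  "latency_ms": 12, "gpu": False, "carbon_g_kwh": 90},
]

def place(workload: dict):
    """Filter sites on hard constraints, then prefer the lowest carbon intensity."""
    feasible = [s for s in SITES
                if s["latency_ms"] <= workload["max_latency_ms"]
                and (not workload["needs_gpu"] or s["gpu"])]
    if not feasible:
        return None                      # no site can host this workload
    return min(feasible, key=lambda s: s["carbon_g_kwh"])["name"]
```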

Cloud Edge Connectivity Manager

The Cloud Edge Connectivity Manager (CEC) implements and modifies the service function chain, or removes it, totally or partially, following the requests from a Service Orchestrator, to guarantee the connectivity between workloads that will enable the service delivery and the connectivity from the service user to the workloads implementing the service front-end. Connectivity is usually based on overlay and underlay components in each domain crossed by the traffic (e.g. WAN, data centers, etc.). The CEC manages the networking in the data center domain through the virtualization managers (VIM, CISM) or via specific NaaS interfaces. It manages the WAN connectivity using Cloud Networking services (via transport SDN Controllers) for the connection of different computing nodes. In addition, the CEC manages the complexity deriving from the need to ensure consistency between overlay and underlay networking solutions (for example adapting the networking between the data center fabric and the WAN connectivity).

Physical Infrastructure Manager

The Physical Infrastructure Manager (PIM) monitors and manages a pool of physical resources (CPUs, storage, networking), and selects and prepares them (with the corresponding OS and necessary software) to allocate these resources to a virtual machine or container cluster. The PIM provides multiple physical infrastructure management functions, including physical resource provisioning and lifecycle management, physical resource inventory management or physical resource performance management.

Multi-Cluster Manager

The Multi-Cluster Manager (MCM) creates and configures container clusters both over bare metal and over virtual machines after a request from the MCO, offering a single interface to manage infrastructure from multiple providers and with multiple K8s distributions. The MCM provides open connectors/APIs to interact with the resources and k8s distributions offered by different providers (private & public) for cluster creation, configuration and monitoring, and keeps track of their evolution. The MCM may create a K8s cluster on bare metal (cluster nodes are servers) or on virtualization stack (cluster nodes are VMs), interacting with PIM or VIP respectively.

Virtual Infrastructure Platform Manager

The Virtual Infrastructure Platform Manager (VIP) creates virtual machine clusters across several locations using the resources allocated by the PIM. The VIP is required when the service component to be deployed is a Virtualized Application or a Containerized Application, which runs over container clusters that make use of VMs (virtual machines). This component works on infrastructure and technology from different providers, enabling the Cloud Edge continuum to run on a diverse set of different virtualization solutions (VIMs, CISMs or any other future virtualization technology).

Workload Deployment Manager

The Workload Deployment Manager (WDM) deploys software package(s) on top of an existing cluster(s) following MCO requests. It exposes a single interface to deploy software packages (i.e. via a helm chart or resource model declaration) on any K8s cluster (or alike) based on any distribution. The WDM provides the connectors/APIs to interact with existing clusters in different locations & technologies (K8s distributions) for application deployment and lifecycle management. This component can also deploy software packages directly on virtual machines (IaaS).

Cloud Edge Federation

This component interconnects the Multi-Cloud Orchestrator with those of other federated providers, enabling customers to use cloud edge computing services (IaaS, CaaS, PaaS, Serverless, NaaS...) across multiple providers in a seamless way, while interacting with a single provider. This platform federation provides seamless integration and collaboration between multiple cloud platform providers, enabling interoperability, resource sharing, and unified lifecycle management. Shared resources may exist on all layers of the cloud architecture. By adopting standardized protocols and interfaces, platform federation facilitates enhanced scalability, efficiency, and innovation across different cloud environments while maintaining autonomy and security for each participating entity. In order to create and maintain a coherent federated multi-provider Cloud Edge Continuum, Cloud Edge Federation capabilities should be designed consistently with the other federation capabilities described in this document.

Cloud Edge Access Control

This component implements a key aspect in terms of security in the management of cloud edge infrastructure, a role-based access control that ensures proper access rights and security across the infrastructure.
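A minimal role-based access control sketch, with illustrative role, user and operation names:

```python
# Roles map to permitted operations; users hold one or more roles (illustrative).
ROLE_PERMISSIONS = {
    "operator": {"workload:deploy", "workload:delete"},
    "viewer":   {"workload:read"},
}
USER_ROLES = {"ana": {"operator", "viewer"}, "bob": {"viewer"}}

def allowed(user: str, operation: str) -> bool:
    """A request is allowed if any of the caller's roles grants the operation."""
    return any(operation in ROLE_PERMISSIONS.get(role, set())
               for role in USER_ROLES.get(user, set()))
```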

Cloud Edge Resource Repository

This component keeps a record of the resources available in each of the edge locations, the virtualization platforms available and the configuration. The information in this repository helps the multi-cloud orchestrator to select the right location(s) to deploy workloads.

Workload Inventory

This component keeps record of the workloads that have been deployed and their configuration, as well as information about the location and cluster where they have been deployed and the resources they are consuming.

Serverless Orchestrator (FaaS)

The Serverless Orchestrator provides serverless capabilities, also known as Function as a Service (FaaS). FaaS is a cloud computing model that allows developers to build and deploy applications in the form of individual functions, which are executed in response to specific events or triggers. This model eliminates the need to manage server infrastructure, enabling developers to focus solely on writing code. Each function runs in a stateless container, automatically scaling with demand and only consuming resources when invoked, leading to cost savings and efficient resource utilization.
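The trigger/execute flow can be sketched as a tiny event dispatcher; there is no real container isolation or scaling here, and all names are illustrative:

```python
HANDLERS = {}   # event type -> registered functions

def register(event_type):
    """Decorator registering a function to run when the given event type fires."""
    def wrap(fn):
        HANDLERS.setdefault(event_type, []).append(fn)
        return fn
    return wrap

@register("image.uploaded")
def make_thumbnail(event):
    return f"thumbnail({event['object']})"

def trigger(event_type, event):
    """Invoke every function registered for the event; idle functions cost nothing."""
    return [fn(event) for fn in HANDLERS.get(event_type, [])]

results = trigger("image.uploaded", {"object": "cat.png"})
```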

Virtualization

Components: Hardware Resource Manager (BMaaS), Virtual Infrastructure Manager (IaaS), Container Infrastructure Service Manager (CaaS), Virtual Resource Access Control, Virtual Resource Repository, other domain components.

The Virtualization Layer in cloud infrastructures acts as a crucial abstraction layer that enables the efficient utilization of physical resources by creating multiple virtual instances of servers, storage, and network resources. This layer leverages hypervisors or related technologies to decouple hardware from the operating system, allowing for the dynamic allocation and scaling of resources based on demand. By providing a flexible and scalable environment, the virtualization layer enhances resource optimization, simplifies management, and supports the seamless deployment of various cloud services and applications. This foundational component is essential for delivering Infrastructure as a Service (IaaS) and other cloud service models, ensuring agility, cost-effectiveness, and high availability.

Hardware Resource Manager (BMaaS)

The Hardware Resource Manager component delivers Bare Metal as a Service (BMaaS). BMaaS is an abstraction that provides physical, non-virtualized hardware resources directly to users, offering dedicated servers, storage, and networking components without any virtualization layer. This service allows users to harness the full power of the hardware for their applications, resulting in higher performance, predictable latency, and complete control over the environment. BMaaS is particularly beneficial for workloads that require intensive computation, low-latency networking, or compliance with specific hardware configurations.

Virtual Infrastructure Manager (IaaS)

The Virtual Infrastructure Manager (VIM) component delivers Infrastructure as a Service (IaaS). IaaS is a cloud computing model that provides virtualized computing resources, delivering essential services such as virtual machines, storage, and networks. Users can provision, scale, and manage these resources dynamically according to their needs, while the cloud provider takes care of maintaining the underlying hardware, networking, and security. This model offers high flexibility, enabling organizations to quickly deploy and run applications and services, test new solutions, and handle varying workloads with ease, ultimately driving innovation and operational efficiency.

Container Infrastructure Service Manager (CaaS)

The Container Infrastructure Service Manager (CISM) component provides a Container as a Service (CaaS) service. CaaS is a cloud service model that provides a platform allowing users to manage and deploy containerized applications and workloads. By leveraging container orchestration tools such as Kubernetes, CaaS facilitates the automation of container deployment, scaling, and operations, ensuring high availability and performance. This model abstracts the underlying infrastructure complexities, enabling developers and IT teams to focus on application and service development and deployment without worrying about the maintenance of the physical or virtual infrastructure.

Virtual Resource Access Control

As in the Cloud Edge Platform layer, this Access Control component implements virtual infrastructure management security, a role-based access control that ensures proper access rights and security for virtual resource management.

Virtual Resource Repository

This component keeps record of cloud edge sites and the configuration and availability of virtual resources in each one of them (for instance, number of k8s clusters available per site, CPU/memory available per k8s cluster, number of virtual CPUs that are available to setup new k8s clusters, ...) in order to help take decisions on workload placement.

Network Systems, SDN controllers

Details on the Network Systems and SDN controllers...

Physical Cloud Edge Resources Layer

Components: Compute, Storage, Networking, Hardware Infrastructure Manager, Hardware Resource Repository, other domain components.

Compute

Compute resources are fundamental to cloud infrastructure, delivering the computational power required for running applications and services. They facilitate scalable and efficient environments that dynamically adjust to varying workloads, thus enhancing resource utilization and performance while minimizing costs.

Storage

Storage is essential in cloud infrastructure, providing data persistence, management, and accessibility. It includes block storage for databases, object storage for unstructured data, and file storage for shared access applications. Advanced technologies like SSDs and distributed file systems ensure scalability, reliability, and performance.

Networking

Hardware networking resources in a cloud edge location include routers, switches, load balancers, and firewalls. These components form the backbone of data center connectivity and inter-server communication. Network Interface Cards (NICs) in servers enable high-throughput connections to the virtual network. WAN gateways and edge routers extend connectivity to external networks, supporting hybrid cloud and remote access scenarios. All hardware is managed centrally through SDN controllers and scaled dynamically to support edge cloud service demands.

Hardware Infrastructure Manager

A Hardware Infrastructure Manager (also known as a Data Center Infrastructure Management system, DCIM) is a management component designed to monitor, measure, and manage the IT equipment and infrastructure within a cloud edge data center. It encompasses the following key aspects:

  • Monitoring and Management - it provides real-time monitoring of data center operations, including power usage, cooling efficiency, and physical security, helping to optimize the performance and efficiency of the data center.
  • Documentation and Planning - it maintains detailed documentation of the data center's physical and virtual assets, including layout planning, capacity management, and future expansion plans.
  • Risk Management - by continuously monitoring environmental conditions and equipment status, it helps identify potential risks and mitigate them before they lead to failures.
  • Integration with IT Systems - it integrates with other IT management systems to provide a holistic view of the data center's operations, facilitating better decision-making and resource allocation.
  • Sustainability and Compliance - it supports sustainability goals by optimizing energy usage and ensuring compliance with industry standards and regulations.

Hardware Resource Repository

This component keeps record of cloud edge locations and the configuration and availability of physical hardware resources in each one of them (for instance, number of servers per location, type of servers, type of NIC cards available per location, cost of resources, energy consumption of resources, ...) in order to help take decisions on workload placement and resource lifecycle management.

Physical Network Resources

Details on the Physical Network Resources...

Sustainability Domain

The sustainability components in the ICRA enable significant gains in energy efficiency by including advanced data processing functionality. Edge computing reduces the amount of data transferred between edge and cloud data facilities and thus reduces the energy consumption stemming from data transmission. Each layer incorporates specific components that improve the overall sustainability of the next generation of advanced cloud infrastructure and services by:

  • improving the transparency of energy consumption in data processing,
  • optimising energy efficiency based on decentralised data processing,
  • increasing the usage of renewable energy and energy-efficient cooling solutions, and
  • enabling energy- and resource-aware workload placement and management.

Benchmark, Metrics and Monitoring

It provides comparisons of energy consumption among application instances, services, sites, or hardware elements to identify good and bad performers. It also compares the energy consumption needs of applications and workloads with the availability of low-cost or renewable energy, helping to identify potential energy optimizations based on moving workloads to places where energy is free or low-cost.

It tracks in real time the energy consumption of workloads, hardware elements (servers, storage, networking), sites, and services, as well as the energy generation of associated renewable energy sources. It also monitors resource consumption per service or application, waste and heat generation, and other potential environmental protection measures. This component sets the standards for the monitoring services required to support environmental targets, energy cost reduction, and the financial optimization of 8ra services and infrastructure. These data can be used by the Optimization component for forecasting, planning, and decision-making around energy and resource consumption. It identifies when applications, hardware elements, nodes, or facilities exceed certain thresholds in terms of energy or resource consumption, triggering alerts. The Monitoring component also tracks the availability and cost of energy at energy generation locations, identifying those where free or low-cost energy is available. These are energy producers or consumers with excess energy capacity that would otherwise be wasted if not consumed, such as wind turbines, solar power plants, and data centers. The component also gives recommendations for data collection frameworks and tools.
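
The threshold-based alerting described above can be sketched minimally as follows. The element names, readings, and thresholds are invented for illustration; a real monitoring component would ingest metrics from a telemetry pipeline rather than a dictionary.

```python
def check_thresholds(readings, thresholds):
    """Compare per-element energy readings (watts) against alert thresholds.

    Returns a (element, reading, threshold) tuple for every element
    whose reading exceeds its configured threshold.
    """
    alerts = []
    for element, watts in readings.items():
        limit = thresholds.get(element)
        if limit is not None and watts > limit:
            alerts.append((element, watts, limit))
    return alerts

readings = {"server-01": 420.0, "server-02": 310.0, "storage-a": 180.0}
thresholds = {"server-01": 400.0, "server-02": 400.0, "storage-a": 200.0}
alerts = check_thresholds(readings, thresholds)
```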

Energy Consumption and Carbon Emission Optimization

This component implements mechanisms to optimize resource and energy consumption and carbon emissions, for instance by scaling down the resources allocated to a certain application when they are not required and are not foreseen to be in the short term. It is in charge of implementing a "green" operational mode for cloud-edge nodes and platforms. It provides recommendations for optimizing application and infrastructure setup based on data from benchmarking and monitoring. This component may also provide recommendations on application placement to reduce data transfer, optimize energy consumption, or balance occupation among sites (reducing the risk of congestion). At the Data layer, it advises on data fusion and data reuse options to minimize redundant data exchange.
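
A scale-down decision of the kind the Optimization component makes can be sketched as follows. The utilization thresholds and proportional-shrink rule are illustrative assumptions, not part of the RA.

```python
def scale_down_recommendation(replicas, recent_util, forecast_util,
                              low_util=0.3, min_replicas=1):
    """Recommend a reduced replica count when both recent and short-term
    forecast utilization are below `low_util`; otherwise keep the
    current allocation. Never go below `min_replicas`."""
    if max(recent_util, forecast_util) >= low_util:
        return replicas  # resources are (or will be) needed: no change
    # shrink proportionally to forecast demand, never below the floor
    target = max(min_replicas, round(replicas * forecast_util / low_util))
    return min(replicas, target)
```

For example, an application running 10 replicas at 10% utilization with a similarly low forecast would be scaled down, while one with a 50% recent utilization would be left untouched.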

Sustainable Workload placement and scheduling

It generates recommendations (conventional or AI-based) for temporal and spatial workload movement to reduce energy costs and/or carbon emissions. Recommendations are usually oriented towards moving workloads temporarily to energy generation sites where energy would otherwise be wasted (free or low-cost). For this, it forecasts and matches workload requirements with server capacity at these sites with excess energy. It leverages algorithms to maximize energy efficiency and resource optimization and minimize environmental impact. Workload placement also factors the amount of data communication, latency, and the cost of migration into the selection of the optimal location. The Scheduler plans and implements the recommendations of the Resource Optimization component, moving workloads temporarily to low-cost energy generation locations and providing workload execution assurance. The implementation is integrated into, or used by, a (de-)central Multi-Cloud Orchestrator, the component able to move workloads between sites.
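
A greatly simplified version of such a placement decision can be sketched as a greedy matcher. The site names, capacities, prices, and migration costs below are invented; real recommendations would also weigh latency, data movement, and forecast accuracy.

```python
def place_workloads(workloads, sites):
    """Greedy temporal placement: assign each workload (largest first)
    to a site with excess low-cost energy that still has capacity,
    preferring the lowest effective cost (energy price * demand plus a
    one-off migration cost)."""
    capacity = {s["site"]: s["capacity_kwh"] for s in sites}
    price = {s["site"]: s["price_per_kwh"] for s in sites}
    plan = {}
    for wl in sorted(workloads, key=lambda w: -w["energy_kwh"]):
        feasible = [s for s in sites if capacity[s["site"]] >= wl["energy_kwh"]]
        if not feasible:
            plan[wl["name"]] = None  # no excess energy available: stay put
            continue
        best = min(feasible,
                   key=lambda s: price[s["site"]] * wl["energy_kwh"]
                                 + wl["migration_cost"].get(s["site"], 0.0))
        capacity[best["site"]] -= wl["energy_kwh"]
        plan[wl["name"]] = best["site"]
    return plan

sites = [
    {"site": "wind-north", "capacity_kwh": 50, "price_per_kwh": 0.0},
    {"site": "solar-south", "capacity_kwh": 20, "price_per_kwh": 0.02},
]
workloads = [
    {"name": "batch-a", "energy_kwh": 40,
     "migration_cost": {"wind-north": 1.0, "solar-south": 0.5}},
    {"name": "batch-b", "energy_kwh": 15,
     "migration_cost": {"wind-north": 0.2, "solar-south": 0.1}},
]
plan = place_workloads(workloads, sites)
```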

Renewable Energy Management

It gives recommendations for placing data centers near energy production sites (wind, solar) and for integrating battery storage systems to balance grid capacity.

Cooling and Heat Management

It enhances energy and operational efficiency by integrating cooling technology and real-time optimization of cooling and workload.

Security and Compliance

The security and compliance layer in a cloud environment plays a crucial role in safeguarding data integrity, ensuring that access is strictly regulated, and maintaining adherence to industry standards and regulations. This layer is fundamental in designing a robust cloud infrastructure that not only protects sensitive information but also enforces policies that prevent unauthorized access and misuse. By integrating advanced security protocols, real-time monitoring systems, and compliance frameworks, this layer mitigates risks, responds swiftly to potential threats, and ensures that the cloud environment remains secure and compliant with legal and regulatory requirements. This proactive approach to security and compliance is essential for fostering trust and reliability in cloud services.

Identity and Access Management

In cloud infrastructures, Identity and Access Management (IAM) is a critical framework of policies and technologies that ensure the correct permissions and access rights are assigned to users and devices. This system provides operators with the capability to enforce precise control over resource access, defining who can access specific platform resources and under what conditions. IAM facilitates the automation of user provisioning, the enforcement of robust authentication mechanisms, and the adherence to the principle of least privilege. Coupled with other security measures such as encryption and audit logging, IAM significantly enhances the overall security posture, safeguarding against unauthorized access and potential breaches within the cloud environment.
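
A deny-by-default authorization check, the core of the least-privilege principle mentioned above, can be sketched as follows. The policy format and all principals and resource paths are hypothetical.

```python
def is_allowed(policies, principal, action, resource):
    """Deny-by-default check: access is granted only if an explicit
    policy matches the principal, the action, and the resource
    (principle of least privilege)."""
    for p in policies:
        if (p["principal"] == principal
                and action in p["actions"]
                and resource.startswith(p["resource_prefix"])):
            return True
    return False  # no matching policy: deny

policies = [
    {"principal": "dev-team", "actions": {"read"},
     "resource_prefix": "projects/edge-demo/"},
    {"principal": "ops-team", "actions": {"read", "write"},
     "resource_prefix": "projects/"},
]
```

In practice, such checks are combined with strong authentication and audit logging so that every granted or denied request leaves a trace.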

Identity Management

Identity Management (IdM) in cloud infrastructures focuses on the administration and management of user identities and their associated access rights. It involves the creation, maintenance, and deletion of user accounts, as well as the authentication of user identities and the authorization of their actions within the cloud environment. IdM ensures that only authorized users can access specific resources, while maintaining detailed records of user activities and access patterns. By implementing robust IdM solutions, organizations can enhance security, streamline access management, and achieve compliance with regulatory requirements, thereby protecting sensitive data and preventing unauthorized access.

Key Management Service

Key Management Service (KMS) is an essential component within cloud infrastructures that provides centralized control over the cryptographic keys used to secure data. KMS facilitates the creation, management, and deletion of encryption keys, ensuring that data remains protected both at rest and in transit. By automating key rotation, enforcing access controls, and integrating seamlessly with other cloud services, KMS enhances the security framework by preventing unauthorized access to sensitive information and simplifying compliance with industry standards. This service is vital for maintaining data confidentiality, integrity, and availability within a cloud environment, thereby reinforcing the overall security posture of the infrastructure.
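
The key-rotation scheduling mentioned above can be sketched minimally as follows. The key metadata schema and the 90-day rotation period are assumptions; a real KMS would also create the new key version and re-encrypt dependent data keys.

```python
from datetime import date, timedelta

def keys_due_for_rotation(keys, today, max_age_days=90):
    """Return the IDs of enabled keys older than the rotation period."""
    cutoff = today - timedelta(days=max_age_days)
    return [k["key_id"] for k in keys
            if k["state"] == "enabled" and k["created"] <= cutoff]

keys = [
    {"key_id": "k-old", "created": date(2024, 9, 1), "state": "enabled"},
    {"key_id": "k-new", "created": date(2024, 12, 1), "state": "enabled"},
    {"key_id": "k-retired", "created": date(2024, 1, 1), "state": "disabled"},
]
due = keys_due_for_rotation(keys, today=date(2025, 1, 1))
```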

Audit Log

Audit logs are an indispensable component of cloud infrastructures, serving as detailed records of all activities and access events within the environment. These logs capture a comprehensive trail of user actions, system events, and data access, providing essential visibility into the operation and security of the infrastructure. By systematically documenting every interaction, audit logs enable security professionals to perform thorough investigations, identify potential security incidents, and ensure compliance with regulatory standards. The granular information contained in audit logs supports proactive threat detection, forensic analysis, and the continuous improvement of security measures, thereby bolstering the overall integrity and accountability of the cloud environment.
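
Tamper evidence, one way audit logs support forensic analysis, is often achieved by hash-chaining entries. The sketch below assumes a simple in-memory log; production systems would persist entries and protect the chain head.

```python
import hashlib
import json

def append_event(log, event):
    """Append an event linked to the previous entry's hash, so any
    later modification of an entry breaks the chain."""
    prev = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    log.append({"event": event, "prev": prev, "hash": digest})
    return log

def verify_chain(log):
    """Recompute every link; returns True only if no entry was altered."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log = []
append_event(log, {"user": "alice", "action": "login"})
append_event(log, {"user": "alice", "action": "read", "resource": "vm-7"})
```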

Account Management

In cloud infrastructures, Account Management is the systematic process of overseeing and controlling tenant accounts and their associated privileges. This involves the creation, modification, and deletion of accounts, as well as the assignment and revocation of access rights based on tenant entitlements. By maintaining detailed records of account activities and regularly auditing account permissions, organizations can enhance security, prevent unauthorized access, and ensure compliance with regulatory standards. This proactive management of accounts is crucial for maintaining the integrity and security of the cloud environment.

Service Compliance Verification

Service Compliance Verification in cloud infrastructures is the process of ensuring that all cloud services and their operations adhere to predefined regulatory, security, and organizational standards. This involves conducting regular assessments and audits to verify that the services comply with laws such as GDPR, HIPAA, and industry-specific regulations. Compliance verification includes monitoring service configurations, access controls, and data handling practices to identify and rectify any deviations from the required standards. By implementing comprehensive compliance verification measures, organizations can mitigate legal and security risks, demonstrate due diligence, and maintain trust with stakeholders and customers. This systematic approach is essential for maintaining the integrity, security, and legality of cloud operations.
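
Monitoring service configurations against required standards can be sketched as rule predicates evaluated over a configuration. The rule names and configuration keys below are illustrative, not a real compliance catalog.

```python
def verify_compliance(config, rules):
    """Evaluate a service configuration against rule predicates and
    return the names of violated rules (an empty list means compliant)."""
    return [name for name, check in rules.items() if not check(config)]

# Hypothetical rules loosely inspired by data-protection requirements.
rules = {
    "encryption-at-rest": lambda c: c.get("storage_encrypted") is True,
    "eu-data-residency": lambda c: c.get("region", "").startswith("eu-"),
    "audit-logging": lambda c: c.get("audit_log_enabled") is True,
}

config = {"storage_encrypted": True, "region": "us-east-1",
          "audit_log_enabled": True}
violations = verify_compliance(config, rules)
```

Running such checks continuously, rather than only at audit time, is what lets deviations be detected and rectified early.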

Management

Effective management of cloud infrastructure necessitates a comprehensive suite of capabilities to ensure optimal performance, security, and compliance. Together, these management capabilities form an integrated framework that supports the scalability, efficiency, and security of cloud services, driving both operational excellence and strategic value.

Logging

Logging in cloud infrastructures refers to the systematic recording of events, transactions, and activities within the cloud environment. This process captures detailed logs of user actions, system operations, and data interactions, creating a repository of information that supports monitoring, auditing, troubleshooting, and security analysis. By maintaining comprehensive and accurate logs, cloud providers and users can trace system behaviors, detect anomalies, and swiftly respond to incidents. Effective logging not only aids in compliance with regulatory requirements but also enhances the overall transparency, reliability, and resilience of the cloud infrastructure.

Monitoring and Alerting

Monitoring and alerting in cloud infrastructures involves continuous observation and real-time analysis of system performance, application behavior, and resource utilization. This process employs various tools and techniques to collect and analyze data from different components of the cloud environment, such as servers, networks, and applications. By setting static or AI-supported thresholds and rules, monitoring systems can detect anomalies, performance bottlenecks, and potential failures. When these conditions are met, alerting mechanisms are triggered to notify administrators and stakeholders promptly, enabling swift resolution and minimizing downtime. Effective monitoring and alerting are essential for maintaining the reliability, availability, and overall health of cloud services, ensuring optimal user experiences and adherence to service level agreements (SLAs). Logging, monitoring, and alerting are foundational, providing detailed records and continuous observation of system activities, which support troubleshooting, security analysis, performance optimization, and proactive intervention. Coupled with alerting mechanisms, they enable real-time detection and swift resolution of anomalies and potential failures, ensuring high availability and reliability. Logging, monitoring, and alerting could use artifacts from the Data layer to implement their functionality.
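
Beyond static thresholds, a simple statistical rule can flag anomalies relative to recent behavior. The sketch below uses a three-sigma test over a sliding window; the window values and the choice of `k` are illustrative assumptions.

```python
import statistics

def is_anomaly(window, latest, k=3.0):
    """Flag the latest reading when it lies more than k standard
    deviations from the mean of the recent window."""
    mean = statistics.mean(window)
    stdev = statistics.pstdev(window)
    # a flat window (stdev == 0) gives no baseline for deviation
    return stdev > 0 and abs(latest - mean) > k * stdev

window = [100.0, 102.0, 98.0, 101.0, 99.0]  # e.g. recent latency samples (ms)
```

A spike to 120 ms would trigger an alert here, while 101 ms would not; AI-supported systems refine this idea with seasonality-aware baselines.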

Accounting and Charging

Accounting and charging in cloud infrastructure and services refers to the systematic process of tracking and invoicing the usage of cloud services by users and applications. Accounting involves the continuous collection of data on resource consumption, such as CPU usage, memory allocation, storage, and network bandwidth. This data is then analyzed to generate detailed usage reports, which serve as the basis for customer billing. Charging systems apply predefined pricing models and rates to the metered data, ensuring accurate and transparent charges based on actual usage. This process not only provides customers with clear insights into their cloud expenditures but also enables providers to manage resources efficiently and optimize their service offerings. Effective accounting and charging are essential for tracking resource consumption and generating accurate bills, fostering transparency. This is critical for maintaining financial accountability, building trust, and supporting the scalable and on-demand nature of cloud services.
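
The core of a charging system, applying per-unit rates to metered usage, can be sketched as follows. The metric names and prices are invented for illustration; real pricing models add tiers, discounts, and currency handling.

```python
def compute_invoice(usage, rates):
    """Apply per-unit rates to metered usage and return the line items
    plus a total, each rounded to cents."""
    lines = {metric: round(qty * rates[metric], 2)
             for metric, qty in usage.items()}
    return lines, round(sum(lines.values()), 2)

# Hypothetical rates and one month of metered consumption.
rates = {"cpu_core_hours": 0.04, "storage_gb_month": 0.02, "egress_gb": 0.08}
usage = {"cpu_core_hours": 730.0, "storage_gb_month": 500.0, "egress_gb": 120.0}
lines, total = compute_invoice(usage, rates)
```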

Performance Management

Performance Management in cloud infrastructures refers to the comprehensive process of defining, monitoring, and enforcing Service Level Agreements (SLAs) between cloud service providers and their customers. This also applies to internal performance goals the provider may set for its own services. It involves setting clear expectations for service performance, availability, and support, and ensuring that these commitments are met consistently. SLA management includes tracking key performance indicators (KPIs), generating compliance reports, and addressing any deviations through corrective actions. Effective performance management not only enhances customer satisfaction and trust but also enables providers to maintain high standards of service quality and reliability, thereby fostering long-term business relationships and competitive advantage.
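
A typical SLA KPI is availability over a reporting period, which can be computed and checked against a target as follows. The 99.9% target and the downtime figures are illustrative.

```python
def sla_compliance(total_minutes, downtime_minutes, target=0.999):
    """Compute achieved availability over a period and whether it
    meets the SLA target (e.g. "three nines" = 99.9%)."""
    availability = (total_minutes - downtime_minutes) / total_minutes
    return availability, availability >= target

# One 30-day month with 50 minutes of downtime.
availability, ok = sla_compliance(30 * 24 * 60, 50)
```

Here 50 minutes of downtime in a month yields about 99.88% availability, missing a 99.9% target (which allows roughly 43 minutes per month) and so would trigger a compliance deviation report.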

Fault Management

Fault Management in cloud infrastructures involves the systematic detection, isolation, and resolution of faults or issues within the cloud environment. This process includes identifying potential failures, diagnosing their root causes, and implementing corrective actions to restore normal operations. Fault management leverages workflow and ticketing systems combined with automated tools and techniques to monitor system components, analyze error logs, and trigger alerts for anomalies. By proactively managing faults, cloud providers can minimize downtime, enhance system reliability, and ensure continuous service delivery, ultimately contributing to the overall robustness and resilience of the cloud infrastructure.

Catalog/Repository Management

It provides inventory information, across the different layers, about the resources used and available and the services available and deployed, to support decision-making: for instance, the list of Kubernetes (k8s) clusters available (with their characteristics), or the list of applications deployed and in which cluster. In addition, catalogs enable a model-driven approach for services and resources.
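
The two inventory queries mentioned above can be sketched as follows. The cluster and application records are invented; a real catalog would be backed by a datastore and expose these views through an API.

```python
# Hypothetical inventory: available clusters and deployed applications.
clusters = [
    {"name": "k8s-edge-1", "cpu_free": 12, "gpu": False},
    {"name": "k8s-cloud-1", "cpu_free": 64, "gpu": True},
]
apps = [
    {"app": "vision-infer", "cluster": "k8s-cloud-1"},
    {"app": "telemetry", "cluster": "k8s-edge-1"},
]

def clusters_for(cpu_needed, need_gpu):
    """Placement decision support: clusters satisfying the request."""
    return [c["name"] for c in clusters
            if c["cpu_free"] >= cpu_needed and (c["gpu"] or not need_gpu)]

def apps_on(cluster):
    """Inventory view: applications deployed on a given cluster."""
    return [a["app"] for a in apps if a["cluster"] == cluster]
```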

Operation Automation

This component provides automation for integration, delivery, verification, testing, optimization, and other processes in cloud edge environments. It allows distributed software and infrastructure to be managed in an automated way (by code, by software, through scripting, declaratively), reducing human errors, making implementations more uniform and predictable, and facilitating reconciliation and recovery to a working configuration after a misconfiguration, disaster, or failure. This is usually referred to as XOps (DevOps, MLOps, AIOps, FinOps, ...).