Below you can find a list of my publications. Most of them are open access (green at least), accessible at the PDF link. The AIR link points to the paper entry in my university institutional repository.
Modern applications are increasingly driven by Machine Learning (ML) models whose non-deterministic behavior is affecting the entire application life cycle from design to operation. The pervasive adoption of ML is urgently calling for approaches that guarantee a stable non-functional behavior of ML-based applications over time and across model changes. To this aim, non-functional properties of ML models, such as privacy, confidentiality, fairness, and explainability, must be monitored, verified, and maintained. Existing approaches mostly focus on i) implementing solutions for classifier selection according to the functional behavior of ML models, and ii) finding new algorithmic solutions, such as continuous re-training. In this paper, we propose a multi-model approach that aims to guarantee a stable non-functional behavior of ML-based applications. An architectural and methodological approach is provided to compare multiple ML models showing similar non-functional properties and select the model supporting stable non-functional behavior over time according to (dynamic and unpredictable) contextual changes. Our approach goes beyond the state of the art by providing a solution that continuously guarantees a stable non-functional behavior of ML-based applications, is ML algorithm-agnostic, and is driven by non-functional properties assessed on the ML models themselves. It consists of a two-step process working during application operation, where model assessment verifies non-functional properties of ML models trained and selected at development time, and model substitution guarantees continuous and stable support of non-functional properties. We experimentally evaluate our solution in a real-world scenario focusing on the non-functional property of fairness.
Certifying Accuracy, Privacy, and Robustness of ML-Based Malware Detection
Bena, Nicola,
Anisetti, Marco,
Gianini, Gabriele,
and Ardagna, Claudio A.
Recent advances in artificial intelligence (AI) are radically changing how systems and applications are designed and developed. In this context, new requirements and regulations emerge, such as the AI Act, placing increasing focus on strict non-functional requirements, such as privacy and robustness, and how they are verified. Certification is considered the most suitable solution for non-functional verification of modern distributed systems, and is increasingly pushed forward in the verification of AI-based applications. In this paper, we present a novel dynamic malware detector driven by the requirements in the AI Act, which goes beyond standard support for high accuracy, and also considers privacy and robustness. Privacy aims to limit the need for malware detectors to examine the entire system in depth, which requires administrator-level permissions; robustness refers to the ability to cope with malware mounting evasion attacks to escape detection. We then propose a certification scheme to evaluate non-functional properties of malware detectors, which is used to comparatively evaluate our malware detector and two representative deep-learning solutions in the literature.
Revisiting Trust Management in the Data Economy: A Road Map
Ardagna, Claudio A.,
Bena, Nicola,
Bennani, Nadia,
Ghedira-Guegan, Chirine,
Grecchi, Nicolò,
and Vargas-Solar, Genoveva
In the last two decades, multiple information and communications technology evolutions have boosted the ability to collect and analyze vast amounts of data (on the order of zettabytes). Collectively, they have paved the way for the so-called data economy, revolutionizing most sectors of our society, including health care, transportation, and grids. At the core of this revolution, distributed data-intensive applications compose services operated by multiple parties in the cloud-edge continuum; they process, manage, and exchange massive amounts of data at an unprecedented rate. However, data hold little value without adequate data protection. Traditional solutions, which aim to balance data quality and protection, are insufficient to address the peculiarities of the data economy, including trustworthy data sharing and management, composite service support, and multiparty data lifecycle. This article analyzes how trust management systems (TMSs) can regain the lead in supporting trustworthy data-intensive applications, discussing current challenges and proposing a road map for new-generation TMSs in the data economy.
Rethinking Certification for Trustworthy Machine-Learning-Based Applications
Anisetti, Marco,
Ardagna, Claudio A.,
Bena, Nicola,
and Damiani, Ernesto
Machine learning (ML) is increasingly used to implement advanced applications with nondeterministic behavior, which operate on the cloud-edge continuum. The pervasive adoption of ML is urgently calling for assurance solutions to assess applications’ nonfunctional properties (e.g., fairness, robustness, and privacy) with the aim of improving their trustworthiness. Certification has been clearly identified by policy makers, regulators, and industrial stakeholders as the preferred assurance technique to address this pressing need. Unfortunately, existing certification schemes are not immediately applicable to nondeterministic applications built on ML models. This article analyzes the challenges and deficiencies of current certification schemes, discusses open research issues, and proposes a first certification scheme for ML-based applications.
On the Robustness of Random Forest Against Untargeted Data Poisoning: An Ensemble-Based Approach
Anisetti, Marco,
Ardagna, Claudio A.,
Balestrucci, Alessandro,
Bena, Nicola,
Damiani, Ernesto,
and Yeun, Chan Yeob
IEEE Transactions on Sustainable Computing,
vol. 8, no. 4,
2023
Machine learning is becoming ubiquitous. From finance to medicine, machine learning models are boosting decision-making processes and even outperforming humans in some tasks. This huge progress in terms of prediction quality does not however find a counterpart in the security of such models and corresponding predictions, where perturbations of fractions of the training set (poisoning) can seriously undermine the model accuracy. Research on poisoning attacks and defenses received increasing attention in the last decade, leading to several promising solutions aiming to increase the robustness of machine learning. Among them, ensemble-based defenses, where different models are trained on portions of the training set and their predictions are then aggregated, provide strong theoretical guarantees at the price of a linear overhead. Surprisingly, ensemble-based defenses, which do not pose any restrictions on the base model, have not been applied to increase the robustness of random forest models. The work in this paper aims to fill in this gap by designing and implementing a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks. An extensive experimental evaluation measures the performance of our approach against a variety of attacks, as well as its sustainability in terms of resource consumption and performance, and compares it with a traditional monolithic model based on random forest. A final discussion presents our main findings and compares our approach with existing poisoning defenses targeting random forests.
Big Data Assurance: An Approach Based on Service-Level Agreements
Ardagna, Claudio A.,
Bena, Nicola,
Hebert, Cedric,
Krotsiani, Maria,
Kloukinas, Christos,
and Spanoudakis, George
Big data management is a key enabling factor for enterprises that want to compete in the global market. Data coming from enterprise production processes, if properly analyzed, can provide a boost in enterprise management and optimization, guaranteeing faster processes, better customer management, and lower overheads/costs. Guaranteeing a proper big data pipeline is the holy grail of big data, often hindered by the difficulty of evaluating the correctness of the big data pipeline results. This problem is even worse when big data pipelines are provided as a service in the cloud, and must comply with both laws and users’ requirements. To this aim, assurance techniques can complement big data pipelines, providing the means to guarantee that they behave correctly, toward the deployment of big data pipelines fully compliant with laws and users’ requirements. In this article, we define an assurance solution for big data based on service-level agreements, where a semiautomatic approach supports users from the definition of the requirements to the negotiation of the terms regulating the provisioned services, and the continuous refinement thereof.
Multi-Dimensional Certification of Modern Distributed Systems
Anisetti, Marco,
Ardagna, Claudio A.,
and Bena, Nicola
IEEE Transactions on Services Computing,
vol. 16, no. 3,
2023
Cloud computing has deeply changed how distributed systems are engineered, leading to the proliferation of ever-evolving and complex environments, where legacy systems, microservices, and nanoservices coexist. These services can severely impact individuals’ security and safety, introducing the need for solutions that properly assess and verify their correct behavior. Security assurance stands out as the way to address such pressing needs, with certification techniques being used to certify that a given service holds some non-functional properties. However, existing techniques build their evaluation on software artifacts only, falling short in providing a thorough evaluation of the non-functional properties under certification. In this paper, we present a multi-dimensional certification scheme where additional dimensions model relevant aspects (e.g., programming languages and development processes) that significantly contribute to the quality of the certification results. Our multi-dimensional certification enables a new generation of service selection approaches capable of handling a variety of users’ requirements on the full system life cycle, from system development to its operation and maintenance. The performance and the quality of our approach are thoroughly evaluated in several experiments.
Explainable Data Poison Attacks on Human Emotion Evaluation Systems Based on EEG Signals
Zhang, Zhibo,
Umar, Sani,
Hammadi, Ahmed Y. Al,
Yoon, Sangyoung,
Damiani, Ernesto,
Ardagna, Claudio A.,
Bena, Nicola,
and Yeun, Chan Yeob
The major aim of this paper is to explain, from the attackers’ perspective, data poisoning attacks that use label-flipping during the training stage of electroencephalogram (EEG) signal-based human emotion evaluation systems deploying Machine Learning models. Human emotion evaluation using EEG signals has consistently attracted a lot of research attention. The identification of human emotional states based on EEG signals is effective in detecting potential internal threats caused by insider individuals. Nevertheless, EEG signal-based human emotion evaluation systems have shown several vulnerabilities to data poisoning attacks. Moreover, due to the instability and complexity of EEG signals, it is challenging to explain and analyze how data poisoning attacks influence the decision process of such systems. In this paper, taking the attackers’ side, we mount data poisoning attacks during the training phase of six different Machine Learning models, including Random Forest, Adaptive Boosting (AdaBoost), Extra Trees, XGBoost, Multilayer Perceptron (MLP), and K-Nearest Neighbors (KNN), that underpin EEG signal-based human emotion evaluation systems. The attacks seek to reduce the performance of these models on the classification of 4 different human emotions from EEG signals. The findings of the experiments demonstrate that the proposed data poisoning attacks succeed independently of the model, although different models exhibit varying levels of resilience to the attacks. In addition, the data poisoning attacks on the EEG signal-based human emotion evaluation systems are explained with several Explainable Artificial Intelligence (XAI) methods, including Shapley Additive Explanation (SHAP) values, Local Interpretable Model-agnostic Explanations (LIME), and Generated Decision Trees. The code of this paper is publicly available on GitHub.
International Conferences
Decolonizing Federated Learning: Designing Fair and Responsible Resource Allocation
Vargas-Solar, Genoveva,
Bennani, Nadia,
Espinosa-Oviedo, Javier. A.,
Mauri, Andrea,
Zechinelli-Martini, J.-L.,
Catania, Barbara,
Ardagna, Claudio,
and Bena, Nicola
In Proc. of AICCSA 2024,
Sousse, Tunisia,
Oct.
2024
This position paper explores the challenges, existing solutions, and open issues related to resource allocation in federated learning environments. The focus is on how to allocate resources effectively while adhering to service level objectives (SLOs) and fairness requirements, which include factors such as server location, data provenance, energy consumption, sovereignty, carbon footprint, and economic cost. The goal is to optimise resource distribution across different stages of the federated learning process within a given architecture, ensuring that these fairness criteria are integrated into the allocation strategy. This approach aligns with decolonial methodologies that seek to offer more sustainable and equitable alternatives to the resource-intensive artificial intelligence processes prevalent today.
A Transparent Certification Scheme Based on Blockchain for Service-Based Systems
Modern service-based systems are characterized by applications composed of heterogeneous services provided by multiple, untrusted providers, and deployed along the (multi-) cloud-edge continuum. This scenario of increasing pervasiveness, complexity, and multi-party service recruitment urgently calls for solutions to increase applications’ privacy and security, on the one hand, and guarantee that applications behave as expected and support a given set of non-functional requirements, on the other hand. Certification schemes have become the widespread means to answer this call, but they still build on old-fashioned assumptions that hardly hold in today’s services world. They assume that all actors involved in a certification process are trusted "by definition", meaning that certificates are supposed to be correct and be safely usable for decision-making, such as certification-based service selection and composition. In this paper, we depart from such unrealistic assumptions and define the first certification scheme that is completely transparent to the involved actors and significantly more resistant to misbehavior (e.g., collusion). We design a blockchain-based architecture to support our scheme, re-defining the actors and their roles. The quality and performance of our scheme are evaluated in a case study scenario.
Continuous Certification of Non-Functional Properties Across System Changes
Anisetti, Marco,
Ardagna, Claudio A.,
and Bena, Nicola
Existing certification schemes implement continuous verification techniques aiming to prove non-functional (e.g., security) properties of software systems over time. These schemes provide different re-certification techniques for managing the certificate life cycle, though their strong assumptions make them ineffective against modern service-based distributed systems. Re-certification techniques are in fact built on static system models, which do not properly represent the system evolution, and on static detection of system changes, which results in an inaccurate planning of re-certification activities. In this paper, we propose a continuous certification scheme that departs from a static certificate life cycle management and provides a dynamic approach built on the modeling of the system behavior that reduces the amount of unnecessary re-certification. The quality of the proposed scheme is experimentally evaluated using an ad hoc dataset built on publicly-available datasets.
Non-Functional Certification of Modern Distributed Systems: A Research Manifesto
Ardagna, Claudio A.,
and Bena, Nicola
In Proc. of IEEE SSE 2023,
Chicago, IL, USA,
Jul.
2023
The huge progress of ICT is radically changing distributed systems at their roots, modifying their operation and engineering practices and introducing new non-functional (e.g., security and safety) risks. These risks are amplified by the crucial role played by machine learning, on one side, and by the pervasive involvement of users in the system operation, on the other side. Certification techniques have been largely adopted to reduce the above risks, though the recent evolution of distributed systems towards cloud-edge, IoT, 5G, and machine learning severely hindered certification diffusion and quality. The need for new certification techniques that prove compliance of distributed systems against non-functional requirements arises and is often pushed by strict laws and regulations. In this paper, we envision a research manifesto for non-functional certification of modern distributed systems that paves the way for the wide adoption of certification in the real world, also in those domains where certification is not mandatory. Its ultimate goal is to lead to a trustworthy and adaptive ecosystem based on a cost-effective, non-functional certification, where modern system development, assessment, and management are not only ruled by functional requirements. The manifesto discusses the research challenges, a roadmap built on 6 research directions, and a concrete implementation timeline for the roadmap.
Lightweight Behavior-Based Malware Detection
Anisetti, Marco,
Ardagna, Claudio A.,
Bena, Nicola,
Giandomenico, Vincenzo,
and Gianini, Gabriele
In Proc. of MEDES 2023,
Heraklion, Greece,
May
2023
Modern malware detection tools rely on special permissions to collect data that can reveal the presence of suspicious software within a machine. Typical data that they collect for this task are the set of system calls, the content of network traffic, file system changes, and API calls. However, giving access to these data to an externally created program means granting the company that created that software complete control over the host machine. This is undesirable for many reasons. In this work, we propose an alternative approach for this task, which relies on easily accessible data, namely information about system performance (CPU, RAM, disk, and network usage), that does not need high-level permissions to be collected. To investigate the effectiveness of this approach, we collected these data in the form of a multivalued time series and ran a number of malware programs in a suitably devised sandbox. Then - to address the fact that deep learning models need large training sets - we augmented the dataset using a deep learning generative model (a Generative Adversarial Network). Finally, we trained an LSTM (Long Short-Term Memory) network to capture the malware behavioral patterns. Our investigation found that this approach, based on easy-to-collect information, is very effective (we achieved 0.99 accuracy), despite the fact that the data used for training the detector are substantially different from the ones specifically targeted for this purpose. The real and synthetic datasets, as well as corresponding source code, are publicly available.
Bridging the Gap Between Certification and Software Development
Ardagna, Claudio A.,
Bena, Nicola,
and Pozuelo, Ramon Martín
While certification is widely recognized as a means to increase system trustworthiness and reduce uncertainty in decision making, it faces severe challenges preventing a wider adoption thereof. Certification is not adequately planned and integrated within the development process, leading to suboptimal scenarios where certification introduces the need to further modify the developed system with high costs. We propose a methodology that bridges the gap between software development and certification processes. Our methodology automatically produces the certification requirements driving all steps of the development process, and maximizes the strength of certificates while taking costs under control. We formalize the above problem as a multi-objective mathematical program and solve it through a genetic algorithm. The proposed approach is tested in a real-world, cloud-based financial scenario at CaixaBank and its performance and quality are evaluated in a simulated scenario.
A DevSecOps-based Assurance Process for Big Data Analytics
Anisetti, Marco,
Bena, Nicola,
Berto, Filippo,
and Jeon, Gwanggil
In Proc. of IEEE ICWS 2022,
Barcelona, Spain,
Jul.
2022
Today, big data pipelines are increasingly adopted by service applications, representing a key enabler for enterprises to compete in the global market. However, the management of non-functional aspects of the big data pipeline (e.g., security, privacy) is still in its infancy. As a consequence, while functionally appealing, the big data pipeline does not provide a transparent environment, impairing the users’ ability to evaluate its behavior. In this paper, we propose a security assurance methodology for big data pipelines grounded on the DevSecOps development paradigm to increase trustworthiness, enabling reliable security and privacy by design. Our methodology models and annotates big data pipelines with non-functional requirements verified by assurance checks, ensuring that requirements hold along the pipeline lifecycle. The performance and quality of our methodology are evaluated in a walkthrough of a real analytics scenario.
Security Assurance in Modern IoT Systems
Bena, Nicola,
Bondaruc, Ruslan,
and Polimeno, Antongiacomo
In Proc. of IEEE VTC 2022-Spring,
Helsinki, Finland,
Jun.
2022
Modern distributed systems consist of a multi-layer architecture of IoT, edge, and cloud nodes. Together, they are revolutionizing our lives, bringing intelligence to existing processes (e.g., smart grids) and enabling novel, efficient and effective processes (e.g., remote surgery). This transition however does not come without drawbacks, due to the ever-increasing reliance on devices whose security and safety are, at least, questionable. In this context, research is in its infancy, struggling to adapt successful practices applied, for instance, in cloud systems. Security of modern IoT systems still relies on old-fashioned approaches, mostly static assessments considering only very specific parts of the target system, rather than assessing the system as a whole. In this paper, we put forward the idea of security assurance for IoT, as a higher-level assurance process evaluating the target system at different layers and different moments of its lifecycle, then implemented by a flexible assurance framework. The quality of our approach is evaluated in a real-world smart lighting system.
Towards an Assurance Framework for Edge and IoT Systems
Anisetti, Marco,
Ardagna, Claudio A.,
Bena, Nicola,
and Bondaruc, Ruslan
In Proc. of IEEE EDGE 2021,
Guangzhou, China,
Dec.
2021
Current distributed systems increasingly rely on hybrid architectures built on top of IoT, edge, and cloud, backed by dynamically configurable networking technologies like 5G. In this complex environment, traditional security governance solutions cannot provide the holistic view that is needed to manage these systems in an effective and efficient way. In this paper, we propose a security assurance framework for edge and IoT systems based on an advanced architecture capable of dealing with 5G-native applications.
An Assurance-Based Risk Management Framework for Distributed Systems
Anisetti, Marco,
Ardagna, Claudio A.,
Bena, Nicola,
and Foppiani, Andrea
In Proc. of IEEE ICWS 2021,
Chicago, IL, USA,
Sep.
2021
The advent of cloud computing and Internet of Things (IoT) has deeply changed the design and operation of IT systems, affecting mature concepts like trust, security, and privacy. The benefits in terms of new services and applications come at the price of new fundamental risks, and the need to adapt risk management frameworks to properly understand and address them. While research on risk management is an established practice that dates back to the 90s, many of the existing frameworks do not even come close to addressing the intrinsic complexity and heterogeneity of modern systems. They rather target static environments and monolithic systems, thus undermining their usefulness in real-world use cases. In this paper, we present an assurance-based risk management framework that addresses the requirements of risk management in modern distributed systems. The proposed framework implements a risk management process integrated with assurance techniques. Assurance techniques monitor the correct behavior of the target system, that is, the correct working of the mechanisms implemented by the organization to mitigate the risk. Flow networks compute risk mitigation and retrieve the residual risk for the organization. The performance and quality of the framework are evaluated in a simulated Industry 4.0 scenario.
Stay Thrifty, Stay Secure: A VPN-based Assurance Framework for Hybrid Systems
Anisetti, Marco,
Ardagna, Claudio,
Bena, Nicola,
and Damiani, Ernesto
In Proc. of SECRYPT 2020,
Lieusaint - Paris, France,
Jul.
2020
Security assurance provides a wealth of techniques to demonstrate that a target system holds some nonfunctional properties and behaves as expected. These techniques have been recently applied to the cloud ecosystem, while encountering some critical issues that reduce their benefit when hybrid systems, mixing public and private infrastructures, are considered. In this paper, we present a new assurance framework that evaluates the trustworthiness of hybrid systems, from traditional private networks to public clouds. It implements an assurance process that relies on a Virtual Private Network (VPN)-based solution to smoothly integrate with the target systems. The assurance process provides a transparent and non-invasive solution that does not interfere with the working of the target system. The performance of the framework has been experimentally evaluated in a simulated scenario.
An Assurance Framework and Process for Hybrid Systems
Anisetti, Marco,
Ardagna, Claudio A.,
Bena, Nicola,
and Damiani, Ernesto
In Proc. of ICETE 2020,
Lieusaint - Paris, France,
Jul.
2020
Security assurance is a discipline aiming to demonstrate that a target system holds some non-functional properties and behaves as expected. These techniques have been recently applied to the cloud, facing some critical issues especially when integrated within existing security processes and executed in a programmatic way. Furthermore, they pose significant costs when hybrid systems, mixing public and private infrastructures, are considered. In this paper, we present an assurance framework that implements an assurance process evaluating the trustworthiness of hybrid systems. The framework builds on a standard API-based interface supporting full and programmatic access to the functionalities of the framework. The process provides a transparent, non-invasive and automatic solution that does not interfere with the working of the target system. It builds on a Virtual Private Network (VPN)-based solution, to provide a smooth integration with target systems, in particular those mixing public and private clouds and corporate networks. A detailed walkthrough of the process along with a performance evaluation of the framework in a simulated scenario are presented.
Other Publications
Location Information (Privacy of)
Ardagna, Claudio A.,
and Bena, Nicola
In Encyclopedia of Cryptography, Security and Privacy,
Jajodia, Sushil and Samarati, Pierangela and Yung, Moti (eds.),
2021