New Paper Accepted: AI Impermanence: Achilles’ Heel for AI Assessment?
Our paper entitled AI Impermanence: Achilles’ Heel for AI Assessment? has been accepted and published in the journal IEEE Access.
The paper draws on interviews with practitioners and academics about the main challenges of assessing AI-based systems. The key takeaway is impermanence. Starting from the interviews, we identified eight main impermanence-related implications to guide future research on the topic. To the best of our knowledge, this is one of the very first attempts at analyzing what practitioners really think about AI assessment.
This paper is the result of a collaboration between our group at SESAR Lab, Università degli Studi di Milano, and the Karlsruhe Institute of Technology (KIT) in Germany, in particular with Kathrin Brecker (who later stayed in our lab for one month on a research visit), Sebastian Lins, and Ali Sunyaev.
The authors of the paper are Kathrin Brecker, Sebastian Lins, Nicola Bena (me), Claudio A. Ardagna, Marco Anisetti, and Ali Sunyaev.
Below is the full abstract:
Scandals have shown that extant assessment methods (e.g., certifications) cannot cater to the impermanent nature of Artificial Intelligence (AI) systems because of their inherent learning capabilities and adaptability. Current AI assessment methods are only limitedly trustworthy and cannot fulfill their purpose of demonstrating system safety. Our interviews with AI experts from industry and academia help us understand why and how AI impermanence limits assessment in practice. We reveal eight AI impermanence-related implications that threaten the reliability of AI assessment, including challenges for assessment methods, the validity of assessment results, and AI’s self-learning nature that requires ongoing reassessments. Our study contributes to a critical reflection on current AI assessment ideas, illustrating where their validity is at risk owing to AI impermanence. We provide the foundation for the development of assessment methods that consider impermanence-related implications and are suited to fully leveraging AI capabilities for the benefit of society.
A Quick Overview
The core of the paper can be summarized in the following table, where we first identified the three main types of AI impermanence.
| Impermanence Type | Causes | Example Implications for Assessment |
|---|---|---|
| Unintended: change by accident | Hidden flaws in input data set; changes in real-world conditions; hardware or technology setup changes | Assessment guarantees may be invalidated unexpectedly; difficult to anticipate external changes |
| Occasional: change by update | System enhancements such as model retraining, additions, newly added data | Requires reassessment after updates; challenging to distinguish desired vs. undesired changes |
| Certain: change by design | Self-learning and adaptation during operation | Continuous reassessment needed; high cost and complexity; limited explainability hampers detection |
The detailed set of implications is then shown in the following table, grouped by the impacted assessment function and implication category.
| Assessment Function | Implication Category | AI Impermanence–Related Implication |
|---|---|---|
| Determination | How to Assess the Consequences of Change: AI Impermanence Challenges Assessment Methods | No. 1: Undesired changes are challenging for assessors |
| | | No. 2: Assessors cannot anticipate or test (or can only to a limited extent) the external causes of AI impermanence |
| | | No. 3: Limited explainability of learning systems hampers reassessment in the case of change |
| Review & Attestation | How Reliable Are the Assessment Results: AI Impermanence Threatens Validity | No. 4: Assessors cannot anticipate and consider rare cases that can have widespread negative impacts |
| | | No. 5: Assessors need to control for reintroduced human bias over time |
| | | No. 6: Assessors can only limitedly declare systems’ conformity when the system is extended across regions or organizations |
| Surveillance | When and How to Reassess: AI Impermanence Impacts the AI Assessment Validity Period | No. 7: AI assessment validity depends on changes in user input and external conditions, not only on system changes |
| | | No. 8: AI systems are intended to change due to their self‑learning nature; surveillance must cope with substantial changes |