🎯 Top Personalized Recommendations
Leiden University
Why we think this paper is great for you:
This paper directly addresses compliance standards under the EU AI Act, which is crucial for understanding the regulatory landscape of AI. It tackles the practical challenges of demonstrating adherence for high-risk AI systems.
Abstract
Robustness is a key requirement for high-risk AI systems under the EU Artificial Intelligence Act (AI Act). However, both its definition and assessment methods remain underspecified, leaving providers with little concrete direction on how to demonstrate compliance. This stems from the Act's horizontal approach, which establishes general obligations applicable across all AI systems, but leaves the task of providing technical guidance to harmonised standards. This paper investigates what it means for AI systems to be robust and illustrates the need for context-sensitive standardisation. We argue that robustness is not a fixed property of a system, but depends on which aspects of performance are expected to remain stable ("robustness of what"), the perturbations the system must withstand ("robustness to what") and the operational environment. We identify three contextual drivers--use case, data and model--that shape the relevant perturbations and influence the choice of tests, metrics and benchmarks used to evaluate robustness. The need to provide at least a range of technical options that providers can assess and implement in light of the system's purpose is explicitly recognised by the standardisation request for the AI Act, but planned standards, still focused on horizontal coverage, do not yet offer this level of detail. Building on this, we propose a context-sensitive multi-layered standardisation framework where horizontal standards set common principles and terminology, while domain-specific ones identify risks across the AI lifecycle and guide appropriate practices, organised in a dynamic repository where providers can propose new informative methods and share lessons learned. Such a system reduces the interpretative burden, mitigates arbitrariness and addresses the obsolescence of static standards, ensuring that robustness assessment is both adaptable and operationally meaningful.
AI Summary
- The EU AI Act's horizontal approach to robustness standardisation is insufficient, leading to vague definitions and underspecified assessment methodologies for high-risk AI systems. [2]
- Robustness is not a fixed property but is inherently context-sensitive, determined by 'robustness of what' (performance aspects), 'robustness to what' (perturbations), and the specific operational environment. [2]
- Three critical contextual drivers—use case, data characteristics, and model architecture—must guide the identification of relevant perturbations and the selection of appropriate robustness tests, metrics, and benchmarks. [2]
- A multi-layered standardisation framework is proposed, combining horizontal principles with domain-specific provisions and a dynamic repository for best practices, benchmarks, and sandboxes, to provide actionable guidance. [2]
- The dynamic repository is crucial for addressing the rapid obsolescence of static standards, enabling continuous updates, stakeholder contributions, and shared learning on effective robustness assessment methods. [2]
- Even within similar application domains (e.g., medical image classification), subtle shifts in deployment assumptions necessitate drastically different robustness evaluation strategies (e.g., adversarial attacks vs. domain shift). [2]
- Effective robustness assessment requires a context-sensitive perturbation taxonomy, mapping potential threats across the AI system's lifecycle and linking them to specific mitigation strategies. [2]
- Context-sensitive robustness: The idea that robustness is not a fixed property but depends on the system's intended purpose, design, deployment environment, and the specific perturbations relevant for evaluation. [2]
- Robustness of what: The specific aspects of a system's performance (e.g., accuracy, precision, recall, F1-score) that are expected to remain stable under varying conditions. [2]
- Robustness to what: The types of input changes or perturbations (e.g., unseen, biased, adversarial, invalid data, environmental conditions) that a system is required to withstand (a minimal evaluation sketch follows below). [2]
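To make the "robustness of what" / "robustness to what" distinction concrete, here is a minimal illustrative Python sketch (not taken from the paper): it measures the performance aspects expected to stay stable (accuracy, macro F1) on clean data and under two hypothetical perturbations, Gaussian noise and a crude stand-in for domain shift. The model, dataset, and perturbation strengths are placeholder assumptions chosen only for illustration.

```python
# Illustrative only: "robustness of what" = accuracy and macro F1,
# "robustness to what" = Gaussian noise and a crude covariate shift.
# Model, data, and perturbation strengths are arbitrary placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

def evaluate(name, X_eval):
    """Report the performance aspects expected to remain stable."""
    pred = model.predict(X_eval)
    print(f"{name:>15}: acc={accuracy_score(y_test, pred):.3f} "
          f"macro-F1={f1_score(y_test, pred, average='macro'):.3f}")

# The perturbations the system must withstand ("robustness to what").
perturbations = {
    "clean": X_test,
    "gaussian noise": X_test + rng.normal(0, 0.5, X_test.shape),
    "domain shift": X_test * 1.5 + 0.3,  # crude covariate-shift stand-in
}
for name, X_eval in perturbations.items():
    evaluate(name, X_eval)
```

Which perturbations belong in such a table, and which metrics count as the stable quantity, is exactly what the paper argues must be set by the use case, data and model rather than by a one-size-fits-all standard.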
Bauplan Labs
Why we think this paper is great for you:
This paper explores the path to trustworthy AI, specifically focusing on governance within enterprise settings. It discusses foundational infrastructure challenges for ensuring reliable AI systems.
Abstract
Even as AI capabilities improve, most enterprises do not consider agents trustworthy enough to work on production data. In this paper, we argue that the path to trustworthy agentic workflows begins with solving the infrastructure problem first: traditional lakehouses are not suited for agent access patterns, but if we design one around transactions, governance follows. In particular, we draw an operational analogy to MVCC in databases and show why a direct transplant fails in a decoupled, multi-language setting. We then propose an agent-first design, Bauplan, that reimplements data and compute isolation in the lakehouse. We conclude by sharing a reference implementation of a self-healing pipeline in Bauplan, which seamlessly couples agent reasoning with all the desired guarantees for correctness and trust.
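As a rough illustration of the MVCC analogy the abstract draws, the toy Python sketch below gives each agent an isolated snapshot of a table and only commits its write if the version it read is still current; otherwise the commit is rejected and the agent must retry on a fresh snapshot. This is a generic optimistic-concurrency sketch for intuition only, and the names (`Catalog`, `Table`, `events`) are invented here; it is not Bauplan's actual API or design.

```python
# Toy optimistic-concurrency (MVCC-flavoured) sketch for agent writes.
# Generic illustration of the analogy only; not Bauplan's API or design.
from dataclasses import dataclass, field

@dataclass
class Table:
    version: int = 0
    rows: list = field(default_factory=list)

class Catalog:
    def __init__(self):
        self.tables = {"events": Table()}

    def snapshot(self, name):
        """An agent reads an immutable copy plus the version it read at."""
        t = self.tables[name]
        return t.version, list(t.rows)

    def commit(self, name, read_version, new_rows):
        """Commit succeeds only if nobody else wrote since the snapshot."""
        t = self.tables[name]
        if t.version != read_version:
            raise RuntimeError("write-write conflict: retry on a fresh snapshot")
        t.rows = new_rows
        t.version += 1

catalog = Catalog()
v, rows = catalog.snapshot("events")   # agent reads in isolation
rows.append({"status": "repaired"})    # agent transforms its private copy
catalog.commit("events", v, rows)      # commit succeeds: version matched
```

The point of the analogy is that isolation and versioned commits, not agent cleverness, are what make the write trustworthy; retry-on-conflict is the loop a self-healing pipeline would sit inside.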
UC Berkeley
Why we think this paper is great for you:
This paper offers principles for the responsible use of AI, directly aligning with your focus on ensuring ethical deployment. It provides a framework for navigating the societal impact of AI.
Abstract
Artificial intelligence (AI) is no longer futuristic; it is a daily companion shaping our private and work lives. While AI simplifies our lives, its rise also invites us to rethink who we are - and who we wish to remain - as humans. Even if AI does not think, feel, or desire, it learns from our behavior, mirroring our collective values, biases, and aspirations. The question, then, is not what AI is, but what we are allowing it to become through data, computing power, and other parameters "teaching" it - and, even more importantly, who we are becoming through our relationship with AI.
As the EU AI Act and the Vienna Manifesto on Digital Humanism emphasize, technology must serve human dignity, social well-being, and democratic accountability. In our opinion, responsible use of AI is not only a matter of code or law, but also of conscientious practice: how each of us engages and teaches others to use AI at home and at work. We propose Ten Commandments for the Wise and Responsible Use of AI, meant as a guideline for this very engagement. They closely align with Floridi and Cowls' five guiding principles for AI in society - beneficence, non-maleficence, autonomy, justice, and explicability.
McGill University
Why we think this paper is great for you:
This paper examines how Large Language Models are fundamentally reshaping organizational knowledge, which is highly relevant to understanding their broader impact. It explores the transformative effects of these powerful models within organizations.
Abstract
Large Language Models (LLMs) are reshaping organizational knowing by unsettling the epistemological foundations of representational and practice-based perspectives. We conceptualize LLMs as Haraway-ian monsters, that is, hybrid, boundary-crossing entities that destabilize established categories while opening new possibilities for inquiry. Focusing on analogizing as a fundamental driver of knowledge, we examine how LLMs generate connections through large-scale statistical inference. Analyzing their operation across the dimensions of surface/deep analogies and near/far domains, we highlight both their capacity to expand organizational knowing and the epistemic risks they introduce. Building on this, we identify three challenges of living with such epistemic monsters: the transformation of inquiry, the growing need for dialogical vetting, and the redistribution of agency. By foregrounding the entangled dynamics of knowing-with-LLMs, the paper extends organizational theory beyond human-centered epistemologies and invites renewed attention to how knowledge is created, validated, and acted upon in the age of intelligent technologies.
University College London
Why we think this paper is great for you:
This paper explores the practical applications and ethical considerations of ChatGPT, providing insights into how these models are used and the challenges they present. It is highly relevant to your interest in large language models and their responsible deployment.
Abstract
ChatGPT has been increasingly used in computer science, offering efficient support across software development tasks. While it helps students navigate programming challenges, its use also raises concerns about academic integrity and overreliance. Despite growing interest in this topic, prior research has largely relied on surveys, emphasizing trends over in-depth analysis of students' strategies and ethical awareness. This study complements existing work through a qualitative investigation of how computer science students in one UK institution strategically and ethically engage with ChatGPT in software development projects. Drawing on semi-structured interviews, it explores two key questions: How do computer science students ethically and strategically report using ChatGPT in software development projects? How do students understand and perceive the ethical issues associated with using ChatGPT in academic and professional contexts? Findings reveal a shift in students' learning models, moving from traditional "independent thinking-manual coding-iterative debugging" to "AI-assisted ideation-interactive programming-collaborative optimization." Importantly, many use ChatGPT conversationally to deepen understanding, while consciously reserving creative and high-level decision-making tasks for themselves. Students tend to cap ChatGPT's contribution to roughly 30%, and evaluate its output to mitigate overreliance. However, only a minority thoroughly analyze AI-generated code, raising concerns about reduced critical engagement. Meanwhile, students reject uncredited use, highlight risks such as privacy breaches and skill degradation, and call for clear usage guidelines set by their teachers. This research offers novel insights into the evolving learner-AI dynamic and highlights the need for explicit guidance to support responsible and pedagogically sound use of such tools.
Pontifical Catholic University
Why we think this paper is great for you:
This paper investigates the use of Large Language Models to assist qualitative research, offering insights into their practical application and methodological considerations. It directly addresses the opportunities and limitations of LLMs in analytical tasks.
Abstract
[Context] Large Language Models (LLMs) are increasingly used to assist qualitative research in Software Engineering (SE), yet the methodological implications of this usage remain underexplored. Their integration into interpretive processes such as thematic analysis raises fundamental questions about rigor, transparency, and researcher agency. [Objective] This study investigates how experienced SE researchers conceptualize the opportunities, risks, and methodological implications of integrating LLMs into thematic analysis. [Method] A reflective workshop with 25 ISERN researchers guided participants through structured discussions of LLM-assisted open coding, theme generation, and theme reviewing, using color-coded canvases to document perceived opportunities, limitations, and recommendations. [Results] Participants recognized potential efficiency and scalability gains, but highlighted risks related to bias, contextual loss, reproducibility, and the rapid evolution of LLMs. They also emphasized the need for prompting literacy and continuous human oversight. [Conclusion] Findings portray LLMs as tools that can support, but not substitute, interpretive analysis. The study contributes to ongoing community reflections on how LLMs can responsibly enhance qualitative research in SE.
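As a concrete, hypothetical example of the kind of LLM-assisted open coding step the workshop discussed, the Python sketch below sends an interview excerpt to a chat model and asks for candidate codes while leaving the vetting to a human researcher. The prompt wording, the model name, and the use of the `openai` client are assumptions for illustration, not the study's protocol.

```python
# Hypothetical sketch of LLM-assisted open coding with human review.
# Prompt wording and model name are illustrative assumptions, not the
# study's protocol; requires the `openai` package and an API key.
from openai import OpenAI

client = OpenAI()

def suggest_codes(excerpt: str) -> str:
    """Ask the model for candidate open codes; a researcher vets them."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "You assist with qualitative open coding. "
                        "Propose 3-5 short candidate codes with one-line "
                        "rationales. Do not interpret beyond the text."},
            {"role": "user", "content": excerpt},
        ],
    )
    return resp.choices[0].message.content

excerpt = "I only trust the generated code after I step through it myself."
print(suggest_codes(excerpt))
# A researcher then reviews, merges, or discards the suggested codes,
# keeping interpretive authority (continuous human oversight) with humans.
```

A pattern like this keeps the efficiency gain the participants recognized while preserving the human oversight and transparency they called for.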
Huawei
Why we think this paper is great for you:
This paper delves into the fundamental nature and limits of AI from an engineering perspective, offering a deeper understanding of the technology. It provides a foundational view that can inform your broader work with AI systems.
Abstract
As a capability arising from computation, how does AI differ fundamentally from the capabilities delivered by rule-based software programs? This paper examines the behavior of artificial intelligence (AI) from an engineering point of view to clarify its nature and limits. It argues that the rationality underlying humanity's impulse to pursue, articulate, and adhere to rules deserves to be valued and preserved. Identifying where rule-based practical rationality ends is the first step toward carrying that awareness into action. Although the rules governing AI behavior remain hidden or only weakly observable, the paper proposes a methodology that makes such discrimination possible and practical, distinguishing the behavior of AI models across three types of decisions. This is a prerequisite for exercising human responsibility among alternative possibilities when deciding how and when to use AI, and a solid starting point for ensuring the soundness of AI systems for the well-being of humans, society, and the environment.