Hi j34nc4rl0+ai_ethics,

Here is our personalized paper recommendations for you sorted by most relevant
AI Transparency
Paper visualization
Abstract
Current eXplainable AI (XAI) methods largely serve developers, often focusing on justifying model outputs rather than supporting diverse stakeholder needs. A recent shift toward Evaluative AI reframes explanation as a tool for hypothesis testing, but still focuses primarily on operational organizations. We introduce Holistic-XAI (H-XAI), a unified framework that integrates causal rating methods with traditional XAI methods to support explanation as an interactive, multi-method process. H-XAI allows stakeholders to ask a series of questions, test hypotheses, and compare model behavior against automatically constructed random and biased baselines. It combines instance-level and global explanations, adapting to each stakeholder's goals, whether understanding individual decisions, assessing group-level bias, or evaluating robustness under perturbations. We demonstrate the generality of our approach through two case studies spanning six scenarios: binary credit risk classification and financial time-series forecasting. H-XAI fills critical gaps left by existing XAI methods by combining causal ratings and post-hoc explanations to answer stakeholder-specific questions at both the individual decision level and the overall model level.
Abstract
Automated grading systems can efficiently score short-answer responses, yet they often fail to indicate when a grading decision is uncertain or potentially contentious. We introduce semantic entropy, a measure of variability across multiple GPT-4-generated explanations for the same student response, as a proxy for human grader disagreement. By clustering rationales via entailment-based similarity and computing entropy over these clusters, we quantify the diversity of justifications without relying on final output scores. We address three research questions: (1) Does semantic entropy align with human grader disagreement? (2) Does it generalize across academic subjects? (3) Is it sensitive to structural task features such as source dependency? Experiments on the ASAP-SAS dataset show that semantic entropy correlates with rater disagreement, varies meaningfully across subjects, and increases in tasks requiring interpretive reasoning. Our findings position semantic entropy as an interpretable uncertainty signal that supports more transparent and trustworthy AI-assisted grading workflows.
AI Bias
Paper visualization
Abstract
An emerging field of AI, namely Fair Machine Learning (ML), aims to quantify different types of bias (also known as unfairness) exhibited in the predictions of ML algorithms, and to design new algorithms to mitigate them. Often, the definitions of bias used in the literature are observational, i.e. they use the input and output of a pre-trained algorithm to quantify a bias under concern. In reality,these definitions are often conflicting in nature and can only be deployed if either the ground truth is known or only in retrospect after deploying the algorithm. Thus,there is a gap between what we want Fair ML to achieve and what it does in a dynamic social environment. Hence, we propose an alternative dynamic mechanism,"Fair Game",to assure fairness in the predictions of an ML algorithm and to adapt its predictions as the society interacts with the algorithm over time. "Fair Game" puts together an Auditor and a Debiasing algorithm in a loop around an ML algorithm. The "Fair Game" puts these two components in a loop by leveraging Reinforcement Learning (RL). RL algorithms interact with an environment to take decisions, which yields new observations (also known as data/feedback) from the environment and in turn, adapts future decisions. RL is already used in algorithms with pre-fixed long-term fairness goals. "Fair Game" provides a unique framework where the fairness goals can be adapted over time by only modifying the auditor and the different biases it quantifies. Thus,"Fair Game" aims to simulate the evolution of ethical and legal frameworks in the society by creating an auditor which sends feedback to a debiasing algorithm deployed around an ML system. This allows us to develop a flexible and adaptive-over-time framework to build Fair ML systems pre- and post-deployment.
Abstract
Conformal predictors are machine learning algorithms developed in the 1990's by Gammerman, Vovk, and their research team, to provide set predictions with guaranteed confidence level. Over recent years, they have grown in popularity and have become a mainstream methodology for uncertainty quantification in the machine learning community. From its beginning, there was an understanding that they enable reliable machine learning with well-calibrated uncertainty quantification. This makes them extremely beneficial for developing trustworthy AI, a topic that has also risen in interest over the past few years, in both the AI community and society more widely. In this article, we review the potential for conformal prediction to contribute to trustworthy AI beyond its marginal validity property, addressing problems such as generalization risk and AI governance. Experiments and examples are also provided to demonstrate its use as a well-calibrated predictor and for bias identification and mitigation.
AI Ethics
Abstract
The conceptual framework proposed in this paper centers on the development of a deliberative moral reasoning system - one designed to process complex moral situations by generating, filtering, and weighing normative arguments drawn from diverse ethical perspectives. While the framework is rooted in Machine Ethics, it also makes a substantive contribution to Value Alignment by outlining a system architecture that links structured moral reasoning to action under time constraints. Grounded in normative moral pluralism, this system is not constructed to imitate behavior but is built on reason-sensitive deliberation over structured moral content in a transparent and principled manner. Beyond its role as a deliberative system, it also serves as the conceptual foundation for a novel two-level architecture: functioning as a moral reasoning teacher envisioned to train faster models that support real-time responsiveness without reproducing the full structure of deliberative reasoning. Together, the deliberative and intuitive components are designed to enable both deep reflection and responsive action. A key design feature is the dual-hybrid structure: a universal layer that defines a moral threshold through top-down and bottom-up learning, and a local layer that learns to weigh competing considerations in context while integrating culturally specific normative content, so long as it remains within the universal threshold. By extending the notion of moral complexity to include not only conflicting beliefs but also multifactorial dilemmas, multiple stakeholders, and the integration of non-moral considerations, the framework aims to support morally grounded decision-making in realistic, high-stakes contexts.
Abstract
As AI systems become increasingly embedded in organizational workflows and consumer applications, ethical principles such as fairness, transparency, and robustness have been widely endorsed in policy and industry guidelines. However, there is still scarce empirical evidence on whether these principles are recognized, valued, or impactful from the perspective of users. This study investigates the link between ethical AI and user satisfaction by analyzing over 100,000 user reviews of AI products from G2. Using transformer-based language models, we measure sentiment across seven ethical dimensions defined by the EU Ethics Guidelines for Trustworthy AI. Our findings show that all seven dimensions are positively associated with user satisfaction. Yet, this relationship varies systematically across user and product types. Technical users and reviewers of AI development platforms more frequently discuss system-level concerns (e.g., transparency, data governance), while non-technical users and reviewers of end-user applications emphasize human-centric dimensions (e.g., human agency, societal well-being). Moreover, the association between ethical AI and user satisfaction is significantly stronger for non-technical users and end-user applications across all dimensions. Our results highlight the importance of ethical AI design from users' perspectives and underscore the need to account for contextual differences across user roles and product types.
AI Fairness
Abstract
Social determinants are variables that, while not directly pertaining to any specific individual, capture key aspects of contexts and environments that have direct causal influences on certain attributes of an individual. Previous algorithmic fairness literature has primarily focused on sensitive attributes, often overlooking the role of social determinants. Our paper addresses this gap by introducing formal and quantitative rigor into a space that has been shaped largely by qualitative proposals regarding the use of social determinants. To demonstrate theoretical perspectives and practical applicability, we examine a concrete setting of college admissions, using region as a proxy for social determinants. Our approach leverages a region-based analysis with Gamma distribution parameterization to model how social determinants impact individual outcomes. Despite its simplicity, our method quantitatively recovers findings that resonate with nuanced insights in previous qualitative debates, that are often missed by existing algorithmic fairness approaches. Our findings suggest that mitigation strategies centering solely around sensitive attributes may introduce new structural injustice when addressing existing discrimination. Considering both sensitive attributes and social determinants facilitates a more comprehensive explication of benefits and burdens experienced by individuals from diverse demographic backgrounds as well as contextual environments, which is essential for understanding and achieving fairness effectively and transparently.
Abstract
This study explores perceptions of fairness in algorithmic decision-making among users in Bangladesh through a comprehensive mixed-methods approach. By integrating quantitative survey data with qualitative interview insights, we examine how cultural, social, and contextual factors influence users' understanding of fairness, transparency, and accountability in AI systems. Our findings reveal nuanced attitudes toward human oversight, explanation mechanisms, and contestability, highlighting the importance of culturally aware design principles for equitable and trustworthy algorithmic systems. These insights contribute to ongoing discussions on algorithmic fairness by foregrounding perspectives from a non-Western context, thus broadening the global dialogue on ethical AI deployment.
Unsubscribe from these updates