🎯 Top Personalized Recommendations
Fraunhofer Institute for
AI Summary - The Stereotype Detection and Assessment component detects explicit stereotypes while remaining agnostic with regard to sensitive attributes and different text sources. [3]
- The method relies on pre-trained language models, which can be biased themselves. [3]
- The pipeline is designed to detect and mitigate bias in text data, including representation bias and stereotypes. [2]
Abstract
Textual data used to train large language models (LLMs) exhibits multifaceted bias manifestations encompassing harmful language and skewed demographic distributions. Regulations such as the European AI Act require identifying and mitigating biases against protected groups in data, with the ultimate goal of preventing unfair model outputs. However, practical guidance and operationalization are lacking. We propose a comprehensive data bias detection and mitigation pipeline comprising four components that address two data bias types, namely representation bias and (explicit) stereotypes for a configurable sensitive attribute. First, we leverage LLM-generated word lists created based on quality criteria to detect relevant group labels. Second, representation bias is quantified using the Demographic Representation Score. Third, we detect and mitigate stereotypes using sociolinguistically informed filtering. Finally, we compensate representation bias through Grammar- and Context-Aware Counterfactual Data Augmentation. We conduct a two-fold evaluation using the examples of gender, religion and age. First, the effectiveness of each individual component on data debiasing is evaluated through human validation and baseline comparison. The findings demonstrate that we successfully reduce representation bias and (explicit) stereotypes in a text dataset. Second, the effect of data debiasing on model bias reduction is evaluated by bias benchmarking of several models (0.6B-8B parameters), fine-tuned on the debiased text dataset. This evaluation reveals that LLMs fine-tuned on debiased data do not consistently show improved performance on bias benchmarks, exposing critical gaps in current evaluation methodologies and highlighting the need for targeted data manipulation to address manifested model bias.
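The abstract does not define the Demographic Representation Score in detail; as a rough illustration of the general idea, the sketch below compares the observed frequency of group-label mentions against a uniform reference using total-variation distance. The function name, word lists, and distance choice are assumptions, not the paper's exact formulation.

```python
from collections import Counter
from typing import Dict, List

def demographic_representation_score(
    tokens: List[str],
    group_word_lists: Dict[str, List[str]],
) -> float:
    """Toy representation-bias score: total-variation distance between the
    observed distribution of group-label mentions and a uniform reference.
    0.0 means balanced representation; values near 1.0 mean one group
    dominates. (Illustrative only; the paper's exact definition may differ.)"""
    word_to_group = {w.lower(): g for g, ws in group_word_lists.items() for w in ws}
    counts = Counter(word_to_group[t.lower()] for t in tokens if t.lower() in word_to_group)
    groups = list(group_word_lists)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no group labels found -> nothing to measure
    uniform = 1.0 / len(groups)
    observed = [counts.get(g, 0) / total for g in groups]
    return 0.5 * sum(abs(p - uniform) for p in observed)

# Example for the sensitive attribute "gender" (hypothetical word lists):
corpus = "The doctor said he was late ; the nurse said she agreed ; he left".split()
lists = {"male": ["he", "him", "man"], "female": ["she", "her", "woman"]}
print(demographic_representation_score(corpus, lists))  # ~0.17, a mild skew toward "male"
```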
Why we think this paper is great for you:
This paper directly addresses bias in textual data, a key concern given the user's interests in data bias and AI fairness. The focus on detection and mitigation aligns with the need to understand and address potential harms caused by biased language.
East China Normal University
AI Summary - IFFair uses the influence function to reweight the training data and reduce bias. [3]
- Influence function: a measure of how much each data point affects the model's predictions. [3]
- Bias: the systematic error introduced by the model due to its design or training data. [3]
- The paper proposes a new method for achieving fairness in machine learning models, called IFFair. [2]
Abstract
Because machine learning has significantly improved efficiency and convenience in the society, it's increasingly used to assist or replace human decision-making. However, the data-based pattern makes related algorithms learn and even exacerbate potential bias in samples, resulting in discriminatory decisions against certain unprivileged groups, depriving them of the rights to equal treatment, thus damaging the social well-being and hindering the development of related applications. Therefore, we propose a pre-processing method IFFair based on the influence function. Compared with other fairness optimization approaches, IFFair only uses the influence disparity of training samples on different groups as a guidance to dynamically adjust the sample weights during training without modifying the network structure, data features and decision boundaries. To evaluate the validity of IFFair, we conduct experiments on multiple real-world datasets and metrics. The experimental results show that our approach mitigates bias of multiple accepted metrics in the classification setting, including demographic parity, equalized odds, equality of opportunity and error rate parity without conflicts. It also demonstrates that IFFair achieves better trade-off between multiple utility and fairness metrics compared with previous pre-processing methods.
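IFFair's influence computation and weight-update rule are not reproduced here; the sketch below only illustrates the general pre-processing idea of reweighting training samples by their influence disparity across groups, assuming per-sample influence estimates on each group's loss are already available. The exponential update and all names are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def reweight_by_influence_disparity(
    infl_group_a: np.ndarray,   # estimated influence of each training sample on group A's loss
    infl_group_b: np.ndarray,   # estimated influence of each training sample on group B's loss
    step: float = 0.5,
) -> np.ndarray:
    """Illustrative reweighting in the spirit of influence-guided pre-processing:
    samples whose influence widens the gap between the two groups' losses get
    smaller weights; samples that narrow it get larger weights."""
    disparity = infl_group_a - infl_group_b          # per-sample contribution to the loss gap
    gap_sign = np.sign(disparity.sum())              # direction of the current aggregate gap
    weights = np.exp(-step * gap_sign * disparity)   # multiplicative-weights style update
    return weights / weights.sum() * len(weights)    # normalize to mean weight 1.0

# Toy usage with random "influence" estimates for 8 training samples.
rng = np.random.default_rng(0)
w = reweight_by_influence_disparity(rng.normal(size=8), rng.normal(size=8))
print(w.round(2))
```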
Why we think this paper is great for you:
The paper tackles bias through a specific technique, influence functions, which is relevant to the user's interest in mitigating bias in machine learning algorithms. This approach offers a concrete method for addressing fairness issues.
CEA Saclay
AI Summary - Technological solutionism: The tendency to view complex problems as solvable through technological fixes rather than addressing underlying issues. [3]
- Ethics-by-design is a powerful approach that can significantly improve the ethics readiness of AI systems. [2]
Abstract
We present Ethics Readiness Levels (ERLs), a four-level, iterative method to track how ethical reflection is implemented in the design of AI systems. ERLs bridge high-level ethical principles and everyday engineering by turning ethical values into concrete prompts, checks, and controls within real use cases. The evaluation is conducted using a dynamic, tree-like questionnaire built from context-specific indicators, ensuring relevance to the technology and application domain. Beyond being a managerial tool, ERLs help facilitate a structured dialogue between ethics experts and technical teams, while our scoring system helps track progress over time. We demonstrate the methodology through two case studies: an AI facial sketch generator for law enforcement and a collaborative industrial robot. The ERL tool effectively catalyzes concrete design changes and promotes a shift from narrow technological solutionism to a more reflective, ethics-by-design mindset.
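The abstract only gestures at the tree-like questionnaire and scoring system, so the following is a loose sketch of that bookkeeping under assumed conventions (binary indicator answers, subtree-average scoring, illustrative level thresholds); it is not the ERL instrument itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Indicator:
    """One node of a tree-like ethics questionnaire (hypothetical structure)."""
    question: str
    satisfied: Optional[bool] = None       # None = not yet assessed
    children: List["Indicator"] = field(default_factory=list)

    def score(self) -> float:
        """Fraction of assessed indicators in this subtree that are satisfied."""
        nodes, stack = [], [self]
        while stack:
            node = stack.pop()
            nodes.append(node)
            stack.extend(node.children)
        assessed = [n for n in nodes if n.satisfied is not None]
        return sum(n.satisfied for n in assessed) / len(assessed) if assessed else 0.0

def ethics_readiness_level(root: Indicator) -> int:
    """Map a subtree score onto four discrete levels (thresholds are illustrative)."""
    s = root.score()
    return 1 if s < 0.25 else 2 if s < 0.5 else 3 if s < 0.75 else 4

root = Indicator("Privacy by design", children=[
    Indicator("Is personal data minimized?", satisfied=True),
    Indicator("Is there a face-data retention policy?", satisfied=False),
])
print(ethics_readiness_level(root))  # 3 with these toy answers
```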
Why we think this paper is great for you:
Given the user's interest in AI ethics, this paper's focus on practical evaluation methods is highly relevant. It provides a framework for systematically assessing ethical considerations within AI development.
Carnegie Mellon
AI Summary - The dataset consists of 49,725 papers from various conferences related to artificial intelligence (AI) safety and ethics. [3]
- Abstract enrichment coverage: The percentage of papers with abstracts. [3]
- Keywords were generated by analyzing foundational surveys and texts in each field, with a hierarchical strategy spanning technical, theoretical, and applied domains. [2]
- The abstract enrichment coverage is 97.7%, indicating that most papers have abstracts. [1]
Abstract
While much research in artificial intelligence (AI) has focused on scaling capabilities, the accelerating pace of development makes countervailing work on producing harmless, "aligned" systems increasingly urgent. Yet research on alignment has diverged along two largely parallel tracks: safety--centered on scaled intelligence, deceptive or scheming behaviors, and existential risk--and ethics--focused on present harms, the reproduction of social bias, and flaws in production pipelines. Although both communities warn of insufficient investment in alignment, they disagree on what alignment means or ought to mean. As a result, their efforts have evolved in relative isolation, shaped by distinct methodologies, institutional homes, and disciplinary genealogies.
We present a large-scale, quantitative study showing the structural split between AI safety and AI ethics. Using a bibliometric and co-authorship network analysis of 6,442 papers from twelve major ML and NLP conferences (2020-2025), we find that over 80% of collaborations occur within either the safety or ethics communities, and cross-field connectivity is highly concentrated: roughly 5% of papers account for more than 85% of bridging links. Removing a small number of these brokers sharply increases segregation, indicating that cross-disciplinary exchange depends on a handful of actors rather than broad, distributed collaboration. These results show that the safety-ethics divide is not only conceptual but institutional, with implications for research agendas, policy, and venues. We argue that integrating technical safety work with normative ethics--via shared benchmarks, cross-institutional venues, and mixed-method methodologies--is essential for building AI systems that are both robust and just.
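The bridging statistics suggest a straightforward network computation. The sketch below reproduces the idea on a toy co-authorship graph (six invented authors, not the paper's corpus): count the fraction of edges that cross the safety/ethics labels, then remove a broker node and observe the communities disconnect.

```python
import networkx as nx

# Toy co-authorship graph; the node attribute "field" plays the role of the
# safety/ethics community label used in the paper's bibliometric analysis.
G = nx.Graph()
G.add_nodes_from(["a1", "a2", "a3"], field="safety")
G.add_nodes_from(["b1", "b2", "b3"], field="ethics")
G.add_edges_from([("a1", "a2"), ("a2", "a3"), ("b1", "b2"), ("b2", "b3"),
                  ("a3", "b1")])          # a3-b1 is the lone cross-field "bridge"

def cross_field_fraction(g: nx.Graph) -> float:
    """Share of collaborations that connect the safety and ethics communities."""
    cross = sum(1 for u, v in g.edges if g.nodes[u]["field"] != g.nodes[v]["field"])
    return cross / g.number_of_edges()

print(f"cross-field collaborations: {cross_field_fraction(G):.0%}")  # 20%

# Removing the few "broker" endpoints of bridging edges disconnects the fields,
# mirroring the segregation effect the paper reports.
H = G.copy()
H.remove_nodes_from(["a3"])               # drop one broker
print(nx.number_connected_components(H))  # 2: the communities fall apart
```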
Why we think this paper is great for you:
This paper directly addresses the critical need to integrate safety and ethical considerations in AI research, aligning with the user's broad interest in responsible AI development.
University of Calgary
AI Summary - The study found that fairness requirements in AI are often context-dependent and tailored to specific applications rather than a universal metric. [2]
Abstract
Today, with the growing obsession with applying Artificial Intelligence (AI), particularly Machine Learning (ML), to software across various contexts, much of the focus has been on the effectiveness of AI models, often measured through common metrics such as F1-score, while fairness receives relatively little attention. This paper presents a review of existing gray literature, examining fairness requirements in AI context, with a focus on how they are defined across various application domains, managed throughout the Software Development Life Cycle (SDLC), and the causes, as well as the corresponding consequences of their violation by AI models. Our gray literature investigation shows various definitions of fairness requirements in AI systems, commonly emphasizing non-discrimination and equal treatment across different demographic and social attributes. Fairness requirement management practices vary across the SDLC, particularly in model training and bias mitigation, fairness monitoring and evaluation, and data handling practices. Fairness requirement violations are frequently linked, but not limited, to data representation bias, algorithmic and model design bias, human judgment, and evaluation and transparency gaps. The corresponding consequences include harm in a broad sense, encompassing specific professional and societal impacts as key examples, stereotype reinforcement, data and privacy risks, and loss of trust and legitimacy in AI-supported decisions. These findings emphasize the need for consistent frameworks and practices to integrate fairness into AI software, paying as much attention to fairness as to effectiveness.
Why we think this paper is great for you:
The paper's investigation into fairness requirements within AI-enabled software engineering is pertinent to the user's interest in data fairness and the broader ethical implications of AI systems.
Indian Institute of Technology
AI Summary - The proposed method is compared with several baseline methods, including kNN-SC, ε-SC, kNN-FSC, ε-FSC, RGC, SGL, CDC, and SFC. [3]
- The paper provides a comprehensive evaluation of the proposed method, including experiments with different values of k and ε, as well as experiments with different datasets. [3]
- The paper proposes a new method for fair graph construction, which aims to reduce disparate impact in clustering algorithms. [2]
- The results show that the choice of α can significantly affect the performance of the method. [1]
Abstract
Graph clustering plays a pivotal role in unsupervised learning methods like spectral clustering, yet traditional methods for graph clustering often perpetuate bias through unfair graph constructions that may underrepresent some groups. The current research introduces novel approaches for constructing fair k-nearest neighbor (kNN) and fair epsilon-neighborhood graphs that proactively enforce demographic parity during graph formation. By incorporating fairness constraints at the earliest stage of neighborhood selection steps, our approaches incorporate proportional representation of sensitive features into the local graph structure while maintaining geometric consistency. Our work addresses a critical gap in pre-processing for fair spectral clustering, demonstrating that topological fairness in graph construction is essential for achieving equitable clustering outcomes. Widely used graph construction methods like kNN and epsilon-neighborhood graphs propagate edge based disparate impact on sensitive groups, leading to biased clustering results. Providing representation of each sensitive group in the neighborhood of every node leads to fairer spectral clustering results because the topological features of the graph naturally reflect equitable group ratios. This research fills an essential shortcoming in fair unsupervised learning, by illustrating how topological fairness in graph construction inherently facilitates fairer spectral clustering results without the need for changes to the clustering algorithm itself. Thorough experiments on three synthetic datasets, seven real-world tabular datasets, and three real-world image datasets prove that our fair graph construction methods surpass the current baselines in graph clustering tasks.
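The abstract's core idea, enforcing proportional group representation in each node's neighborhood, can be sketched simply: split the neighbor budget k across sensitive groups in proportion to their dataset shares and take the nearest neighbors within each group. The allocation and tie-breaking below are assumptions, not the authors' exact construction.

```python
import numpy as np

def fair_knn_edges(X: np.ndarray, groups: np.ndarray, k: int = 4):
    """Build kNN edges where each node's neighbors reflect the overall group
    proportions (an illustrative fairness-aware neighborhood selection; the
    paper's construction may allocate and break ties differently)."""
    n = len(X)
    uniq, counts = np.unique(groups, return_counts=True)
    # Per-group neighbor budget, proportional to group size (at least 1 each).
    budget = {g: max(1, int(round(k * c / n))) for g, c in zip(uniq, counts)}
    edges = []
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                       # exclude self-loops
        for g, b in budget.items():
            candidates = np.where(groups == g)[0]
            candidates = candidates[candidates != i]
            nearest = candidates[np.argsort(dists[candidates])[:b]]
            edges.extend((i, int(j)) for j in nearest)
    return edges

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 2))
groups = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # two equally sized sensitive groups
print(len(fair_knn_edges(X, groups, k=4)))          # 10 nodes x (2 + 2) neighbors = 40 edges
```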
Why we think this paper is great for you:
This paper addresses bias through graph construction, a key area given the user's interest in data representation and fairness in unsupervised learning methods.
Stanford University
AI Summary - Foundation Model: A type of artificial intelligence model that is pre-trained on a large dataset and can be fine-tuned for specific tasks. [3]
- The 2025 Foundation Model Transparency Index (FMTI) is a comprehensive framework for evaluating the transparency of foundation models. [2]
- Upstream: Refers to the development process of foundation models, including model architecture, training data, and evaluation metrics. [1]
Abstract
Foundation model developers are among the world's most important companies. As these companies become increasingly consequential, how do their transparency practices evolve? The 2025 Foundation Model Transparency Index is the third edition of an annual effort to characterize and quantify the transparency of foundation model developers. The 2025 FMTI introduces new indicators related to data acquisition, usage data, and monitoring and evaluates companies like Alibaba, DeepSeek, and xAI for the first time. The 2024 FMTI reported that transparency was improving, but the 2025 FMTI finds this progress has deteriorated: the average score out of 100 fell from 58 in 2024 to 40 in 2025. Companies are most opaque about their training data and training compute as well as the post-deployment usage and impact of their flagship models. In spite of this general trend, IBM stands out as a positive outlier, scoring 95, in contrast to the lowest scorers, xAI and Midjourney, at just 14. The five members of the Frontier Model Forum we score end up in the middle of the Index: we posit that these companies avoid reputational harms from low scores but lack incentives to be transparency leaders. As policymakers around the world increasingly mandate certain types of transparency, this work reveals the current state of transparency for foundation model developers, how it may change given newly enacted policy, and where more aggressive policy interventions are necessary to address critical information deficits.
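The 2025 scoring details are not given here; as a rough illustration consistent with earlier FMTI editions (binary indicators aggregated to a 0-100 score), a developer's score is simply the share of satisfied indicators. The indicator names below are invented placeholders, not the Index's actual indicator set.

```python
# Hypothetical indicator assessments for one developer: True = disclosed.
indicators = {
    "training data sources disclosed": False,
    "training compute disclosed": False,
    "data acquisition practices disclosed": True,
    "usage data reporting": False,
    "post-deployment monitoring described": True,
}

score = 100 * sum(indicators.values()) / len(indicators)
print(f"transparency score: {score:.0f}/100")  # 40/100 for this toy profile
```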
Why we think this paper is great for you:
Given the user's interest in AI transparency, this paper's focus on quantifying transparency practices of foundation model developers is directly relevant and valuable.
Data Representation
TIB Hannover
Abstract
For scientific knowledge to be findable, accessible, interoperable, and reusable, it needs to be machine-readable. Moving forward from post-publication extraction of knowledge, we adopted a pre-publication approach to write research findings in a machine-readable format at early stages of data analysis. For this purpose, we developed the package dtreg in Python and R. Registered and persistently identified data types, aka schemata, which dtreg applies to describe data analysis in a machine-readable format, cover the most widely used statistical tests and machine learning methods. The package supports (i) downloading a relevant schema as a mutable instance of a Python or R class, (ii) populating the instance object with metadata about data analysis, and (iii) converting the object into a lightweight Linked Data format. This paper outlines the background of our approach, explains the code architecture, and illustrates the functionality of dtreg with a machine-readable description of a t-test on Iris Data. We suggest that the dtreg package can enhance the methodological repertoire of researchers aiming to adhere to the FAIR principles.
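The dtreg API itself is not shown in the abstract, so the sketch below deliberately avoids it and only illustrates the intended end product: a lightweight Linked Data (JSON-LD) description of a t-test on the Iris data, built with the standard library. The @context URI, keys, and numbers are placeholders, not dtreg's registered schemata.

```python
import json

# Hypothetical JSON-LD description of a two-sample t-test on Iris sepal length;
# the @context/@type values are placeholders, not dtreg's persistently identified data types.
ttest_description = {
    "@context": {"schema": "https://example.org/datatype/"},
    "@type": "schema:GroupComparison",
    "label": "Welch two-sample t-test, Iris sepal length by species",
    "targets": {"dependent_variable": "sepal_length", "grouping_variable": "species"},
    "method": {"name": "t-test", "variant": "Welch", "alternative": "two-sided"},
    "output": {"statistic": 10.52, "p_value": 3.7e-17, "df": 86.5},  # illustrative numbers
}

print(json.dumps(ttest_description, indent=2))
```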
AI Summary - Some sections of the paper seem disconnected from the main topic. [3]
- Topics: FAIR data management and stewardship; scientific workflow management; machine-readable expressions of research findings. The paper could benefit from a clearer explanation of the technical aspects of machine-readable expressions of research findings. [2]
- Imagine you're working on a scientific project and you want to share your results with others. [1]
TU Dortmund University
Abstract
Craig interpolation and uniform interpolation have many applications in knowledge representation, including explainability, forgetting, modularization and reuse, and even learning. At the same time, many relevant knowledge representation formalisms do in general not have Craig or uniform interpolation, and computing interpolants in practice is challenging. We have a closer look at two prominent knowledge representation formalisms, description logics and logic programming, and discuss theoretical results and practical methods for computing interpolants.
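As a concrete reminder of what a Craig interpolant is (a propositional toy example, not the description-logic or logic-programming machinery the paper studies): if φ = p ∧ q entails ψ = q ∨ r, then θ = q is an interpolant, since it uses only the shared symbol q, is entailed by φ, and entails ψ. The brute-force check below verifies this over all truth assignments.

```python
from itertools import product

# Propositional toy example: phi = p AND q, psi = q OR r, interpolant theta = q.
phi   = lambda p, q, r: p and q
psi   = lambda p, q, r: q or r
theta = lambda p, q, r: q            # mentions only the shared variable q

def entails(f, g) -> bool:
    """f entails g iff every assignment satisfying f also satisfies g."""
    return all(g(*v) for v in product([False, True], repeat=3) if f(*v))

assert entails(phi, psi)             # the implication being interpolated
assert entails(phi, theta) and entails(theta, psi)
print("q is a Craig interpolant for (p AND q) -> (q OR r)")
```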
AI Summary - Inseparability: An ontology O1 is inseparable from another ontology O2 with respect to a signature Σ if every concept inclusion that can be expressed in the DL using only symbols from Σ is entailed by O1 exactly when it is entailed by O2. [3]
- Conservative extensions: A conservative extension of an ontology O1 is an ontology O2 such that O1 ⊆ O2 and O1 ≡^L_sig(O1) O2, i.e., O1 and O2 are L-inseparable with respect to sig(O1). [3]
- Uniform interpolation in description logic (DL) is a concept that has been extensively studied and applied in various areas, including knowledge representation, artificial intelligence, and computer science. [2]
- Uniform interpolation can be used for various applications, such as computing conservative extensions, logical differences, and modules. [1]
AI Transparency
Chung-Ang University
Abstract
Graph neural networks (GNNs) are increasingly used to model complex patterns in graph-structured data. However, enabling them to "forget" designated information remains challenging, especially under privacy regulations such as the GDPR. Existing unlearning methods largely optimize for efficiency and scalability, yet they offer little transparency, and the black-box nature of GNNs makes it difficult to verify whether forgetting has truly occurred. We propose an explainability-driven verifier for GNN unlearning that snapshots the model before and after deletion, using attribution shifts and localized structural changes (for example, graph edit distance) as transparent evidence. The verifier uses five explainability metrics: residual attribution, heatmap shift, explainability score deviation, graph edit distance, and a diagnostic graph rule shift. We evaluate two backbones (GCN, GAT) and four unlearning strategies (Retrain, GraphEditor, GNNDelete, IDEA) across five benchmarks (Cora, Citeseer, Pubmed, Coauthor-CS, Coauthor-Physics). Results show that Retrain and GNNDelete achieve near-complete forgetting, GraphEditor provides partial erasure, and IDEA leaves residual signals. These explanation deltas provide the primary, human-readable evidence of forgetting; we also report membership-inference ROC-AUC as a complementary, graph-wide privacy signal.
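The verifier's five metrics are only named in the abstract; the sketch below illustrates the snapshot-and-compare principle under assumptions: given per-node attribution vectors computed before and after deletion (by any GNN explainer), measure how much the explanation shifts on the deleted nodes versus the retained ones.

```python
import numpy as np

def attribution_shift_report(attr_before: np.ndarray,
                             attr_after: np.ndarray,
                             deleted_nodes: np.ndarray) -> dict:
    """Toy 'explanation delta' in the spirit of the proposed verifier: compare
    per-node attribution snapshots taken before and after unlearning. Large,
    localized shifts on the deleted nodes, plus small residual attribution on
    them afterwards, are treated as evidence of forgetting. The paper's actual
    metrics (heatmap shift, graph edit distance, etc.) are richer than this."""
    delta = np.abs(attr_after - attr_before)
    mask = np.zeros(len(attr_before), dtype=bool)
    mask[deleted_nodes] = True
    return {
        "shift_on_deleted": float(delta[mask].mean()),
        "shift_on_retained": float(delta[~mask].mean()),
        "residual_attr_on_deleted": float(np.abs(attr_after[mask]).mean()),
    }

rng = np.random.default_rng(2)
before = rng.random(100)                       # attributions from the original model
after = before.copy()
deleted = np.array([3, 17, 42])
after[deleted] = 0.01                          # a model that truly forgot these nodes
print(attribution_shift_report(before, after, deleted))
```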
AI Summary - The paper proposes a method for transparent verification of Graph Neural Network (GNN) unlearning, which is essential for ensuring data privacy and compliance with regulations such as the General Data Protection Regulation (GDPR). [3]
- The authors use publicly available graph benchmarks and do not involve collecting or processing personal data, human-subjects research, or deployment to end-users. [3]
- The paper discusses the importance of explainability in GNNs and proposes a framework for generating proxy graphs that can be used to explain the behavior of GNNs. [3]
- Graph Neural Network (GNN): A type of neural network designed to handle graph-structured data. [3]
- GNNs are widely used in various applications, including social network analysis, recommendation systems, and traffic prediction. [3]
- Unlearning: The process of removing or updating the knowledge learned by a machine learning model when new information becomes available that contradicts the existing knowledge. [3]
- Unlearning is essential for ensuring data privacy and compliance with regulations such as GDPR. [3]
- The authors use publicly available graph benchmarks, but it is unclear whether these benchmarks are representative of real-world datasets. [3]
- The proxy-graph framework is based on the idea of generating graphs that are similar to the original graph but with some modifications to make them more interpretable. [3]
JP Morgan
Abstract
As AI agents increasingly operate in real-world, multi-agent environments, ensuring reliable and context-aware privacy in agent communication is critical, especially to comply with evolving regulatory requirements. Traditional access controls are insufficient, as privacy risks often arise after access is granted; agents may use information in ways that compromise privacy, such as messaging humans, sharing context with other agents, making tool calls, persisting data, or generating derived private information. Existing approaches often treat privacy as a binary constraint, whether data is shareable or not, overlooking nuanced, role-specific, and computation-dependent privacy needs essential for regulatory compliance.
Agents, including those based on large language models, are inherently probabilistic and heuristic. There is no formal guarantee of how an agent will behave for any query, making them ill-suited for operations critical to security. To address this, we introduce AgentCrypt, a four-tiered framework for fine-grained, encrypted agent communication that adds a protection layer atop any AI agent platform. AgentCrypt spans unrestricted data exchange (Level 1) to fully encrypted computation using techniques such as homomorphic encryption (Level 4). Crucially, it guarantees the privacy of tagged data is always maintained, prioritizing privacy above correctness.
AgentCrypt ensures privacy across diverse interactions and enables computation on otherwise inaccessible data, overcoming barriers such as data silos. We implemented and tested it with Langgraph and Google ADK, demonstrating versatility across platforms. We also introduce a benchmark dataset simulating privacy-critical tasks at all privacy levels, enabling systematic evaluation and fostering the development of regulatable machine learning systems for secure agent communication and computation.
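The four tiers are only named in the abstract, so the following is a hypothetical guard illustrating the stated principle that tagged data must never leave via a channel below its required protection level, with privacy taking precedence over task completion; the intermediate tier names are invented.

```python
from enum import IntEnum

class PrivacyLevel(IntEnum):
    """Hypothetical mapping of the four AgentCrypt tiers described in the abstract."""
    UNRESTRICTED = 1        # Level 1: free data exchange
    CONTEXT_SCOPED = 2      # illustrative intermediate tier (name assumed)
    REDACTED_SHARING = 3    # illustrative intermediate tier (name assumed)
    ENCRYPTED_COMPUTE = 4   # Level 4: computation only under encryption (e.g., homomorphic)

def release(payload: str, tag: PrivacyLevel, channel_level: PrivacyLevel) -> str:
    """Release data to a channel only if the channel meets the tag's level;
    otherwise refuse, i.e., prefer privacy over correctness/task completion."""
    if channel_level >= tag:
        return payload
    raise PermissionError(
        f"blocked: payload tagged {tag.name} cannot leave via a {channel_level.name} channel"
    )

print(release("public doc", PrivacyLevel.UNRESTRICTED, PrivacyLevel.UNRESTRICTED))
try:
    release("customer PII", PrivacyLevel.ENCRYPTED_COMPUTE, PrivacyLevel.REDACTED_SHARING)
except PermissionError as e:
    print(e)
```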