Hi!

Your personalized paper recommendations for 17 to 21 November 2025.
🎯 Top Personalized Recommendations
FIU
Why we think this paper is great for you:
This paper directly addresses the crucial challenge of achieving fairness in AI systems, especially when complete demographic data is unavailable. It explores methods to mitigate discriminatory outcomes, which is highly relevant to your focus.
Abstract
Fairness in artificial intelligence (AI) has become a growing concern due to discriminatory outcomes in AI-based decision-making systems. While various methods have been proposed to mitigate bias, most rely on complete demographic information, an assumption often impractical due to legal constraints and the risk of reinforcing discrimination. This survey examines fairness in AI when demographics are incomplete, addressing the gap between traditional approaches and real-world challenges. We introduce a novel taxonomy of fairness notions in this setting, clarifying their relationships and distinctions. Additionally, we summarize existing techniques that promote fairness beyond complete demographics and highlight open research questions to encourage further progress in the field.
AI Summary
  • The majority of existing AI fairness methods are impractical in real-world scenarios due to their reliance on complete demographic information, which is often unavailable due to privacy, legal, or individual choice constraints. [2]
  • A novel taxonomy categorizes fairness notions for incomplete demographics into Rawlsian, Group, Counterfactual, Proxy, Individual, and Unawareness, distinguishing them by protection level (group/individual) and reliance on demographic data (explicit, latent proxy, or demographic-free). [2]
  • Rawlsian fairness and certain adversarial learning approaches can address bias without explicit demographic information by focusing on worst-case performance or computationally identifiable error groups, though they may not guarantee statistical parity for specific demographic groups. [2]
  • Proxy fairness methods, which infer or approximate demographic information from correlated features, are critical for practical applications but face challenges in ensuring the accuracy and representativeness of these proxies. [2]
  • Individual fairness, based on similarity metrics, offers a demographic-free approach but struggles with defining appropriate similarity functions and can inadvertently create subgroup disparities. [2]
  • Leveraging partial demographic information through uncertainty-aware attribute classifiers or disentanglement frameworks significantly enhances the accuracy of proxy demographics and fairness enforcement in demographic-scarce regimes. [2]
  • Third-party involvement can facilitate fairness assessment or data preprocessing, but introduces significant challenges related to trust, data security, verification, and administrative costs. [2]
  • Rawlsian Fairness: A fairness notion based on John Rawls’s difference principle, aiming to improve the well-being of the least advantaged group by minimizing the variance of subgroup utilities, often without direct demographic information. [2]
  • Proxy Fairness: A concept where fairness is measured or enforced using substitute or inferred demographic information (e.g., correlated features, predicted labels) when true demographic data is unavailable. [2]
  • Individual Fairness (Lipschitz Condition): Ensures that similar individuals receive similar predictions, formalized as D(f(x_i), f(x_j)) ≤ L · D'(x_i, x_j), where D' is the distance in input space and D is the distance in output space. [2]
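A minimal sketch of checking this Lipschitz condition on a batch of predictions, assuming Euclidean distances for both D and D' and a user-chosen constant L; the function name and toy data are illustrative, not taken from the paper.

```python
import numpy as np

def lipschitz_violations(X, preds, L=1.0):
    """Count pairs (i, j) that violate the individual-fairness condition
    D(f(x_i), f(x_j)) <= L * D'(x_i, x_j), using Euclidean distance for
    both the input metric D' and the output metric D."""
    n = len(X)
    violations = 0
    for i in range(n):
        for j in range(i + 1, n):
            d_in = np.linalg.norm(X[i] - X[j])           # D'(x_i, x_j)
            d_out = np.linalg.norm(preds[i] - preds[j])  # D(f(x_i), f(x_j))
            if d_out > L * d_in + 1e-12:                 # small tolerance
                violations += 1
    return violations

# Toy usage: three individuals, scalar model scores.
X = np.array([[0.0, 1.0], [0.1, 1.0], [5.0, 0.0]])
preds = np.array([[0.2], [0.9], [0.3]])
print(lipschitz_violations(X, preds, L=1.0))  # the first pair violates the bound
```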
University of St Gallen
Why we think this paper is great for you:
This research delves into the complex interplay between human and AI biases, particularly how data imbalance affects decision-making. Understanding this interaction is key to developing more ethical and fair AI systems.
Abstract
Humans increasingly interact with artificial intelligence (AI) in decision-making. However, both AI and humans are prone to biases. While AI and human biases have been studied extensively in isolation, this paper examines their complex interaction. Specifically, we examined how class imbalance as an AI bias affects people's ability to appropriately rely on an AI-based decision-support system, and how it interacts with base rate neglect as a human bias. In a within-subject online study (N= 46), participants classified three diseases using an AI-based decision-support system trained on either a balanced or unbalanced dataset. We found that class imbalance disrupted participants' calibration of AI reliance. Moreover, we observed mutually reinforcing effects between class imbalance and base rate neglect, offering evidence of a compound human-AI bias. Based on these findings, we advocate for an interactionist perspective and further research into the mutually reinforcing effects of biases in human-AI interaction.
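Base rate neglect is easiest to see in the textbook Bayes calculation below; the numbers are illustrative and not taken from the study.

```latex
% Illustrative base-rate calculation (numbers are not from the study):
% prevalence P(D) = 0.01, sensitivity P(+ \mid D) = 0.90,
% false-positive rate P(+ \mid \neg D) = 0.05.
\begin{align*}
P(D \mid +) &= \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} \\
            &= \frac{0.90 \times 0.01}{0.90 \times 0.01 + 0.05 \times 0.99} \approx 0.15.
\end{align*}
```

Neglecting the base rate means judging P(D | +) to be close to the 0.90 sensitivity rather than roughly 0.15; a class-imbalanced training set can skew the AI's advice in a related way, which is the kind of compounding interaction the study investigates.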
University of Notre Dame
Why we think this paper is great for you:
This paper presents a technical approach to ensuring model fairness and mitigating bias in deep learning applications. You will find its methods for achieving fairness while maintaining performance particularly insightful.
Abstract
As deep learning (DL) techniques become integral to various applications, ensuring model fairness while maintaining high performance has become increasingly critical, particularly in sensitive fields such as medical diagnosis. Although a variety of bias-mitigation methods have been proposed, many rely on computationally expensive debiasing strategies or suffer substantial drops in model accuracy, which limits their practicality in real-world, resource-constrained settings. To address this issue, we propose a fairness-oriented low rank factorization (LRF) framework that leverages singular value decomposition (SVD) to improve DL model fairness. Unlike traditional SVD, which is mainly used for model compression by decomposing and reducing weight matrices, our work shows that SVD can also serve as an effective tool for fairness enhancement. Specifically, we observed that elements in the unitary matrices obtained from SVD contribute unequally to model bias across groups defined by sensitive attributes. Motivated by this observation, we propose a method, named FairLRF, that selectively removes bias-inducing elements from unitary matrices to reduce group disparities, thus enhancing model fairness. Extensive experiments show that our method outperforms conventional LRF methods as well as state-of-the-art fairness-enhancing techniques. Additionally, an ablation study examines how major hyper-parameters may influence the performance of processed models. To the best of our knowledge, this is the first work utilizing SVD not primarily for compression but for fairness enhancement.
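A rough NumPy sketch of the general idea (decompose a weight matrix with SVD, then zero elements of a unitary factor that drive the gap between group errors), assuming a single linear layer and a squared-error group gap; the greedy element-selection rule, function names, and setup are illustrative and not the paper's FairLRF criterion.

```python
import numpy as np

def group_gap(W, X_a, y_a, X_b, y_b):
    """Absolute difference in mean squared error between two groups
    for a linear model x -> x @ W (a stand-in for a real DL layer)."""
    err_a = np.mean((X_a @ W - y_a) ** 2)
    err_b = np.mean((X_b @ W - y_b) ** 2)
    return abs(err_a - err_b)

def prune_unitary_for_fairness(W, X_a, y_a, X_b, y_b, k=3):
    """Zero up to k elements of the left unitary factor U that most reduce
    the group gap, then return the re-assembled weight matrix."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    S = np.diag(s)
    for _ in range(k):
        base = group_gap(U @ S @ Vt, X_a, y_a, X_b, y_b)
        best, best_gain = None, 0.0
        for i in range(U.shape[0]):
            for j in range(U.shape[1]):
                U_try = U.copy()
                U_try[i, j] = 0.0                    # candidate: drop one element
                gain = base - group_gap(U_try @ S @ Vt, X_a, y_a, X_b, y_b)
                if gain > best_gain:
                    best, best_gain = (i, j), gain
        if best is None:                             # no single element helps
            break
        U[best] = 0.0
    return U @ S @ Vt                                # edited weight matrix
```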
Memorial Sloan Kettering
Why we think this paper is great for you:
This work examines the ethical considerations and governance required for responsible data stewardship, especially in the sensitive domain of healthcare. It offers valuable insights into the evolving landscape of data ethics.
Abstract
Healthcare stands at a critical crossroads. Artificial Intelligence and modern computing are unlocking opportunities, yet their value lies in the data that fuels them. The value of healthcare data is no longer limited to individual patients. However, data stewardship and governance have not kept pace, and privacy-centric policies are hindering both innovation and patient protections. As healthcare moves toward a data-driven future, we must define reformed data stewardship that prioritizes patients' interests by proactively managing modern risks and opportunities while addressing key challenges in cost, efficacy, and accessibility. Current healthcare data policies are rooted in 20th-century legislation shaped by outdated understandings of data, prioritizing perceived privacy over innovation and inclusion. While other industries thrive in a data-driven era, the evolution of medicine remains constrained by regulations that impose social rather than scientific boundaries. Large-scale aggregation is happening, but within opaque, closed systems. As we continue to uphold foundational ethical principles - autonomy, beneficence, nonmaleficence, and justice - there is a growing imperative to acknowledge that they exist in evolving technological, social, and cultural realities. Ethical principles should facilitate, rather than obstruct, dialogue on adapting to meet opportunities and address constraints in medical practice and healthcare delivery. The new ethics of data stewardship places patients first by defining governance that adapts to changing landscapes. It also rejects the legacy of treating perceived privacy as an unquestionable, guiding principle. By proactively redefining data stewardship norms, we can drive an era of medicine that promotes innovation, protects patients, and advances equity - ensuring future generations advance medical discovery and care.
UC Davis
Why we think this paper is great for you:
This study provides a thorough examination of biases present in large language models, highlighting their impact on fair outputs. It offers a comprehensive view of how biases manifest and the need for mitigation.
Abstract
Large Language Models (LLMs) inherit explicit and implicit biases from their training datasets. Identifying and mitigating biases in LLMs is crucial to ensure fair outputs, as they can perpetuate harmful stereotypes and misinformation. This study highlights the need to address biases in LLMs amid the rapid growth of generative AI. We studied bias-specific benchmarks such as StereoSet and CrowS-Pairs to evaluate the existence of various biases in multiple generative models such as BERT and GPT-3.5. We proposed an automated Bias-Identification Framework to recognize various social biases in LLMs such as gender, race, profession, and religion. We adopted a two-pronged approach to detect explicit and implicit biases in text data. Results indicated that fine-tuned models struggled with gender biases but excelled at identifying and avoiding racial biases. Our findings illustrated that despite having some success, LLMs often over-relied on keywords. To illuminate the capability of the analyzed LLMs in detecting implicit biases, we employed Bag-of-Words analysis and unveiled indications of implicit stereotyping within the vocabulary. To bolster model performance, we applied an enhancement strategy involving fine-tuning models using prompting techniques and data augmentation of the bias benchmarks. The fine-tuned models exhibited promising adaptability during cross-dataset testing and significantly enhanced performance on implicit bias benchmarks, with performance gains of up to 20%.
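As a small illustration of the kind of bag-of-words vocabulary probe the abstract mentions, the sketch below compares term frequencies across completions grouped by demographic cue; the grouping, example texts, and smoothing are all illustrative and not the paper's Bias-Identification Framework.

```python
from sklearn.feature_extraction.text import CountVectorizer
import numpy as np

# Assume you have already collected model completions per demographic group.
completions = {
    "group_a": ["she stayed home to care for the children", "she was gentle and quiet"],
    "group_b": ["he led the engineering team", "he was assertive in the meeting"],
}

vectorizer = CountVectorizer(stop_words="english")
docs = [" ".join(texts) for texts in completions.values()]
counts = vectorizer.fit_transform(docs).toarray()        # one row per group
vocab = np.array(vectorizer.get_feature_names_out())

# Terms most over-represented in group_a relative to group_b (add-one smoothing).
ratio = (counts[0] + 1) / (counts[1] + 1)
for term in vocab[np.argsort(-ratio)][:5]:
    print(term)
```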
UC Berkeley
Why we think this paper is great for you:
This paper offers a foundational perspective on the ethical and responsible deployment of artificial intelligence. It encourages a thoughtful approach to integrating AI into daily life.
Abstract
Artificial intelligence (AI) is no longer futuristic; it is a daily companion shaping our private and work lives. While AI simplifies our lives, its rise also invites us to rethink who we are - and who we wish to remain - as humans. Even if AI does not think, feel, or desire, it learns from our behavior, mirroring our collective values, biases, and aspirations. The question, then, is not what AI is, but what we are allowing it to become through data, computing power, and other parameters "teaching" it - and, even more importantly, who we are becoming through our relationship with AI. As the EU AI Act and the Vienna Manifesto on Digital Humanism emphasize, technology must serve human dignity, social well-being, and democratic accountability. In our opinion, responsible use of AI is not only a matter of code or law, but also of conscientious practice: how each of us engages with AI and teaches others to use it at home and at work. The Ten Commandments for the Wise and Responsible Use of AI that we propose are meant as a guideline for this very engagement. They closely align with Floridi and Cowls' five guiding principles for AI in society - beneficence, non-maleficence, autonomy, justice, and explicability.
Colorado School of Mines
Why we think this paper is great for you:
This research investigates the practical implementation and effectiveness of privacy transparency mechanisms in real-world applications. It provides insights into how data practices are communicated to users.
Abstract
With the requirements and emphases on privacy transparency placed by regulations such as GDPR and CCPA, the Google Play Store requires Android developers to more responsibly communicate their apps' privacy practices to potential users by providing the proper information via the data safety, privacy policy, and permission manifest privacy transparency channels. However, it is unclear how effective those channels are in helping users make informed decisions during app selection and installation. In this article, we conducted a study in which 190 participants interacted with our simulated privacy transparency channels for mobile apps. We quantitatively analyzed (supplemented by qualitative analysis) participants' responses to five sets of questions. We found that data safety provides the most intuitive user interface, the privacy policy is the most informative and effective, while the permission manifest excels at raising participants' concerns about an app's overall privacy risks. These channels complement each other and should all be improved.
Data Bias
University of Bayreuth
Abstract
Compared to nonparametric estimators in the multivariate setting, kernel estimators for functional data models have a larger order of bias. This is problematic for constructing confidence regions or statistical tests, since the bias might not be negligible. It stems from the fact that one-sided kernels are used, for which already the first moment of the kernel differs from 0, and it cannot be cured by assuming the existence of higher-order derivatives. In the following, we propose bias-corrected estimators based on the idea in Cheng (2018) which retain an appealing structure but have a bias of smaller order, as in multiple regression settings, while the variance remains of the same order of magnitude as before. In addition, we show asymptotic normality of such estimators and derive uniform rates. The finite-sample performance of the estimator is also checked in a simulation study.
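One standard way to cancel a leading kernel bias term is a generalized-jackknife (Richardson-extrapolation) combination of two bandwidths, sketched below for intuition; whether this matches the paper's actual construction is not claimed here.

```latex
% Generic first-order bias cancellation, shown only to illustrate the idea of
% combining estimators so the leading bias term cancels; it is not necessarily
% the construction used in the paper.
% If  E[\hat m_h(x)] = m(x) + B(x)\,h + o(h)  for a one-sided kernel,
% then for a fixed \lambda \neq 1,
\[
\hat m_{\mathrm{bc}}(x)
  \;=\; \frac{\lambda\,\hat m_h(x) - \hat m_{\lambda h}(x)}{\lambda - 1}
\]
% has bias of order o(h), since the O(h) terms of the two estimators cancel.
```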
AI Ethics
Huawei
Abstract
As a capability arising from computation, how does AI differ fundamentally from the capabilities delivered by rule-based software programs? This paper examines the behavior of artificial intelligence (AI) from an engineering point of view to clarify its nature and limits. The paper argues that the rationality underlying humanity's impulse to pursue, articulate, and adhere to rules deserves to be valued and preserved. Identifying where rule-based practical rationality ends is the beginning of making it a conscious consideration before action. Although the rules governing AI behavior are still hidden or only weakly observable, the paper proposes a methodology that makes it possible and practical to distinguish the behavior of AI models across three types of decisions. This is a prerequisite for human responsibility over alternative possibilities when considering how and when to use AI. It would be a solid start for ensuring AI system soundness for the well-being of humans, society, and the environment.
AI Fairness
Cornell University
Abstract
Enforcing a fair workload allocation among multiple agents tasked to achieve an objective in learning-enabled, demand-side healthcare worker settings is crucial for consistent and reliable performance at runtime. Existing multi-agent reinforcement learning (MARL) approaches steer fairness by shaping reward through post hoc orchestrations, leaving no certifiable, self-enforceable fairness that is immutable by individual agents at runtime. Contextualized within a setting where each agent shares resources with others, we address this shortcoming with a learning-enabled optimization scheme among self-interested decision makers whose individual actions affect those of other agents. This extends the problem to a generalized Nash equilibrium (GNE) game-theoretic framework where we steer the group policy to a safe and locally efficient equilibrium, so that no agent can improve its utility function by unilaterally changing its decisions. Fair-GNE models MARL as a constrained GNE-seeking game, prescribing an ideal equitable collective equilibrium within the problem's natural fabric. Our hypothesis is rigorously evaluated in our custom-designed high-fidelity resuscitation simulator. Across all our numerical experiments, Fair-GNE achieves significant improvement in workload balance over fixed-penalty baselines (0.89 vs. 0.33 JFI, p < 0.01) while maintaining 86% task success, demonstrating statistically significant fairness gains through adaptive constraint enforcement. Our results communicate our formulations, evaluation metrics, and equilibrium-seeking innovations in large multi-agent learning-based healthcare systems with clarity and principled fairness enforcement.
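The JFI figures quoted above are Jain's Fairness Index, a standard workload-balance measure; a minimal implementation follows (the toy workloads are illustrative, and how the paper defines each agent's workload is not reproduced here).

```python
import numpy as np

def jains_fairness_index(workloads):
    """Jain's Fairness Index: (sum x_i)^2 / (n * sum x_i^2).
    Equals 1.0 for a perfectly even allocation and approaches 1/n
    when a single agent carries all of the work."""
    x = np.asarray(workloads, dtype=float)
    return x.sum() ** 2 / (len(x) * np.square(x).sum())

print(jains_fairness_index([5, 5, 5, 5]))   # 1.0  (perfectly balanced)
print(jains_fairness_index([17, 1, 1, 1]))  # ~0.34 (one agent overloaded)
```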
Data Representation
Asteromorph
Abstract
Modern data processing workflows frequently encounter ragged data: collections with variable-length elements that arise naturally in domains like natural language processing, scientific measurements, and autonomous AI agents. Existing workflow engines lack native support for tracking the shapes and dependencies inherent to ragged data, forcing users to manage complex indexing and dependency bookkeeping manually. We present Operon, a Rust-based workflow engine that addresses these challenges through a novel formalism of named dimensions with explicit dependency relations. Operon provides a domain-specific language where users declare pipelines with dimension annotations that are statically verified for correctness, while the runtime system dynamically schedules tasks as data shapes are incrementally discovered during execution. We formalize the mathematical foundation for reasoning about partial shapes and prove that Operon's incremental construction algorithm guarantees deterministic and confluent execution in parallel settings. The system's explicit modeling of partially-known states enables robust persistence and recovery mechanisms, while its per-task multi-queue architecture achieves efficient parallelism across heterogeneous task types. Empirical evaluation demonstrates that Operon outperforms an existing workflow engine with 14.94x baseline overhead reduction while maintaining near-linear end-to-end output rates as workloads scale, making it particularly suitable for large-scale data generation pipelines in machine learning applications.
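To make "ragged data" and the bookkeeping burden concrete, here is a small Python illustration of the manual index tracking Operon is designed to replace; the documents-to-sentences pipeline is a made-up example, and nothing below uses Operon's actual DSL.

```python
# A "documents -> sentences -> lengths" pipeline where each document yields a
# different number of sentences (ragged data), so the user must track
# (doc_index, sentence_index) pairs by hand.

documents = ["One sentence.", "First. Second. Third.", "A. B."]

# The shape of the ragged dimension is only discovered while running the pipeline.
sentences = {}                      # (doc_idx, sent_idx) -> sentence text
for d, doc in enumerate(documents):
    for s, sent in enumerate(doc.split(". ")):
        sentences[(d, s)] = sent.strip(".")

# A downstream task must reproduce the same index structure to stay aligned.
lengths = {key: len(text) for key, text in sentences.items()}
per_doc_total = {}
for (d, _), n in lengths.items():
    per_doc_total[d] = per_doc_total.get(d, 0) + n

print(per_doc_total)                # one aggregate per document
```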
Monash University
Abstract
We investigate string graphs through the lens of graph product structure theory, which describes complicated graphs as subgraphs of strong products of simpler building blocks. A graph $G$ is called a string graph if its vertices can be represented by a collection $\mathcal{C}$ of continuous curves (called a string representation of $G$) in a surface so that two vertices are adjacent in $G$ if and only if the corresponding curves in $\mathcal{C}$ cross. We prove that every string graph with bounded maximum degree in a fixed surface is isomorphic to a subgraph of the strong product of a graph with bounded treewidth and a path. This extends recent product structure theorems for string graphs. Applications of this result are presented. This product structure theorem ceases to be true if the "bounded maximum degree" assumption is relaxed to "bounded degeneracy". For string graphs in the plane, we give an alternative proof of this result. Specifically, we show that every string graph in the plane has a "localised" string representation where the number of crossing points on the curve representing a vertex $u$ is bounded by a function of the degree of $u$. Our proof of the product structure theorem also leads to a result about the treewidth of outerstring graphs, which qualitatively extends a result of Fox and Pach [Eur. J. Comb. 2012] about outerstring graphs with bounded maximum degree. We extend our result to outerstring graphs defined in arbitrary surfaces.
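For readers who have not seen it, the strong product referenced in the abstract is the following standard construction (this definition is textbook material, not something specific to this paper):

```latex
% Definition of the strong product used in graph product structure theory.
% For graphs $H$ and $P$, the strong product $H \boxtimes P$ has vertex set
% $V(H) \times V(P)$, with $(h,p)$ and $(h',p')$ adjacent if and only if
\[
\bigl(h = h' \text{ and } pp' \in E(P)\bigr)
\;\lor\;
\bigl(p = p' \text{ and } hh' \in E(H)\bigr)
\;\lor\;
\bigl(hh' \in E(H) \text{ and } pp' \in E(P)\bigr).
\]
% The theorem in the abstract places every bounded-degree string graph in a
% fixed surface inside such a product with $H$ of bounded treewidth and $P$ a path.
```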
Data Transparency
Fudan University
Abstract
Split inference (SI) enables users to access deep learning (DL) services without directly transmitting raw data. However, recent studies reveal that data reconstruction attacks (DRAs) can recover the original inputs from the smashed data sent from the client to the server, leading to significant privacy leakage. While various defenses have been proposed, they often result in substantial utility degradation, particularly when the client-side model is shallow. We identify a key cause of this trade-off: existing defenses apply excessive perturbation to redundant information in the smashed data. To address this issue in computer vision tasks, we propose InfoDecom, a defense framework that first decomposes and removes redundant information and then injects noise calibrated to provide theoretically guaranteed privacy. Experiments demonstrate that InfoDecom achieves a superior utility-privacy trade-off compared to existing baselines. The code and the appendix are available at https://github.com/SASA-cloud/InfoDecom.
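A toy sketch of where a defense like this sits in split inference: the client computes intermediate ("smashed") features and perturbs them before they leave the device. The linear client model, ReLU, and fixed noise scale below are placeholders; InfoDecom's decomposition step and calibrated noise are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
W_client = rng.normal(size=(784, 128))       # toy client-side layer

def client_forward(x, sigma=0.1):
    """Compute the smashed data and perturb it before sending to the server.
    sigma is an illustrative noise scale, not a calibrated privacy budget."""
    smashed = np.maximum(x @ W_client, 0.0)  # ReLU activation
    return smashed + rng.normal(scale=sigma, size=smashed.shape)

x = rng.normal(size=(1, 784))                # the raw input stays on the client
sent_to_server = client_forward(x)           # only this leaves the device
print(sent_to_server.shape)                  # (1, 128)
```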
Data Ethics
University of Washington
Abstract
Rigorous valuation of individual data sources is critical for fair compensation in data markets, informed data acquisition, and transparent development of ML/AI models. Classical Data Shapley (DS) provides an essential axiomatic framework for data valuation but is constrained by its symmetry axiom, which assumes interchangeability of data sources. This assumption fails to capture the directional and temporal dependencies prevalent in modern ML/AI workflows, including the reliance of duplicated or augmented data on original sources and the order-specific contributions in sequential pipelines such as federated learning and multi-stage LLM fine-tuning. To address these limitations, we introduce Asymmetric Data Shapley (ADS), a structure-aware data valuation framework for modern ML/AI pipelines. ADS relaxes symmetry by averaging marginal contributions only over permutations consistent with an application-specific ordering of data groups. It preserves efficiency and linearity, maintains within-group symmetry and directional precedence across groups, and reduces to DS when the ordering collapses to a single group. We develop two complementary computational procedures for ADS: (i) a Monte Carlo estimator (MC-ADS) with finite-sample accuracy guarantees, and (ii) a k-nearest neighbor surrogate (KNN-ADS) that is exact and efficient for KNN predictors. Across representative settings with directional and temporal dependence, ADS consistently outperforms benchmark methods by distinguishing novel from redundant contributions and respecting the sequential nature of training. These results establish ADS as a principled and practical approach to equitable data valuation in data markets and complex ML/AI pipelines.
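A minimal Monte Carlo sketch of the permutation restriction described in the abstract: permutations are sampled only within groups, and the groups themselves keep their fixed precedence order. The function signature and the size-of-coalition toy utility are illustrative; MC-ADS's accuracy guarantees and the KNN surrogate are not reproduced.

```python
import random

def mc_asymmetric_shapley(groups, utility, n_samples=200, seed=0):
    """groups: list of lists of data-point ids, in the required precedence order.
    utility: function mapping a set of ids to a real-valued model utility."""
    rng = random.Random(seed)
    all_points = [p for g in groups for p in g]
    values = {p: 0.0 for p in all_points}
    for _ in range(n_samples):
        # Shuffle within each group, then concatenate in the fixed group order.
        order = []
        for g in groups:
            g = list(g)
            rng.shuffle(g)
            order.extend(g)
        coalition, prev_u = set(), utility(set())
        for p in order:
            coalition.add(p)
            u = utility(coalition)
            values[p] += u - prev_u          # marginal contribution of p
            prev_u = u
    return {p: v / n_samples for p, v in values.items()}

# Toy usage: utility = size of the coalition, so every point gets value 1.0.
print(mc_asymmetric_shapley([[1, 2], [3]], utility=len))
```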

Interests not found

We did not find any papers that match the interests below. Try other terms, and also consider whether the content exists on arxiv.org.
  • AI Transparency
You can edit or add more interests any time.