Hi!

Your personalized paper recommendations for 10 to 14 November 2025.
AI Energy Consumption
Abstract
Large language model (LLM) queries are predominantly processed by frontier models in centralized cloud infrastructure. Rapidly growing demand strains this paradigm, and cloud providers struggle to scale infrastructure at pace. Two advances enable us to rethink this paradigm: small LMs (≤20B active parameters) now achieve competitive performance to frontier models on many tasks, and local accelerators (e.g., Apple M4 Max) run these models at interactive latencies. This raises the question: can local inference viably redistribute demand from centralized infrastructure? Answering this requires measuring whether local LMs can accurately answer real-world queries and whether they can do so efficiently enough to be practical on power-constrained devices (i.e., laptops). We propose intelligence per watt (IPW), task accuracy divided by unit of power, as a metric for assessing capability and efficiency of local inference across model-accelerator pairs. We conduct a large-scale empirical study across 20+ state-of-the-art local LMs, 8 accelerators, and a representative subset of LLM traffic: 1M real-world single-turn chat and reasoning queries. For each query, we measure accuracy, energy, latency, and power. Our analysis reveals 3 findings. First, local LMs can accurately answer 88.7% of single-turn chat and reasoning queries with accuracy varying by domain. Second, from 2023-2025, IPW improved 5.3x and local query coverage rose from 23.2% to 71.3%. Third, local accelerators achieve at least 1.4x lower IPW than cloud accelerators running identical models, revealing significant headroom for optimization. These findings demonstrate that local inference can meaningfully redistribute demand from centralized infrastructure, with IPW serving as the critical metric for tracking this transition. We release our IPW profiling harness for systematic intelligence-per-watt benchmarking.
AI Summary
  • Intelligence per watt (IPW) for local inference improved 5.3× from 2023-2025, driven by 3.1× algorithmic advances and 1.7× accelerator improvements, increasing locally-serviceable query coverage from 23.2% to 71.3%. [3]
  • Local accelerators currently exhibit 1.4× to 1.78× lower intelligence per watt and 1.6× to 7.4× lower intelligence per joule compared to cloud accelerators for identical workloads, indicating significant headroom for on-device optimization. [3]
  • Rapid expansion of local accelerator memory capacity (e.g., Apple Silicon's 128-512 GB unified memory) has been a primary enabler for deploying increasingly capable 8-20B parameter models locally. [3]
  • Intelligence per watt (IPW): A unified metric defined as task accuracy divided by unit of power consumption, assessing both the capability and efficiency of local inference. [3]
  • Local LMs can successfully answer 88.7% of single-turn chat and reasoning queries, with domain-specific accuracy ranging from over 90% for creative tasks to 68% for technical fields. [2]
  • Intelligent routing of queries between local and cloud LMs can achieve 60-80% reductions in energy, compute, and cost compared to cloud-only deployment, even with realistic (80%) routing accuracy. [2]
  • Model diversity significantly boosts local inference coverage, with routing to the best local LM achieving 88.7% overall query coverage, a 16.3 percentage point improvement over the best individual local model. [2]
  • The paper introduces and releases a hardware-agnostic IPW profiling harness to facilitate reproducible efficiency benchmarking for evolving local LMs and accelerators. [2]
  • Local LMs: Small language models with ≤20B active parameters, capable of running on local accelerators. [2]
  • Frontier LMs: Large language models with ≥100B parameters, typically deployed in centralized cloud infrastructure. [2]
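As a concrete reading of the metric defined in the abstract, here is a minimal Python sketch of intelligence per watt (accuracy divided by average power), with the per-joule variant mentioned in the summary bullets. The per-query fields, function names, and normalization choices are assumptions for illustration; the paper's released harness may aggregate measurements differently.

```python
# Minimal sketch of IPW/IPJ from per-query measurements, assuming the abstract's
# definition (accuracy divided by power). Field and function names are hypothetical.
from dataclasses import dataclass

@dataclass
class QueryResult:
    correct: bool      # did the local LM answer the query correctly?
    energy_j: float    # energy consumed for the query, in joules
    latency_s: float   # wall-clock time for the query, in seconds

def intelligence_per_watt(results: list) -> float:
    accuracy = sum(r.correct for r in results) / len(results)
    avg_power_w = sum(r.energy_j for r in results) / sum(r.latency_s for r in results)
    return accuracy / avg_power_w

def intelligence_per_joule(results: list) -> float:
    accuracy = sum(r.correct for r in results) / len(results)
    avg_energy_j = sum(r.energy_j for r in results) / len(results)
    return accuracy / avg_energy_j

# Example: three measured queries on a hypothetical model-accelerator pair.
runs = [QueryResult(True, 180.0, 6.0), QueryResult(True, 240.0, 8.0), QueryResult(False, 150.0, 5.0)]
print(f"IPW: {intelligence_per_watt(runs):.4f}  IPJ: {intelligence_per_joule(runs):.6f}")
```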
University of Mississippi
Abstract
Conventional wisdom holds that a misaligned artificial superintelligence (ASI) will destroy humanity. But the problem of constraining a powerful agent is not new. I apply classic economic logic of interjurisdictional competition, all-encompassing interest, and trading on credit to the threat of misaligned ASI. Using a simple model, I show that an acquisitive ASI refrains from full predation under surprisingly weak conditions. When humans can flee to rivals, inter-ASI competition creates a market that tempers predation. When trapped by a monopolist ASI, its "encompassing interest" in humanity's output makes it a rational autocrat rather than a ravager. And when the ASI has no long-term stake, our ability to withhold future output incentivizes it to trade on credit rather than steal. In each extension, humanity's welfare progressively worsens. But each case suggests that catastrophe is not a foregone conclusion. The dismal science, ironically, offers an optimistic take on our superintelligent future.
AI on Education
Prague Business School
Abstract
This paper focuses on assessing the potential of micro-credentials in digitalized higher education in the era of artificial intelligence. Micro-credentials are commonly described as mini qualifications or digital badges that certify an individual's competency in a specific skill or cluster of skills. Differing from traditional academic credentials in both granularity and focus, micro-credentials represent a new and exciting topic that is rapidly gaining popularity around the world, both within higher education institutions, such as public and private universities, and within other education providers, such as non-governmental organizations and SMEs. The popularity of micro-credentials has been further enhanced by the recent COVID-19 pandemic, which caused a digital surge in higher education, leading to many rapid technological innovations and changes that were unthinkable before. The rising interest in micro-credentialing is best demonstrated by the increase in the number of scientific publications on this topic from just 1 in 1992 to 165 in 2024. The paper employs a comprehensive bibliometric network analysis of the term "micro-credentials" based on a sample of 608 selected publications indexed in the Scopus database. It carries out the network cluster analysis using both text data and bibliometric data with the help of VOSviewer software. The results and outcomes of this research might be helpful for researchers, stakeholders, and policymakers in devising effective strategies and policies for transforming future digitalized, AI-driven higher education.
Freie Universität Berlin
Abstract
This study introduces 'Malinowski's Lens', the first AI-native educational game for anthropology that transforms Bronislaw Malinowski's 'Argonauts of the Western Pacific' (1922) into an interactive learning experience. The system combines Retrieval-Augmented Generation with DALL-E 3 text-to-image generation, creating consistent VGA-style visuals as players embody Malinowski during his Trobriand Islands fieldwork (1915-1918). To address ethical concerns, indigenous peoples appear as silhouettes while Malinowski is detailed, prompting reflection on anthropological representation. Two validation studies confirmed effectiveness: Study 1 with 10 non-specialists showed strong learning outcomes (average quiz score 7.5/10) and excellent usability (SUS: 83/100). Study 2 with 4 expert anthropologists confirmed pedagogical value, with one senior researcher discovering "new aspects" of Malinowski's work through gameplay. The findings demonstrate that AI-driven educational games can effectively convey complex anthropological concepts while sparking disciplinary curiosity. This study advances AI-native educational game design and provides a replicable model for transforming academic texts into engaging interactive experiences.
AI for Social Justice
University at Buffalo
Abstract
Scholars investigating ethical AI, especially in high stakes settings like child welfare, have arguably been seeking ways to embed notions of justice into the design of these critical technologies. These efforts often operationalize justice at the upper and lower bounds of its continuum, defining it in terms of progressiveness or reform. Before characterizing the type of justice an AI tool should have baked in, we argue for a systematic discovery of how justice is executed by the recipient system: a method the Value Sensitive Design (VSD) framework terms Value Source analysis. The present work asks: how is justice operationalized within current child welfare administrative policy and what does it teach us about how to develop AI? We conduct a mixed-methods analysis of child welfare policy in the state of New York and find a range of functional definitions of justice (which we term principles). These principles reflect more nuanced understandings of justice across a spectrum of contexts: from established concepts like fairness and equity to less common foci like the proprietary rights of parents and children. Our work contributes to a deeper understanding of the interplay between AI and policy, highlighting the importance of operationalized values in adjudicating our development of ethical design requirements for high stakes decision settings.
YUX Design
Abstract
Frontier LLMs are optimised around high-resource assumptions about language, knowledge, devices, and connectivity. Whilst widely accessible, they often misfit conditions in the Global South. As a result, users must often perform additional work to make these systems usable. We term this alignment debt: the user-side burden that arises when AI systems fail to align with cultural, linguistic, infrastructural, or epistemic contexts. We develop and validate a four-part taxonomy of alignment debt through a survey of 411 AI users in Kenya and Nigeria. Among respondents measurable on this taxonomy (n = 385), prevalence is: Cultural and Linguistic (51.9%), Infrastructural (43.1%), Epistemic (33.8%), and Interaction (14.0%). Country comparisons show a divergence in Infrastructural and Interaction debt, challenging one-size-fits-Africa assumptions. Alignment debt is associated with compensatory labour, but responses vary by debt type: users facing Epistemic challenges verify outputs at significantly higher rates (91.5% vs. 80.8%; p = 0.037), and verification intensity correlates with cumulative debt burden (Spearman's rho = 0.147, p = 0.004). In contrast, Infrastructural and Interaction debts show weak or null associations with verification, indicating that some forms of misalignment cannot be resolved through verification alone. These findings show that fairness must be judged not only by model metrics but also by the burden imposed on users at the margins, compelling context-aware safeguards that alleviate alignment debt in Global South settings. The alignment debt framework provides an empirically grounded way to measure user burden, informing both design practice and emerging African AI governance efforts.
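For readers unfamiliar with the statistic, the correlation reported above (Spearman's rho = 0.147, p = 0.004) can be reproduced in form, though not in substance, on synthetic data. The sketch below uses scipy and invented variables, not the survey responses.

```python
# Hypothetical illustration (synthetic data, not the survey): relating cumulative
# alignment-debt burden to verification intensity via Spearman's rho.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n = 385                                                 # respondents measurable on the taxonomy
debt_count = rng.integers(0, 5, size=n)                 # 0-4 debt types reported per user (assumed)
verification = 0.2 * debt_count + rng.normal(0, 1, n)   # assumed weak positive relation

rho, p = spearmanr(debt_count, verification)
print(f"Spearman's rho = {rho:.3f} (p = {p:.4f})")      # the paper reports rho = 0.147, p = 0.004
```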
AI on Energy
Hochschule München
Abstract
This work elaborates the fundamental principles of physical artificial intelligence (Physical AI) from a scientific and systemic perspective. The aim is to create a theoretical foundation that describes the physical embodiment, sensory perception, ability to act, learning processes, and context sensitivity of intelligent systems within a coherent framework. While classical AI approaches rely on symbolic processing and data-driven models, Physical AI understands intelligence as an emergent phenomenon of real interaction between body, environment, and experience. The six fundamentals presented here are embodiment, sensory perception, motor action, learning, autonomy, and context sensitivity; together they form the conceptual basis for designing and evaluating physically intelligent systems. Theoretically, it is shown that these six principles do not represent loose functional modules but rather act as a closed control loop in which energy, information, control, and context are in constant interaction. This circular interaction enables a system to generate meaning not from databases but from physical experience, a paradigm shift that understands intelligence as a physically embodied process. Physical AI understands learning not as parameter adjustment, but as a change in the structural coupling between agents and the environment. To illustrate this, the theoretical model is explained using a practical scenario: an adaptive assistant robot supports patients in a rehabilitation clinic. This example illustrates that physical intelligence does not arise from abstract calculation, but from immediate, embodied experience. It shows how the six fundamentals interact in a real system: embodiment as a prerequisite, perception as input, movement as expression, learning as adaptation, autonomy as regulation, and context as orientation.
AI on Healthcare
UIUC
Abstract
Limited English proficiency (LEP) patients in the U.S. face systemic barriers to healthcare beyond language and interpreter access, encompassing procedural and institutional constraints. AI advances may support communication and care through on-demand translation and visit preparation, but also risk exacerbating existing inequalities. We conducted storyboard-driven interviews with 14 patient navigators to explore how AI could shape care experiences for Spanish-speaking LEP individuals. We identified tensions around linguistic and cultural misunderstandings, privacy concerns, and opportunities and risks for AI to augment care workflows. Participants highlighted structural factors that can undermine trust in AI systems, including sensitive information disclosure, unstable technology access, and low digital literacy. While AI tools can potentially alleviate social barriers and institutional constraints, there are risks of misinformation and uprooting human camaraderie. Our findings contribute design considerations for AI that support LEP patients and care teams via rapport-building, education, and language support, and minimizing disruptions to existing practices.
HK PolyU
Abstract
Artificial intelligence has shown promise in medical imaging, yet most existing systems lack flexibility, interpretability, and adaptability - challenges especially pronounced in ophthalmology, where diverse imaging modalities are essential. We present EyeAgent, the first agentic AI framework for comprehensive and interpretable clinical decision support in ophthalmology. Using a large language model (DeepSeek-V3) as its central reasoning engine, EyeAgent interprets user queries and dynamically orchestrates 53 validated ophthalmic tools across 23 imaging modalities for diverse tasks including classification, segmentation, detection, image/report generation, and quantitative analysis. Stepwise ablation analysis demonstrated a progressive improvement in diagnostic accuracy, rising from a baseline of 69.71% (using only 5 general tools) to 80.79% when the full suite of 53 specialized tools was integrated. In an expert rating study on 200 real-world clinical cases, EyeAgent achieved 93.7% tool selection accuracy and received expert ratings of more than 88% across accuracy, completeness, safety, reasoning, and interpretability. In human-AI collaboration, EyeAgent matched or exceeded the performance of senior ophthalmologists and, when used as an assistant, improved overall diagnostic accuracy by 18.51% and report quality scores by 19%, with the greatest benefit observed among junior ophthalmologists. These findings establish EyeAgent as a scalable and trustworthy AI framework for ophthalmology and provide a blueprint for modular, multimodal, and clinically aligned next-generation AI systems.
AI for Social Equality
KIT
Abstract
Collaboration with artificial intelligence (AI) has improved human decision-making across various domains by leveraging the complementary capabilities of humans and AI. Yet, humans systematically overrely on AI advice, even when their independent judgment would yield superior outcomes, fundamentally undermining the potential of human-AI complementarity. Building on prior work, we identify prevailing incentive structures in human-AI decision-making as a structural driver of this overreliance. To address this misalignment, we propose an alternative incentive mechanism designed to counteract systemic overreliance. We empirically evaluate this approach through a behavioral experiment with 180 participants, finding that the proposed mechanism significantly reduces overreliance. We also show that while appropriately designed incentives can enhance collaboration and decision quality, poorly designed incentives may distort behavior, introduce unintended consequences, and ultimately degrade performance. These findings underscore the importance of aligning incentives with task context and human-AI complementarities, and suggest that effective collaboration requires a shift toward context-sensitive incentive design.
AI on Labor Market
Dartmouth College
Abstract
Large language models (LLMs) like ChatGPT have significantly lowered the cost of producing written content. This paper studies how LLMs, through lowering writing costs, disrupt markets that traditionally relied on writing as a costly signal of quality (e.g., job applications, college essays). Using data from Freelancer.com, a major digital labor platform, we explore the effects of LLMs' disruption of labor market signaling on equilibrium market outcomes. We develop a novel LLM-based measure to quantify the extent to which an application is tailored to a given job posting. Taking the measure to the data, we find that employers have a high willingness to pay for workers with more customized applications in the period before LLMs are introduced, but not after. To isolate and quantify the effect of LLMs' disruption of signaling on equilibrium outcomes, we develop and estimate a structural model of labor market signaling, in which workers invest costly effort to produce noisy signals that predict their ability in equilibrium. We use the estimated model to simulate a counterfactual equilibrium in which LLMs render written applications useless in signaling workers' ability. Without costly signaling, employers are less able to identify high-ability workers, causing the market to become significantly less meritocratic: compared to the pre-LLM equilibrium, workers in the top quintile of the ability distribution are hired 19% less often, workers in the bottom quintile are hired 14% more often.
AI Water Consumption
University of Potsdam
Abstract
Across the Artificial Intelligence (AI) lifecycle - from hardware to development, deployment, and reuse - burdens span energy, carbon, water, and embodied impacts. Cloud provider tools improve transparency but remain heterogeneous and often omit water and value-chain effects, limiting comparability and reproducibility. Addressing these multi-dimensional burdens requires a lifecycle approach linking phase-explicit mapping with system levers (hardware, placement, energy mix, cooling, scheduling) and calibrated measurement across facility, system, device, and workload levels. This article (i) establishes a unified, operational definition of Green AI distinct from Sustainable AI; (ii) formalizes a five-phase lifecycle mapped to Life Cycle Assessment (LCA) stages, making energy, carbon, water, and embodied impacts first-class; (iii) specifies governance via Plan-Do-Check-Act (PDCA) cycles with decision gateways; (iv) systematizes hardware- and system-level strategies across the edge-cloud continuum to reduce embodied burdens; and (v) defines a calibrated measurement framework combining estimator models with direct metering to enable reproducible, provider-agnostic comparisons. Combining definition, lifecycle processes, hardware strategies, and calibrated measurement, this article offers actionable, evidence-based guidance for researchers, practitioners, and policymakers.
AI on Air
Princeton University
Abstract
We present the design and application of a general algorithm for Prediction And Control using MAchiNe learning (PACMAN) in DIII-D. Machine learning (ML)-based predictors and controllers have shown great promise in achieving regimes in which traditional controllers fail, such as tearing mode free scenarios, ELM-free scenarios and stable advanced tokamak conditions. The architecture presented here was deployed on DIII-D to facilitate the end-to-end implementation of advanced control experiments, from diagnostic processing to final actuation commands. This paper describes the detailed design of the algorithm and explains the motivation behind each design point. We also describe several successful ML control experiments in DIII-D using this algorithm, including a reinforcement learning controller targeting advanced non-inductive plasmas, a wide-pedestal quiescent H-mode ELM predictor, an Alfvén Eigenmode controller, a Model Predictive Control plasma profile controller and a state-machine Tearing Mode predictor-controller. There is also discussion on guiding principles for real-time machine learning controller design and implementation.
AI for Social Good
Abstract
The emergence of crowdsourced data has significantly reshaped social science, enabling extensive exploration of collective human actions, viewpoints, and societal dynamics. However, ensuring safe, fair, and reliable participation remains a persistent challenge. Traditional polling methods have seen a notable decline in engagement over recent decades, raising concerns about the credibility of collected data. Meanwhile, social and peer-to-peer networks have become increasingly widespread, but data from these platforms can suffer from credibility issues due to fraudulent or ineligible participation. In this paper, we explore how social interactions can help restore credibility in crowdsourced data collected over social networks. We present an empirical study to detect ineligible participation in a polling task through AI-based graph analysis of social interactions among imperfect participants composed of honest and dishonest actors. Our approach focuses solely on the structure of social interaction graphs, without relying on the content being shared. We simulate different levels and types of dishonest behavior among participants who attempt to propagate the task within their social networks. We conduct experiments on real-world social network datasets, using different eligibility criteria and modeling diverse participation patterns. Although structural differences in social interaction graphs introduce some performance variability, our study achieves promising results in detecting ineligibility across diverse social and behavioral profiles, with accuracy exceeding 90% in some configurations.
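To make the "structure only, no content" idea concrete, here is a hedged Python sketch: participants are nodes in a synthetic interaction graph, ineligible actors are wired to interact mostly with each other, and a classifier is trained on purely structural features. The graph model, features, and labels are stand-ins, not the paper's datasets or method.

```python
# Hypothetical sketch of structure-only ineligibility detection on a synthetic graph.
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
G = nx.barabasi_albert_graph(500, 3, seed=42)          # synthetic social interaction graph

# Synthetic ground truth: 20% of nodes are "ineligible" and are rewired to interact
# mostly with one another, leaving a structural footprint (no content is used).
ineligible = set(int(v) for v in rng.choice(G.number_of_nodes(), size=100, replace=False))
for u in ineligible:
    for v in rng.choice(list(ineligible), size=3, replace=False):
        if u != int(v):
            G.add_edge(u, int(v))

clustering = nx.clustering(G)
features = np.array([[G.degree(n), clustering[n]] for n in G.nodes()])
labels = np.array([1 if n in ineligible else 0 for n in G.nodes()])

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.3, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("structure-only detection accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```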
AI on Transportation
Virtual Vehicle Research
Abstract
Conventional road-situation detection methods achieve strong performance in predefined scenarios but fail in unseen cases and lack semantic interpretation, which is crucial for reliable traffic recommendations. This work introduces a multi-agent AI framework that combines multimodal large language models (MLLMs) with vision-based perception for road-situation monitoring. The framework processes camera feeds and coordinates dedicated agents for situation detection, distance estimation, decision-making, and Cooperative Intelligent Transport System (C-ITS) message generation. Evaluation is conducted on a custom dataset of 103 images extracted from 20 videos of the TAD dataset. Both Gemini-2.0-Flash and Gemini-2.5-Flash were evaluated. The results show 100% recall in situation detection and perfect message schema correctness; however, both models suffer from false-positive detections and have reduced performance in terms of number of lanes, driving lane status and cause code. Surprisingly, Gemini-2.5-Flash, though more capable in general tasks, underperforms Gemini-2.0-Flash in detection accuracy and semantic understanding and incurs higher latency (Table II). These findings motivate further work on fine-tuning specialized LLMs or MLLMs tailored for intelligent transportation applications.
AI on Food
Abstract
This third international workshop on explainable AI for the Arts (XAIxArts) brought together a community of researchers in HCI, Interaction Design, AI, explainable AI (XAI), and digital arts to explore the role of XAI for the Arts. Workshop held at the 17th ACM Conference on Creativity and Cognition (C&C 2025), online.
AI for Social Fairness
UIUC
Abstract
With the growing adoption of AI and machine learning systems in real-world applications, ensuring their fairness has become increasingly critical. The majority of the work in algorithmic fairness focuses on assessing and improving the fairness of machine learning systems. There is relatively little research on fairness vulnerability, i.e., how an AI system's fairness can be intentionally compromised. In this work, we first provide a theoretical analysis demonstrating that a simple adversarial poisoning strategy is sufficient to induce maximally unfair behavior in naive Bayes classifiers. Our key idea is to strategically inject a small fraction of carefully crafted adversarial data points into the training set, biasing the model's decision boundary to disproportionately affect a protected group while preserving generalizable performance. To illustrate the practical effectiveness of our method, we conduct experiments across several benchmark datasets and models. We find that our attack significantly outperforms existing methods in degrading fairness metrics across multiple models and datasets, often achieving substantially higher levels of unfairness with a comparable or only slightly worse impact on accuracy. Notably, our method proves effective on a wide range of models, in contrast to prior work, demonstrating a robust and potent approach to compromising the fairness of machine learning systems.
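The core idea, that a modest set of crafted training points can shift a naive Bayes decision boundary against a protected group, can be illustrated with a toy sketch. The construction below is not the paper's attack; it simply pairs protected-group membership with the negative label on synthetic data and compares the demographic-parity gap before and after poisoning.

```python
# Illustrative poisoning sketch on synthetic data (not the paper's attack construction).
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)                      # protected attribute, independent of the label
x1 = rng.normal(0, 1, n)
y = (x1 + rng.normal(0, 1, n) > 0).astype(int)     # noisy label driven only by x1
X = np.column_stack([x1, group])

def dp_gap(model, X, group):
    """Demographic-parity gap: |P(pred=1 | group=1) - P(pred=1 | group=0)|."""
    pred = model.predict(X)
    return abs(pred[group == 1].mean() - pred[group == 0].mean())

clean = GaussianNB().fit(X, y)
print("clean DP gap:   ", round(dp_gap(clean, X, group), 3))

# Inject ~10% poison points: protected-group members placed near the decision boundary
# with the negative label, skewing the class-conditional estimate of the group feature.
k = 200
X_poison = np.column_stack([np.zeros(k), np.ones(k)])
y_poison = np.zeros(k, dtype=int)
poisoned = GaussianNB().fit(np.vstack([X, X_poison]), np.concatenate([y, y_poison]))
print("poisoned DP gap:", round(dp_gap(poisoned, X, group), 3))
```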
UTEP
Abstract
As machine learning systems move from theory to practice, they are increasingly tasked with decisions that affect healthcare access, financial opportunities, hiring, and public services. In these contexts, accuracy is only one piece of the puzzle - models must also be fair to different groups, protect individual privacy, and remain accountable to stakeholders. Achieving all three is difficult: differential privacy can unintentionally worsen disparities, fairness interventions often rely on sensitive data that privacy restricts, and automated pipelines ignore that fairness is ultimately a human and contextual judgment. We introduce FAIRPLAI (Fair and Private Learning with Active Human Influence), a practical framework that integrates human oversight into the design and deployment of machine learning systems. FAIRPLAI works in three ways: (1) it constructs privacy-fairness frontiers that make trade-offs between accuracy, privacy guarantees, and group outcomes transparent; (2) it enables interactive stakeholder input, allowing decision-makers to select fairness criteria and operating points that reflect their domain needs; and (3) it embeds a differentially private auditing loop, giving humans the ability to review explanations and edge cases without compromising individual data security. Applied to benchmark datasets, FAIRPLAI consistently preserves strong privacy protections while reducing fairness disparities relative to automated baselines. More importantly, it provides a straightforward, interpretable process for practitioners to manage competing demands of accuracy, privacy, and fairness in socially impactful applications. By embedding human judgment where it matters most, FAIRPLAI offers a pathway to machine learning systems that are effective, responsible, and trustworthy in practice. GitHub: https://github.com/Li1Davey/Fairplai

Interests not found

We did not find any papers that match the interests below. Try other terms, and consider whether such content exists on arxiv.org.
  • AI on Water
  • AI for Social Equity
  • AI for Society
  • AI Air Consumption
  • AI Impacts on Society
You can edit or add more interests any time.