Hi!

Your personalized paper recommendations for 08 to 12 December, 2025.

🎯 Top Personalized Recommendations

Beyond the Hype: Comparing Lightweight and Deep Learning Models for Air Quality Forecasting

National University of

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

AI Summary

The study also explores the impact of different input features on the performance of the models and finds that using both air quality index and weather data improves the predictive power of the models. [3]
AQI: Air Quality Index MAE: Mean Absolute Error The study demonstrates the effectiveness of machine learning models in predicting AQIs and highlights the importance of using both air quality index and weather data for improved predictive power. [3]
The results of this study can be used to inform policy decisions related to air pollution control and mitigation strategies. [3]
The study only evaluates the performance of different models on a single dataset and does not explore the generalizability of the results to other locations or datasets. [3]
The authors do not provide any discussion on the limitations of the study, such as the potential impact of data quality issues or the lack of consideration for non-linear relationships between input features. [3]
The paper presents a comparative study of various machine learning models for predicting air quality indices (AQIs) in Beijing, China. [2]
The results show that the Prophet model outperforms other models in terms of accuracy, with a mean absolute error (MAE) of 4.35 μg/m³. [1]

Abstract
Accurate forecasting of urban air pollution is essential for protecting public health and guiding mitigation policies. While Deep Learning (DL) and hybrid pipelines dominate recent research, their complexity and limited interpretability hinder operational use. This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter (PM$_{2.5}$, PM$_{10}$) in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Both models were trained with pollutant and precursor regressors, with NP additionally leveraging lagged dependencies. For context, two machine learning baselines (LSTM, LightGBM) and one traditional statistical model (SARIMAX) were also implemented. Performance was evaluated on a 7-day holdout using MAE, RMSE, and $R^2$. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants. These findings demonstrate that interpretable additive models remain competitive with both traditional and complex approaches, offering a practical balance of accuracy, transparency, and ease of deployment.

Why we think this paper is great for you:
This paper directly addresses air quality forecasting, a core interest given the user's focus on AI and energy consumption. The comparison of model types aligns with a desire for efficient and practical AI solutions.

Integration of AI-Driven CAD Systems in Designing Water and Power Transportation Infrastructure for Industrial and Remote Landscape Applications

University of Central

Rate paper: 👍 👎 ♥ Save

AI Summary

Digital twins, smart grids, and real-time data integration are emerging technologies that could pair with AI-driven CAD to further enhance the design process and increase capabilities. [3]
Computer-Aided Design (CAD): A software-based system used to create, modify, analyze, or optimize a design. [3]
The integration of Artificial Intelligence (AI) into Computer-Aided Design (CAD) systems has the potential to revolutionize the design process in various industries, including water and power transportation infrastructure. [2]

Abstract
The integration of AI into CAD systems transforms how engineers plan and develop infrastructure projects involving water and power transportation across industrial and remote landscapes. This paper discusses how AI-driven CAD systems improve the efficient, effective, and sustainable design of infrastructure by embedding automation, predictive modeling, and real-time data analytics. This study examines how AI-supported toolsets can enhance design workflows, minimize human error, and optimize resource allocation for projects in underdeveloped environments. It also addresses technical and organizational challenges to AI adoption, including data silos, interoperability issues, and workforce adaptation. The findings demonstrate that AI-powered CAD enables faster project delivery, enhanced design precision, and increased resilience to environmental and logistical constraints. AI helps connect CAD, GIS, and IoT technologies to develop self-learning, adaptive design systems that are needed to meet the increasing global demand for sustainable infrastructure.

Why we think this paper is great for you:
The focus on AI within CAD systems for infrastructure design, particularly water and power transportation, strongly aligns with the user’s interest in AI and energy. It’s a practical application of AI in a relevant sector.

Análisis de rendimiento y eficiencia energética en el cluster Raspberry Pi Cronos

UNGS

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

AI Summary

El clúster Cronos se ubica en un punto intermedio entre los clústers de PCs comunes utilizados en laboratorios universitarios, ofreciendo un rendimiento no competitivo con ellos pero con un bajo consumo, costo reducido y facilidad de replicación. [3]
HPL (High-Performance Linpack): Una herramienta de benchmarking utilizada para evaluar el rendimiento de un sistema. [3]
Green500: Una lista de los 500 sistemas supercomputacionales más eficientes en términos de consumo energético. [3]
El clúster Cronos, basado en Raspberry Pi, logró un rendimiento promedio de 14,48 GFLOPS (±0,26) con una configuración optimizada de cuatro tareas por nodo. [2]
OpenMP (Open Multi-Processing): Un conjunto de directivas y rutinas para escribir código que pueda aprovechar múltiples núcleos de procesamiento. [1]

Abstract
This article presents an evaluation of the computational performance and energy efficiency of the Cronos cluster, composed of Raspberry Pi4 and 3b microcomputers designed for educational purposes. Experimental tests were performed using the High Performance Linpack (HPL) benchmark, under a resource management environment configured with Slurm and parallel communication via Open MPI. The study focuses on analyzing scalability, stability, and power consumption during the execution of computationally intensive workloads, considering different node configurations. The results show that the cluster achieves a performance of up to 6.91 GFLOPS in homogeneous configurations of 6 Raspberry Pi 4 nodes, and that the use of heterogeneous nodes (including Raspberry Pi 3b) can negatively impact stability and efficiency. Additionally, the total electrical consumption of the system was measured during the runs, allowing for the estimation of the performance-to-consumption ratio (GFLOPS/W) as a comparative metric. This study constitutes a concrete contribution to the design, evaluation, and utilization of low-cost ARM clusters in educational and research contexts.

Why we think this paper is great for you:
This paper investigates energy efficiency, a key area of interest for the user. The use of Raspberry Pi clusters provides a tangible and accessible example of AI and energy optimization.

Deconstructing the Dual Black Box:A Plug-and-Play Cognitive Framework for Human-AI Collaborative Enhancement and Its Implications for AI Governance

Northeastern University

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

AI Summary

RAMTN系统是一种基于元交互的人机协作认知增强范式，旨在通过提取专家决策框架来实现智能辅助和知识共享。该系统的核心思想是将人类专家的认知过程与计算机系统的信息处理能力结合起来，从而实现高效的决策支持和知识推理。 RAMTN系统的应用领域包括投资、医疗和教育等多个领域，旨在通过提取专家决策框架来提高决策准确性和效率。元交互（Meta-Interaction）：一种将人类认知过程与计算机系统信息处理能力结合起来的技术，旨在实现高效的决策支持和知识推理。人机协作认知增强范式（Human-Machine Collaborative Cognition Enhancement Paradigm）：一种基于元交互的框架，旨在通过提取专家决策框架来实现智能辅助和知识共享。 RAMTN系统是一种创新性的解决方案，旨在通过提取专家决策框架来提高决策准确性和效率。该系统的应用领域包括投资、医疗和教育等多个领域，具有广泛的潜力和前景。该系统的开发和应用依赖于大量的数据和信息资源，可能存在数据质量和可靠性的问题。该系统的安全性和隐私保护需要进一步研究和解决。元交互技术在决策支持和知识推理领域有广泛的应用和研究。 [3]

Abstract
Currently, there exists a fundamental divide between the "cognitive black box" (implicit intuition) of human experts and the "computational black box" (untrustworthy decision-making) of artificial intelligence (AI). This paper proposes a new paradigm of "human-AI collaborative cognitive enhancement," aiming to transform the dual black boxes into a composable, auditable, and extensible "functional white-box" system through structured "meta-interaction." The core breakthrough lies in the "plug-and-play cognitive framework"--a computable knowledge package that can be extracted from expert dialogues and loaded into the Recursive Adversarial Meta-Thinking Network (RAMTN). This enables expert thinking, such as medical diagnostic logic and teaching intuition, to be converted into reusable and scalable public assets, realizing a paradigm shift from "AI as a tool" to "AI as a thinking partner." This work not only provides the first engineering proof for "cognitive equity" but also opens up a new path for AI governance: constructing a verifiable and intervenable governance paradigm through "transparency of interaction protocols" rather than prying into the internal mechanisms of models. The framework is open-sourced to promote technology for good and cognitive inclusion. This paper is an independent exploratory research conducted by the author. All content presented, including the theoretical framework (RAMTN), methodology (meta-interaction), system implementation, and case validation, constitutes the author's individual research achievements.

Why we think this paper is great for you:
The exploration of human-AI collaboration and addressing the 'black box' problem is highly relevant to the user's interest in AI and its governance implications.

The Adoption and Usage of AI Agents: Early Evidence from Perplexity

Perplexity

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

AI Summary

The agent is used primarily for productivity-related tasks (36% of all queries), followed by learning, media, and shopping. [3]
Research, document editing, and shopping-related tasks appear consistently across occupation clusters. [3]
Knowledge-intensive sectors like digital technology, entrepreneurship, finance, and academia tend to use the agent for research and learning-related tasks. [3]
Productivity and learning topics are the most sticky, while travel is the least sticky. [2]
Users' first queries often fall into productivity, learning, or media topics, but over time, there's a shift towards more cognitively oriented use cases. [1]

Abstract
This paper presents the first large-scale field study of the adoption, usage intensity, and use cases of general-purpose AI agents operating in open-world web environments. Our analysis centers on Comet, an AI-powered browser developed by Perplexity, and its integrated agent, Comet Assistant. Drawing on hundreds of millions of anonymized user interactions, we address three fundamental questions: Who is using AI agents? How intensively are they using them? And what are they using them for? Our findings reveal substantial heterogeneity in adoption and usage across user segments. Earlier adopters, users in countries with higher GDP per capita and educational attainment, and individuals working in digital or knowledge-intensive sectors -- such as digital technology, academia, finance, marketing, and entrepreneurship -- are more likely to adopt or actively use the agent. To systematically characterize the substance of agent usage, we introduce a hierarchical agentic taxonomy that organizes use cases across three levels: topic, subtopic, and task. The two largest topics, Productivity & Workflow and Learning & Research, account for 57% of all agentic queries, while the two largest subtopics, Courses and Shopping for Goods, make up 22%. The top 10 out of 90 tasks represent 55% of queries. Personal use constitutes 55% of queries, while professional and educational contexts comprise 30% and 16%, respectively. In the short term, use cases exhibit strong stickiness, but over time users tend to shift toward more cognitively oriented topics. The diffusion of increasingly capable AI agents carries important implications for researchers, businesses, policymakers, and educators, inviting new lines of inquiry into this rapidly emerging class of AI capabilities.

Why we think this paper is great for you:
This paper examines the use of AI agents, a rapidly developing area of AI, and its practical applications through the Perplexity platform, aligning with the user’s broader AI interests.

AI TIPS 2.0: A Comprehensive Framework for Operationalizing AI Governance

Trusted AI

Rate paper: 👍 👎 ♥ Save

AI Summary

AI TIPS 2.0 is a comprehensive framework for operationalizing AI governance The framework consists of six phases: Data Collection & Preparation, Model Development & Training, Evaluation & Validation, Deployment & Operations, Monitoring & Continuous Improvement, and Retirement Each phase has specific objectives, focus areas, minimum pillar scores, key AICM controls, and deliverables The framework also includes role-based scorecard dashboards for different organizational roles, ensuring appropriate oversight and actionable insights at each level AICM: AI Control Measures (a set of controls to ensure the secure development and deployment of AI systems) DPIA/PIA: Data Protection Impact Assessment/Privacy Impact Assessment (an assessment of the potential risks and impacts on data protection and privacy) EU AI Act: European Union Artificial Intelligence Act (regulatory framework for AI in the EU) RACI matrix: Responsible, Accountable, Consulted, Informed matrix (a tool to define roles and responsibilities) AI TIPS 2.0 provides a structured approach to operationalizing AI governance, ensuring that AI systems are developed and deployed securely and responsibly The framework is designed to be flexible and adaptable to different organizational needs and contexts [2]

Abstract
The deployment of AI systems faces three critical governance challenges that current frameworks fail to adequately address. First, organizations struggle with inadequate risk assessment at the use case level, exemplified by the Humana class action lawsuit and other high impact cases where an AI system deployed to production exhibited both significant bias and high error rates, resulting in improper healthcare claim denials. Each AI use case presents unique risk profiles requiring tailored governance, yet most frameworks provide one size fits all guidance. Second, existing frameworks like ISO 42001 and NIST AI RMF remain at high conceptual levels, offering principles without actionable controls, leaving practitioners unable to translate governance requirements into specific technical implementations. Third, organizations lack mechanisms for operationalizing governance at scale, with no systematic approach to embed trustworthy AI practices throughout the development lifecycle, measure compliance quantitatively, or provide role-appropriate visibility from boards to data scientists. We present AI TIPS, Artificial Intelligence Trust-Integrated Pillars for Sustainability 2.0, update to the comprehensive operational framework developed in 2019,four years before NIST's AI Risk Management Framework, that directly addresses these challenges.

Why we think this paper is great for you:
The focus on AI governance frameworks directly addresses the user’s interest in the ethical and responsible development of AI systems.

A Theoretical Framework of Student Agency in AI- Assisted Learning: A Grounded Theory Approach

The Chinese University of

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

AI Summary

Policymakers can draw on this research to formulate educational policies and frameworks that balance technological advancement with the cultivation of essential learner skills and dispositions. [3]
Student agency: The proactive and intentional efforts students make in managing their educational tools, resources, and experiences. [2]
The study examines how students exercise agency in AI-assisted learning and identifies four types of agentic engagement: initiating and (re)directing, mindful adoption, external help-seeking, and reflective learning. [1]

Abstract
Generative AI(GenAI) is a kind of AI model capable of producing human-like content in various modalities, including text, image, audio, video, and computer programming. Although GenAI offers great potential for education, its value often depends on students' ability to engage with it actively, responsibly, and critically - qualities central to student agency. Nevertheless, student agency has long been a complex and ambiguous concept in educational discourses, with few empirical studies clarifying its distinct nature and process in AI-assisted learning environments. To address this gap, the qualitative study presented in this article examines how higher education students exercise agency in AI-assisted learning and proposes a theoretical framework using a grounded theory approach. Guided by agentic engagement theory, this article analyzes the authentic experiences of 26 students using data from their GenAI conversation records and cognitive interviews that capture their thought processes and decision-making. The findings identify four key aspects of student agency: initiating and (re)directing, mindful adoption, external help-seeking, and reflective learning. Together, these aspects form an empirically developed framework that characterizes student agency in AI-assisted learning as a proactive, intentional, adaptive, reflective, and iterative process. Based on the empirical findings, theoretical and practical implications are discussed for researchers, educators, and policymakers.

Why we think this paper is great for you:
This paper investigates student agency in AI-assisted learning, which is closely related to the user's interest in AI and education, particularly how AI impacts learning processes.

AI Energy Consumption

Energy-Aware Aggregation of Input Data for the Optimisation of Heat Supply of Municipal Districts

German Aerospace Center

Rate paper: 👍 👎 ♥ Save

Abstract
In the context of municipal heat planning, it is imperative to consider the numerous buildings, numbering in the hundreds or thousands, that are involved. This poses particular challenges for model-based energy system optimization, as the number of variables increases with the number of buildings under consideration. In the worst case, the computational complexity of the models experiences an exponential increase with the number of variables. Furthermore, within the context of heat transition, it is often necessary to map extended periods of time (i.e., the service life of systems) with high resolution (particularly in the case of load peaks that occur at the onset of the day). In response to these challenges, the aggregation of input data is a common practice. In general, building blocks or other geographical and urban formations, such as neighbourhoods, are combined. This article explores the potential of incorporating energy performance indicators into the grouping of buildings. The case study utilizes authentic data from the Neu-Schwachhausen district, grouped based on geographical location, building geometry, and energy performance indicators. The selection of energy indicators includes the annual heat consumption as well as the potential for solar energy generation. To this end, a methodology is hereby presented that considers not only the anticipated annual energy quantity, but also its progression over time. We present a full workflow from geodata to a set of techno-socio-economically Pareto-optimal heat supply options. Our findings suggest that it is beneficial to find a balance between geographical position and energy properties when grouping buildings for the use in energy system models.

AI Water Consumption

An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research

Zayed University

Rate paper: 👍 👎 ♥ Save

Abstract
Traditional sea exploration faces significant challenges due to extreme conditions, limited visibility, and high costs, resulting in vast unexplored ocean regions. This paper presents an innovative AI-powered Autonomous Underwater Vehicle (AUV) system designed to overcome these limitations by automating underwater object detection, analysis, and reporting. The system integrates YOLOv12 Nano for real-time object detection, a Convolutional Neural Network (CNN) (ResNet50) for feature extraction, Principal Component Analysis (PCA) for dimensionality reduction, and K-Means++ clustering for grouping marine objects based on visual characteristics. Furthermore, a Large Language Model (LLM) (GPT-4o Mini) is employed to generate structured reports and summaries of underwater findings, enhancing data interpretation. The system was trained and evaluated on a combined dataset of over 55,000 images from the DeepFish and OzFish datasets, capturing diverse Australian marine environments. Experimental results demonstrate the system's capability to detect marine objects with a mAP@0.5 of 0.512, a precision of 0.535, and a recall of 0.438. The integration of PCA effectively reduced feature dimensionality while preserving 98% variance, facilitating K-Means clustering which successfully grouped detected objects based on visual similarities. The LLM integration proved effective in generating insightful summaries of detections and clusters, supported by location data. This integrated approach significantly reduces the risks associated with human diving, increases mission efficiency, and enhances the speed and depth of underwater data analysis, paving the way for more effective scientific research and discovery in challenging marine environments.

AI Summary

The system maintained consistent inference speeds between 2.0ms and 5.5ms and processed external video data to validate its performance. [3]
CNN: Convolutional Neural Network, a type of neural network designed to process data with grid-like topology, commonly used in image and signal processing. [3]
PCA: Principal Component Analysis, a dimensionality reduction technique that transforms high-dimensional data into lower-dimensional space while retaining most of the information. [3]
The system uses YOLOv12 Nano to detect and distinguish over 50% of annotated objects under challenging marine conditions. [2]

AI for Social Equality

The Gender Code: Gendering the Global Governance of Artificial Intelligence

WZB Berlin Social Science

Rate paper: 👍 👎 ♥ Save

Abstract
This paper examines how international AI governance frameworks address gender issues and gender-based harms. The analysis covers binding regulations, such as the EU AI Act; soft law instruments, like the UNESCO Recommendations on AI Ethics; and global initiatives, such as the Global Partnership on AI (GPAI). These instruments reveal emerging trends, including the integration of gender concerns into broader human rights frameworks, a shift toward explicit gender-related provisions, and a growing emphasis on inclusivity and diversity. Yet, some critical gaps persist, including inconsistent treatment of gender across governance documents, limited engagement with intersectionality, and a lack of robust enforcement mechanisms. However, this paper argues that effective AI governance must be intersectional, enforceable, and inclusive. This is key to moving beyond tokenism toward meaningful equity and preventing reinforcement of existing inequalities. The study contributes to ethical AI debates by highlighting the importance of gender-sensitive governance in building a just technological future.

AI Summary

Global AI governance is increasingly emphasizing the importance of gender equality and addressing gender-related AI harm. [2]

Robust AI Security and Alignment: A Sisyphean Endeavor?

NIST

Rate paper: 👍 👎 ♥ Save

Abstract
This manuscript establishes information-theoretic limitations for robustness of AI security and alignment by extending Gödel's incompleteness theorem to AI. Knowing these limitations and preparing for the challenges they bring is critically important for the responsible adoption of the AI technology. Practical approaches to dealing with these challenges are provided as well. Broader implications for cognitive reasoning limitations of AI systems are also proven.

AI Summary

The authors argue that Godel's theorem has significant implications for AI and ML, particularly in the context of large language models. [3]
Godel's theorem: A mathematical proof that shows that any formal system powerful enough to describe basic arithmetic is either incomplete or inconsistent. [3]
Large language models: Deep neural networks designed to process and generate human-like language. [3]
The authors assume a high level of mathematical background, which may limit the accessibility of the paper to non-experts. [3]
The paper discusses the limitations of formal systems and their implications for artificial intelligence (AI) and machine learning (ML). [2]

AI for Social Fairness

A Gray Literature Study on Fairness Requirements in AI-enabled Software Engineering

University of Calgary

Rate paper: 👍 👎 ♥ Save

Abstract
Today, with the growing obsession with applying Artificial Intelligence (AI), particularly Machine Learning (ML), to software across various contexts, much of the focus has been on the effectiveness of AI models, often measured through common metrics such as F1- score, while fairness receives relatively little attention. This paper presents a review of existing gray literature, examining fairness requirements in AI context, with a focus on how they are defined across various application domains, managed throughout the Software Development Life Cycle (SDLC), and the causes, as well as the corresponding consequences of their violation by AI models. Our gray literature investigation shows various definitions of fairness requirements in AI systems, commonly emphasizing non-discrimination and equal treatment across different demographic and social attributes. Fairness requirement management practices vary across the SDLC, particularly in model training and bias mitigation, fairness monitoring and evaluation, and data handling practices. Fairness requirement violations are frequently linked, but not limited, to data representation bias, algorithmic and model design bias, human judgment, and evaluation and transparency gaps. The corresponding consequences include harm in a broad sense, encompassing specific professional and societal impacts as key examples, stereotype reinforcement, data and privacy risks, and loss of trust and legitimacy in AI-supported decisions. These findings emphasize the need for consistent frameworks and practices to integrate fairness into AI software, paying as much attention to fairness as to effectiveness.

AI Summary

The study found that fairness requirements in AI are often context-dependent and tailored to specific applications rather than a universal metric. [2]

Mind the Gap! Pathways Towards Unifying AI Safety and Ethics Research

Carnegie Mellon

Rate paper: 👍 👎 ♥ Save

Abstract
While much research in artificial intelligence (AI) has focused on scaling capabilities, the accelerating pace of development makes countervailing work on producing harmless, "aligned" systems increasingly urgent. Yet research on alignment has diverged along two largely parallel tracks: safety--centered on scaled intelligence, deceptive or scheming behaviors, and existential risk--and ethics--focused on present harms, the reproduction of social bias, and flaws in production pipelines. Although both communities warn of insufficient investment in alignment, they disagree on what alignment means or ought to mean. As a result, their efforts have evolved in relative isolation, shaped by distinct methodologies, institutional homes, and disciplinary genealogies. We present a large-scale, quantitative study showing the structural split between AI safety and AI ethics. Using a bibliometric and co-authorship network analysis of 6,442 papers from twelve major ML and NLP conferences (2020-2025), we find that over 80% of collaborations occur within either the safety or ethics communities, and cross-field connectivity is highly concentrated: roughly 5% of papers account for more than 85% of bridging links. Removing a small number of these brokers sharply increases segregation, indicating that cross-disciplinary exchange depends on a handful of actors rather than broad, distributed collaboration. These results show that the safety-ethics divide is not only conceptual but institutional, with implications for research agendas, policy, and venues. We argue that integrating technical safety work with normative ethics--via shared benchmarks, cross-institutional venues, and mixed-method methodologies--is essential for building AI systems that are both robust and just.

AI Summary

The dataset consists of 49,725 papers from various conferences related to artificial intelligence (AI) safety and ethics. [3]
Abstract enrichment coverage: The percentage of papers with abstracts. [3]
Keywords were generated by analyzing foundational surveys and texts in each field, with a hierarchical strategy spanning technical, theoretical, and applied domains. [2]
The abstract enrichment coverage is 97.7%, indicating that most papers have abstracts. [1]

AI for Social Justice

Ethics Readiness of Artificial Intelligence: A Practical Evaluation Method

CEASaclay

Rate paper: 👍 👎 ♥ Save

Abstract
We present Ethics Readiness Levels (ERLs), a four-level, iterative method to track how ethical reflection is implemented in the design of AI systems. ERLs bridge high-level ethical principles and everyday engineering by turning ethical values into concrete prompts, checks, and controls within real use cases. The evaluation is conducted using a dynamic, tree-like questionnaire built from context-specific indicators, ensuring relevance to the technology and application domain. Beyond being a managerial tool, ERLs help facilitate a structured dialogue between ethics experts and technical teams, while our scoring system helps track progress over time. We demonstrate the methodology through two case studies: an AI facial sketch generator for law enforcement and a collaborative industrial robot. The ERL tool effectively catalyzes concrete design changes and promotes a shift from narrow technological solutionism to a more reflective, ethics-by-design mindset.

AI Summary

Technological solutionism: The tendency to view complex problems as solvable through technological fixes rather than addressing underlying issues. [3]
Ethics-by-design is a powerful approach that can significantly improve the ethics readiness of AI systems. [2]

AI on Air

RoboNeuron: A Modular Framework Linking Foundation Models and ROS for Embodied AI

Chinese Academy of Scienc

Rate paper: 👍 👎 ♥ Save

Abstract
Current embodied AI systems face severe engineering impediments, primarily characterized by poor cross-scenario adaptability, rigid inter-module coupling, and fragmented inference acceleration. To overcome these limitations, we propose RoboNeuron, a universal deployment framework for embodied intelligence. RoboNeuron is the first framework to deeply integrate the cognitive capabilities of Large Language Models (LLMs) and Vision-Language-Action (VLA) models with the real-time execution backbone of the Robot Operating System (ROS). We utilize the Model Context Protocol (MCP) as a semantic bridge, enabling the LLM to dynamically orchestrate underlying robotic tools. The framework establishes a highly modular architecture that strictly decouples sensing, reasoning, and control by leveraging ROS's unified communication interfaces. Crucially, we introduce an automated tool to translate ROS messages into callable MCP functions, significantly streamlining development. RoboNeuron significantly enhances cross-scenario adaptability and component flexibility, while establishing a systematic platform for horizontal performance benchmarking, laying a robust foundation for scalable real-world embodied applications.

AI Summary

RoboNeuron provides a scalable foundation for embodied intelligence research, and future work will expand model coverage and integrate inference acceleration to further strengthen its universality. [3]
RoboNeuron is a universal deployment framework that addresses the core engineering barriers of embodied AI by unifying LLM/VLA reasoning with ROS's real-time execution layer through the Model Context Protocol. [2]
The framework enables seamless substitution of hardware, sensors, and VLA models with minimal reconfiguration. [1]

AI on Education

Human Agency and Creativity in AI-Assisted Learning Environments

The Chinese University of

Rate paper: 👍 👎 ♥ Save

Abstract
This chapter explores human creativity in AI-assisted learning environments through the lens of student agency. We begin by examining four theoretical perspectives on agency, including instrumental, effortful, dynamically emergent, and authorial agency, and analyze how each frames the relationship between agency and creativity. Under each theoretical perspective, we discuss how the integration of generative AI (GenAI) tools reshapes these dynamics by altering students' roles in cognitive, social, and creative processes. In the second part, we introduce a theoretical framework for AI agentic engagement, contextualizing agency within specific cognitive, relational, and ethical dynamics introduced by GenAI tools. This framework is linked to the concept of Mini-c creativity, emphasizing personal relevance and self-directed learning. Together, these perspectives support a shift from viewing creativity as product-oriented to understanding it as a process of agentive participation and meaning-making. We conclude with two directions for future research focused on the creative process and performance in AI-assisted learning.

AI Summary

Agentic engagement refers to the proactive, intentional, and reflective actions taken by students when interacting with AI tools. [3]
Mini-c Creativity: A type of creative expression that is personally meaningful and novel, but not necessarily recognized externally. [3]
Student agency is essential for effective use of AI tools in education. [2]
The integration of AI in learning environments is changing how we understand student creativity. [1]

AI on Energy

Advancing Mathematical Research via Human-AI Interactive Theorem Proving

Peking University

Rate paper: 👍 👎 ♥ Save

Abstract
We investigate how large language models can be used as research tools in scientific computing while preserving mathematical rigor. We propose a human-in-the-loop workflow for interactive theorem proving and discovery with LLMs. Human experts retain control over problem formulation and admissible assumptions, while the model searches for proofs or contradictions, proposes candidate properties and theorems, and helps construct structures and parameters that satisfy explicit constraints, supported by numerical experiments and simple verification checks. Experts treat these outputs as raw material, further refine them, and organize the results into precise statements and rigorous proofs. We instantiate this workflow in a case study on the connection between manifold optimization and Grover's quantum search algorithm, where the pipeline helps identify invariant subspaces, explore Grover-compatible retractions, and obtain convergence guarantees for the retraction-based gradient method. The framework provides a practical template for integrating large language models into frontier mathematical research, enabling faster exploration of proof space and algorithm design while maintaining transparent reasoning responsibilities. Although illustrated on manifold optimization problems in quantum computing, the principles extend to other core areas of scientific computing.

AI Summary

Previous research has shown that human-AI collaboration can improve performance in various tasks, including theorem discovery and proof verification. [3]
The collaboration between human experts and an LLM is organized into three stages, starting from an informal conjecture and ending with a precise theorem and proof. [2]
Human-AI collaboration can significantly improve mathematical proof and theorem discovery. [1]

Automating High Energy Physics Data Analysis with LLM-Powered Agents

University of California

Rate paper: 👍 👎 ♥ Save

Abstract
We present a proof-of-principle study demonstrating the use of large language model (LLM) agents to automate a representative high energy physics (HEP) analysis. Using the Higgs boson diphoton cross-section measurement as a case study with ATLAS Open Data, we design a hybrid system that combines an LLM-based supervisor-coder agent with the Snakemake workflow manager. In this architecture, the workflow manager enforces reproducibility and determinism, while the agent autonomously generates, executes, and iteratively corrects analysis code in response to user instructions. We define quantitative evaluation metrics including success rate, error distribution, costs per specific task, and average number of API calls, to assess agent performance across multi-stage workflows. To characterize variability across architectures, we benchmark a representative selection of state-of-the-art LLMs spanning the Gemini and GPT-5 series, the Claude family, and leading open-weight models. While the workflow manager ensures deterministic execution of all analysis steps, the final outputs still show stochastic variation. Although we set the temperature to zero, other sampling parameters (e.g., top-p, top-k) remained at their defaults, and some reasoning-oriented models internally adjust these settings. Consequently, the models do not produce fully deterministic results. This study establishes the first LLM-agent-driven automated data-analysis framework in HEP, enabling systematic benchmarking of model capabilities, stability, and limitations in real-world scientific computing environments. The baseline code used in this work is available at https://huggingface.co/HWresearch/LLM4HEP. This work was accepted as a poster at the Machine Learning and the Physical Sciences (ML4PS) workshop at NeurIPS 2025. The initial submission was made on August 30, 2025.

AI on Food

Mental Models of Autonomy and Sentience Shape Reactions to AI

Sentience Institute

Rate paper: 👍 👎 ♥ Save

Abstract
Narratives about artificial intelligence (AI) entangle autonomy, the capacity to self-govern, with sentience, the capacity to sense and feel. AI agents that perform tasks autonomously and companions that recognize and express emotions may activate mental models of autonomy and sentience, respectively, provoking distinct reactions. To examine this possibility, we conducted three pilot studies (N = 374) and four preregistered vignette experiments describing an AI as autonomous, sentient, both, or neither (N = 2,702). Activating a mental model of sentience increased general mind perception (cognition and emotion) and moral consideration more than autonomy, but autonomy increased perceived threat more than sentience. Sentience also increased perceived autonomy more than vice versa. Based on a within-paper meta-analysis, sentience changed reactions more than autonomy on average. By disentangling different mental models of AI, we can study human-AI interaction with more precision to better navigate the detailed design of anthropomorphized AI and prompting interfaces.

AI Summary

In all three experiments, participants were shown a description of an AI system called Corion, which was said to be working on various tasks. [3]
The results show that when people are told that the AI is autonomous but not sentient, they tend to perceive it as having a mind and being capable of making decisions. [3]
However, when people are told that the AI is both autonomous and sentient, they tend to have even more positive perceptions of its capabilities and decision-making abilities. [3]
The findings have implications for the development of AI systems, as well as for the design of interfaces and interactions between humans and AI. [3]
The study investigates how people's mental models of artificial intelligence (AI) are influenced by the concepts of autonomy and sentience. [2]
The researchers conducted three experiments, each with a different manipulation: Experiment 1 manipulated autonomy, Experiment 2 manipulated sentience, and Experiment 3 fully crossed both autonomy and sentience. [1]

Food Image Generation on Multi-Noun Categories

Purdue University

Rate paper: 👍 👎 ♥ Save

Abstract
Generating realistic food images for categories with multiple nouns is surprisingly challenging. For instance, the prompt "egg noodle" may result in images that incorrectly contain both eggs and noodles as separate entities. Multi-noun food categories are common in real-world datasets and account for a large portion of entries in benchmarks such as UEC-256. These compound names often cause generative models to misinterpret the semantics, producing unintended ingredients or objects. This is due to insufficient multi-noun category related knowledge in the text encoder and misinterpretation of multi-noun relationships, leading to incorrect spatial layouts. To overcome these challenges, we propose FoCULR (Food Category Understanding and Layout Refinement) which incorporates food domain knowledge and introduces core concepts early in the generation process. Experimental results demonstrate that the integration of these techniques improves image generation performance in the food domain.

AI Summary

The model is trained on the LAION-5B dataset and achieves state-of-the-art results in various evaluation metrics. [3]
The model is trained on a large dataset and achieves state-of-the-art results in various evaluation metrics. [3]
Hierarchical text-conditional image generation: A method of generating images by first predicting the attributes of the image and then generating the image itself based on those attributes. [3]
The model is trained on the LAION-5B dataset and achieves state-of-the-art results in various evaluation metrics. [3]
The model is trained on a large dataset and achieves state-of-the-art results in various evaluation metrics. [3]
The paper presents a novel approach to text-to-image synthesis using a hierarchical text-conditional image generation model with CLIP latents. [2]
Diffusion-based image synthesis: A method of generating images by iteratively refining an initial noise signal until it converges to a realistic image. [1]

AI on Healthcare

Monitoring Deployed AI Systems in Health Care

Stanford

Rate paper: 👍 👎 ♥ Save

Abstract
Post-deployment monitoring of artificial intelligence (AI) systems in health care is essential to ensure their safety, quality, and sustained benefit-and to support governance decisions about which systems to update, modify, or decommission. Motivated by these needs, we developed a framework for monitoring deployed AI systems grounded in the mandate to take specific actions when they fail to behave as intended. This framework, which is now actively used at Stanford Health Care, is organized around three complementary principles: system integrity, performance, and impact. System integrity monitoring focuses on maximizing system uptime, detecting runtime errors, and identifying when changes to the surrounding IT ecosystem have unintended effects. Performance monitoring focuses on maintaining accurate system behavior in the face of changing health care practices (and thus input data) over time. Impact monitoring assesses whether a deployed system continues to have value in the form of benefit to clinicians and patients. Drawing on examples of deployed AI systems at our academic medical center, we provide practical guidance for creating monitoring plans based on these principles that specify which metrics to measure, when those metrics should be reviewed, who is responsible for acting when metrics change, and what concrete follow-up actions should be taken-for both traditional and generative AI. We also discuss challenges to implementing this framework, including the effort and cost of monitoring for health systems with limited resources and the difficulty of incorporating data-driven monitoring practices into complex organizations where conflicting priorities and definitions of success often coexist. This framework offers a practical template and starting point for health systems seeking to ensure that AI deployments remain safe and effective over time.

AI Summary

Traditional and generative AI systems require unique monitoring considerations for deployment in clinical systems. [3]
Performance monitoring: Evaluates the longitudinal accuracy and quality of AI system outputs to detect drift. [3]
Impact monitoring: Verifies if the AI system produces sustained benefits to patients, health system staff, or health system finances over time. [3]
The framework is applicable to both traditional and generative AI systems and can be tailored to specific use cases and deployments. [3]
Post-deployment AI monitoring is crucial for ensuring the safety and effectiveness of AI systems in healthcare. [2]

When Medical AI Explanations Help and When They Harm

Peking University

Rate paper: 👍 👎 ♥ Save

Abstract
We document a fundamental paradox in AI transparency: explanations improve decisions when algorithms are correct but systematically worsen them when algorithms err. In an experiment with 257 medical students making 3,855 diagnostic decisions, we find explanations increase accuracy by 6.3 percentage points when AI is correct (73% of cases) but decrease it by 4.9 points when incorrect (27% of cases). This asymmetry arises because modern AI systems generate equally persuasive explanations regardless of recommendation quality-physicians cannot distinguish helpful from misleading guidance. We show physicians treat explained AI as 15.2 percentage points more accurate than reality, with over-reliance persisting even for erroneous recommendations. Competent physicians with appropriate uncertainty suffer most from the AI transparency paradox (-12.4pp when AI errs), while overconfident novices benefit most (+9.9pp net). Welfare analysis reveals that selective transparency generates \$2.59 billion in annual healthcare value, 43% more than the \$1.82 billion from mandated universal transparency.

AI Summary

Explanations can improve diagnostic accuracy when AI advice is correct (+6.3 percentage points) but systematically impair accuracy when AI advice is incorrect (-4.9 percentage points). [3]
AI Transparency Paradox: The phenomenon where explanations improve diagnostic accuracy when AI advice is correct but harm accuracy when AI advice is incorrect. [3]
The AI Transparency Paradox highlights the need for careful consideration of how explanations are presented, as they can have both positive and negative effects on decision-making. [3]
Explanations can create false certainty when AI advice is incorrect, which can lead to poor decisions. [3]
The structure of the AI Transparency Paradox is criticaly moderated by the advice format, with probabilistic formats amplifying harm rather than mitigating it. [2]
Explanations can create false certainty precisely when caution is most needed, as they mechanically increase the sum of squared probabilities (SSQ) by shifting probability mass toward the recommended option. [1]

Interests not found

We did not find any papers that match the below interests. Try other terms also consider if the content exists in arxiv.org.

AI for Social Good
AI for Society

You can edit or add more interests any time.

Help us improve your experience!

This project is on its early stages your feedback can be pivotal on the future of the project. Let us know what you think about this week's papers and suggestions!

Give Feedback