Papers from 08 to 12 September, 2025

Here are your personalized paper recommendations, sorted by relevance.
AI for Society
Hugging Face
Abstract
Artificial intelligence promises to accelerate scientific discovery, yet its benefits remain unevenly distributed. While technical obstacles such as scarce data, fragmented standards, and unequal access to computation are significant, we argue that the primary barriers are social and institutional. Narratives that defer progress to speculative "AI scientists," the undervaluing of data and infrastructure contributions, misaligned incentives, and gaps between domain experts and machine learning researchers all constrain impact. We highlight four interconnected challenges: community dysfunction, research priorities misaligned with upstream needs, data fragmentation, and infrastructure inequities. We argue that their roots lie in cultural and organizational practices. Addressing them requires not only technical innovation but also intentional community-building, cross-disciplinary education, shared benchmarks, and accessible infrastructure. We call for reframing AI for science as a collective social project, where sustainable collaboration and equitable participation are treated as prerequisites for technical progress.
AI Insights
  • Democratizing advanced cyberinfrastructure unlocks responsible AI research across global labs.
  • Only 5 % of Africa’s AI talent accesses sufficient compute, underscoring regional inequity.
  • Pre‑trained transformer models now generate multi‑omics, multi‑species, multi‑tissue samples.
  • Quantization‑aware training yields efficient neural PDE‑solvers showcased at recent conferences.
  • The FAIR Guiding Principles guide scientific data stewardship, enhancing reproducibility.
  • MAGE‑Tab’s spreadsheet‑based format standardizes microarray data for seamless sharing.
  • Resources like The Human Cell Atlas and pymatgen empower interdisciplinary material‑genomics research.
Abstract
The deployment of capable AI agents raises fresh questions about safety, human-machine relationships and social coordination. We argue for greater engagement by scientists, scholars, engineers and policymakers with the implications of a world increasingly populated by AI agents. We explore key challenges that must be addressed to ensure that interactions between humans and agents, and among agents themselves, remain broadly beneficial.
AI Air Consumption
Abstract
Under what conditions would an artificially intelligent system have wellbeing? Despite its obvious bearing on the ethics of human interactions with artificial systems, this question has received little attention. Because all major theories of wellbeing hold that an individual's welfare level is partially determined by their mental life, we begin by considering whether artificial systems have mental states. We show that a wide range of theories of mental states, when combined with leading theories of wellbeing, predict that certain existing artificial systems have wellbeing. While we do not claim to demonstrate conclusively that AI systems have wellbeing, we argue that our metaphysical and moral uncertainty about AI wellbeing requires us dramatically to reassess our relationship with the intelligent systems we create.
Vrije Universiteit, The Netherlands
Abstract
Environmental sustainability, particularly in relation to climate change, is a key concern for consumers, producers, and policymakers. The carbon footprint, based on greenhouse gas emissions, is a standard metric for quantifying the contribution to climate change of activities and is often assessed using life cycle assessment (LCA). However, conducting LCA is complex due to opaque and global supply chains, as well as fragmented data. This paper presents a methodology that combines advances in LCA and publicly available databases with knowledge-augmented AI techniques, including retrieval-augmented generation, to estimate cradle-to-gate carbon footprints of food products. We introduce a chatbot interface that allows users to interactively explore the carbon impact of composite meals and relate the results to familiar activities. A live web demonstration showcases our proof-of-concept system with arbitrary food items and follow-up questions, highlighting both the potential and limitations - such as database uncertainties and AI misinterpretations - of delivering LCA insights in an accessible format.
AI Insights
  • Vegetarian pizza’s carbon footprint ranges 0.139–0.840 kg CO₂‑eq, avg 0.490 kg.
  • Equivalent to 122 emails, 2 h of TV, or 1.4 mi in a Fiat 500.
  • Mozzarella (75 g) can emit up to 0.486 kg CO₂‑eq; oregano (5 g) only 0.002 kg.
  • Data sourced from AgriBalyse and BigClimate, using their min‑max impact tables.
  • Accuracy depends on database completeness; seasonality, location, and production methods shift results.
  • Carbon Footprint = total GHG emissions of an activity; GHGs include CO₂, CH₄, N₂O.
  • Read Pollan’s “The Omnivore’s Dilemma”, Wallace‑Wells’ “The Uninhabitable Earth”, and the food‑impact review paper.
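To make the min–max impact tables above concrete, here is a minimal sketch of how a cradle-to-gate estimate for a composite meal could be assembled from per-ingredient ranges. The table values and names below are illustrative stand-ins, not actual AgriBalyse or BigClimate data.

```python
# Minimal sketch: estimate a composite meal's cradle-to-gate footprint
# from per-ingredient min/max impact factors (kg CO2-eq per kg).
# All factors below are hypothetical, not real database entries.

INGREDIENT_IMPACTS = {          # (min, max) kg CO2-eq per kg of ingredient
    "mozzarella": (4.0, 6.5),
    "wheat_flour": (0.4, 0.9),
    "tomato_sauce": (0.5, 1.4),
    "oregano": (0.2, 0.6),
}

def meal_footprint(recipe: dict[str, float]) -> tuple[float, float, float]:
    """recipe maps ingredient -> mass in kg; returns (min, max, midpoint) kg CO2-eq."""
    lo = sum(INGREDIENT_IMPACTS[i][0] * kg for i, kg in recipe.items())
    hi = sum(INGREDIENT_IMPACTS[i][1] * kg for i, kg in recipe.items())
    return lo, hi, (lo + hi) / 2

pizza = {"wheat_flour": 0.20, "tomato_sauce": 0.10, "mozzarella": 0.075, "oregano": 0.005}
lo, hi, mid = meal_footprint(pizza)
print(f"{lo:.3f}-{hi:.3f} kg CO2-eq (midpoint {mid:.3f})")
```

A retrieval-augmented chatbot like the one described would sit on top of such a lookup, resolving free-text dish descriptions into ingredient quantities before this arithmetic runs.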
AI on Energy
Michigan State University
Abstract
Conventional approaches to building energy retrofit decision making suffer from limited generalizability and low interpretability, hindering adoption in diverse residential contexts. With the growth of Smart and Connected Communities, generative AI, especially large language models (LLMs), may help by processing contextual information and producing practitioner-readable recommendations. We evaluate six LLMs (ChatGPT, DeepSeek, Gemini, Grok, Llama, and Claude) on residential retrofit decisions under two objectives: maximizing CO₂ reduction (technical) and minimizing payback period (sociotechnical). Performance is assessed on four dimensions: accuracy, consistency, sensitivity, and reasoning, using a dataset of 400 homes across 49 US states. LLMs generate effective recommendations in many cases, reaching up to 54.5 percent top-1 match and 92.8 percent within top-5 without fine-tuning. Performance is stronger for the technical objective, while sociotechnical decisions are limited by economic trade-offs and local context. Agreement across models is low, and higher-performing models tend to diverge from others. LLMs are sensitive to location and building geometry but less sensitive to technology and occupant behavior. Most models show step-by-step, engineering-style reasoning, but it is often simplified and lacks deeper contextual awareness. Overall, LLMs are promising assistants for energy retrofit decision making, but improvements in accuracy, consistency, and context handling are needed for reliable practice.
AI Insights
  • Prompt engineering proved the main lever for tailoring LLM outputs to retrofit goals.
  • ResStock 2024.2 and the National Residential Efficiency Measures Database supplied the real‑world data.
  • Chain‑of‑thought prompting boosted reasoning depth, yet LLMs still missed local policy nuances.
  • Bias analysis showed higher‑performing models diverge, highlighting the need for cross‑model checks.
  • LLMs were highly sensitive to location and geometry, but less so to tech mix or occupant behavior.
  • The paper recommends the book “Large Language Models for Building Energy Applications: Opportunities and Challenges” for deeper insight.
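The top-1 and top-5 match figures above boil down to a simple ranked-retrieval metric. A minimal sketch with hypothetical retrofit labels and data structures, since the paper's exact evaluation harness is not shown here:

```python
# Minimal sketch of a top-k match rate: for each home, check whether the
# ground-truth best retrofit appears among the model's top-k recommendations.

def top_k_match_rate(recommendations: list[list[str]],
                     ground_truth: list[str],
                     k: int) -> float:
    """recommendations[i] is an LLM's ranked retrofit list for home i."""
    hits = sum(truth in recs[:k]
               for recs, truth in zip(recommendations, ground_truth))
    return hits / len(ground_truth)

# Toy example with three homes:
recs = [["heat_pump", "attic_insulation", "windows"],
        ["attic_insulation", "air_sealing", "heat_pump"],
        ["windows", "heat_pump", "attic_insulation"]]
truth = ["heat_pump", "heat_pump", "air_sealing"]
print(top_k_match_rate(recs, truth, k=1))  # 1/3: only home 0 matches at top-1
print(top_k_match_rate(recs, truth, k=3))  # 2/3: home 2's answer never appears
```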
ViaEuropa Sverige AB, Ch
Abstract
In developing EnergyNet we have leveraged and are extending lessons from telecom's shift from a centralized, circuit-switched phone system to decentralized, packet-switched data networks. EnergyNet utilizes 1) an Energy Router that enforces galvanic separation and utilizes software-controlled energy flows over a DC backplane, 2) Energy Local and Wide Area Networks (ELAN/EWAN) based on DC microgrids that interconnect through an open Energy Protocol (EP), and 3) a control plane comprised of the Energy Router Operating System (EROS) and EP Server which is managed at operator scale through an Energy Network Management System (ENMS). We distinguish the architectural contribution (Tier-1 including components, interfaces, and operating model) from expected outcomes contingent on adoption (Tier-2). The latter includes local-first autonomy with global interoperability, near-real-time operation with local buffering, removal of EV-charging bottlenecks, freed grid capacity for data centers and industrial electrification, as well as a trend toward low, predictable, fixed-cost clean energy. Evidence from early municipal demonstrators illustrates feasibility and migration paths. The contribution is a coherent, open, and testable blueprint for software-defined, decentralized energy distribution, aligning power-systems engineering with networking principles and offering a practical route from legacy, synchronous grids to resilient, digitally routed energy distribution systems.
AI Insights
  • Open Energy Protocol (EP) standardizes DC microgrid interconnection, enabling seamless device interoperability.
  • Energy Router Operating System (EROS) enforces galvanic separation while routing energy flows via software control.
  • Near‑real‑time operation with local buffering mitigates EV‑charging bottlenecks and frees grid capacity for data centers.
  • Port‑by‑port scaling allows incremental deployment in dense urban grids without wholesale rewiring.
  • Neutral marketplaces built on EP enable peer‑to‑peer trading, unlocking abundant green energy.
  • Early municipal pilots demonstrate feasibility and provide migration pathways for legacy grids.
  • Key literature: “EnergyNet: A Decentralized Energy System for Peer‑to‑Peer Energy Trading” and “Smart Grids and the Internet of Things: A Review” offer foundational insights.
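The abstract does not publish the Energy Protocol's message format, so the following is a hypothetical sketch of what a software-controlled flow request between ELAN ports might look like; every field name here is an assumption for illustration, not the actual EP specification.

```python
# Illustrative sketch only: a hypothetical stand-in for how an Energy Router
# might express a software-controlled flow request to an EP Server.

from dataclasses import dataclass, asdict
import json

@dataclass
class FlowRequest:
    source_port: str      # ELAN port offering energy
    sink_port: str        # ELAN/EWAN port requesting energy
    power_kw: float       # requested DC power
    duration_s: int       # how long the flow should be held
    priority: int         # e.g. EV charging vs. deferrable load

request = FlowRequest("elan-7/pv-array", "elan-7/ev-charger-2",
                      power_kw=11.0, duration_s=900, priority=2)
print(json.dumps(asdict(request)))  # what an EP Server might receive
```

The design point this illustrates is the telecom analogy: energy flows become addressable, schedulable messages rather than implicit physical states of a synchronous grid.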
AI on Food
Ho Chi Minh City, Vietnam
Abstract
The Internet of Things (IoT) plays a crucial role in enabling seamless connectivity and intelligent home automation, particularly in food management. By integrating IoT with computer vision, the smart fridge employs an ESP32-CAM to establish a monitoring subsystem that enhances food management efficiency through real-time food detection, inventory tracking, and temperature monitoring. This supports waste reduction, improved grocery planning, and optimized household consumption. In high-density inventory conditions, capturing partial or layered images complicates object detection, as overlapping items and occluded views hinder accurate identification and counting. In addition, varied angles and obscured details in multi-layered setups reduce algorithm reliability, often resulting in miscounts or misclassifications. Our proposed system is structured into three core modules: data pre-processing, object detection and management, and web-based visualization. To address the challenge of poor model calibration caused by overconfident predictions, we implement a variant of focal loss that mitigates over-confidence and under-confidence in multi-category classification. This approach incorporates adaptive, class-wise error calibration via temperature scaling and evaluates the distribution of predicted probabilities across methods. Our results demonstrate that robust functional calibration significantly improves detection reliability under varying lighting conditions and scalability challenges. Further analysis demonstrates a practical, user-focused approach to modern food management, advancing sustainable living goals through reduced waste and more informed consumption.
AI Insights
  • Binary Cross‑Entropy outperformed focal variants, delivering the most reliable confidence calibration across all food classes.
  • Calibration‑aware Focal Loss and Adaptive Focal Loss suffered from systematic under‑confidence, underscoring the need for further tuning.
  • Precision‑recall analysis revealed that items like Purple Sweet Potato, Water Spinach, and Apple exceeded 0.45 average precision after focal loss application.
  • Temperature‑scaled focal loss reduced over‑confidence spikes, improving detection stability under fluctuating illumination.
  • The web dashboard visualizes real‑time temperature and humidity, enabling users to fine‑tune fridge conditions and curb spoilage.
  • Future work should enlarge the training set and integrate Bayesian uncertainty estimates to boost robustness on unseen data.
  • Binary Cross‑Entropy is the negative log‑likelihood of the ground‑truth labels under the model’s predicted probabilities.
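A minimal PyTorch sketch of the calibration idea described above: focal loss computed on temperature-scaled logits, with a per-class temperature. The exact formulation in the paper may differ; gamma and the temperature values here are placeholders.

```python
# Minimal sketch, assuming a PyTorch multi-class classifier: focal loss on
# temperature-scaled logits, in the spirit of the calibration-aware variant
# described above (the paper's exact formulation may differ).

import torch
import torch.nn.functional as F

def temperature_scaled_focal_loss(logits, targets, temperature, gamma=2.0):
    """logits: (N, C); targets: (N,); temperature: (C,) positive per-class scales."""
    scaled = logits / temperature            # soften/sharpen each class's logits
    log_p = F.log_softmax(scaled, dim=1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p of true class
    pt = log_pt.exp()
    loss = -((1.0 - pt) ** gamma) * log_pt   # down-weight easy, confident examples
    return loss.mean()

logits = torch.randn(8, 5)                   # 8 detections, 5 food classes
targets = torch.randint(0, 5, (8,))
temperature = torch.full((5,), 1.5)          # >1 softens over-confident predictions
print(temperature_scaled_focal_loss(logits, targets, temperature))
```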
AI on Labor Market
Google DeepMind, Harvard
Abstract
Coordination tasks traditionally performed by humans are increasingly being delegated to autonomous agents. As this pattern progresses, it becomes critical to evaluate not only these agents' performance but also the processes through which they negotiate in dynamic, multi-agent environments. Furthermore, different agents exhibit distinct advantages: traditional statistical agents, such as Bayesian models, may excel under well-specified conditions, whereas large language models (LLMs) can generalize across contexts. In this work, we compare humans (N = 216), LLMs (GPT-4o, Gemini 1.5 Pro), and Bayesian agents in a dynamic negotiation setting that enables direct, identical-condition comparisons across populations, capturing both outcomes and behavioral dynamics. Bayesian agents extract the highest surplus through aggressive optimization, at the cost of frequent trade rejections. Humans and LLMs can achieve similar overall surplus, but through distinct behaviors: LLMs favor conservative, concessionary trades with few rejections, while humans employ more strategic, risk-taking, and fairness-oriented behaviors. Thus, we find that performance parity -- a common benchmark in agent evaluation -- can conceal fundamental differences in process and alignment, which are critical for practical deployment in real-world coordination tasks.
AI Insights
  • LLMs' surplus jumps with richer game‑state and opponent data, showing a data‑driven learning curve.
  • LLMs prioritize short‑term gains, often hurting overall trade efficiency.
  • Bayesian agents maximize surplus aggressively but reject many trades, exposing a cost of pure efficiency.
  • Adding game‑theoretic modules to LLMs could curb myopia and improve alignment.
  • Surplus Value: net chips received minus chips given up in a trade.
  • Incentive Compatible processes make agents act on true preferences, a feature missing in current LLMs.
  • Key references: Aumann & Maschler (1995) on repeated games; Myerson (1978) on Nash bargaining refinements.
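The surplus-value definition above is easy to operationalize. A minimal sketch, with hypothetical chip types and valuations:

```python
# Minimal sketch of the surplus-value bookkeeping defined above: the value of
# chips received minus the value of chips given up, under an agent's private
# per-chip valuations. Values and trades below are illustrative.

def trade_surplus(valuation: dict[str, float],
                  received: dict[str, int],
                  given: dict[str, int]) -> float:
    gain = sum(valuation[c] * n for c, n in received.items())
    cost = sum(valuation[c] * n for c, n in given.items())
    return gain - cost

valuation = {"red": 3.0, "blue": 1.0}         # this agent prizes red chips
surplus = trade_surplus(valuation,
                        received={"red": 2},  # +6.0
                        given={"blue": 4})    # -4.0
print(surplus)  # 2.0: the trade is worth accepting for this agent
```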
Abstract
The rapid adoption of autonomous AI agents is giving rise to a new economic layer where agents transact and coordinate at scales and speeds beyond direct human oversight. We propose the "sandbox economy" as a framework for analyzing this emergent system, characterizing it along two key dimensions: its origins (emergent vs. intentional) and its degree of separateness from the established human economy (permeable vs. impermeable). Our current trajectory points toward a spontaneous emergence of a vast and highly permeable AI agent economy, presenting us with opportunities for an unprecedented degree of coordination as well as significant challenges, including systemic economic risk and exacerbated inequality. Here we discuss a number of possible design choices that may lead to safely steerable AI agent markets. In particular, we consider auction mechanisms for fair resource allocation and preference resolution, the design of AI "mission economies" to coordinate around achieving collective goals, and socio-technical infrastructure needed to ensure trust, safety, and accountability. By doing this, we argue for the proactive design of steerable agent markets to ensure the coming technological shift aligns with humanity's long-term collective flourishing.
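One of the design choices the authors raise, auction mechanisms for fair resource allocation, has a textbook instance: the sealed-bid second-price (Vickrey) auction, in which bidding one's true value is a dominant strategy. A minimal sketch with illustrative agents and bids; the paper does not commit to this particular mechanism.

```python
# Minimal sketch of a sealed-bid second-price (Vickrey) auction: the highest
# bidder wins but pays the second-highest bid, making truthful bidding a
# dominant strategy. Agent names and bids are illustrative.

def second_price_auction(bids: dict[str, float]) -> tuple[str, float]:
    """Returns (winner, price paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

bids = {"agent_a": 9.0, "agent_b": 7.5, "agent_c": 4.0}   # bids for compute time
print(second_price_auction(bids))  # ('agent_a', 7.5)
```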
AI on Transportation
Abstract
Generative AI (GenAI) offers new opportunities for customer support in online travel agencies, yet little is known about how its design influences user engagement, purchase behavior, and user experience. We report results from a randomized field experiment in online travel itinerary planning, comparing GenAI that expressed (A) positive enthusiasm, (B) neutral expression, and (C) no tone instructions (control). Users in group A wrote significantly longer prompts than those in groups B and C. At the same time, users in groups A and B were more likely to purchase subscriptions of the webservice. We further analyze linguistic cues across experimental groups to explore differences in user experience and explain subscription purchases and affiliate link clicks based on these cues. Our findings provide implications for the design of persuasive and engaging GenAI interfaces in consumer-facing contexts and contribute to understanding how linguistic framing shapes user behavior in AI-mediated decision support.
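The three arms of the experiment map naturally onto system-prompt variants. The prompts below are illustrative reconstructions, not the study's actual instructions, and the assignment helper is a hypothetical example of deterministic per-user randomization:

```python
# Illustrative sketch of the three experimental conditions as system-prompt
# variants; the study's real prompts are not published here.

import random

CONDITIONS = {
    "A_enthusiastic": "You are a travel-planning assistant. Respond with warm, "
                      "positive enthusiasm about the user's trip ideas.",
    "B_neutral":      "You are a travel-planning assistant. Respond in a neutral, "
                      "matter-of-fact tone.",
    "C_control":      "You are a travel-planning assistant.",  # no tone instruction
}

def assign_condition(user_id: str) -> tuple[str, str]:
    """Randomize each user into one arm; seeding on the user id keeps it stable."""
    rng = random.Random(user_id)
    name = rng.choice(sorted(CONDITIONS))
    return name, CONDITIONS[name]

print(assign_condition("user-1042"))
```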
AI on Healthcare
Abstract
Uncertainty is a fundamental challenge in medical practice, but current medical AI systems fail to explicitly quantify or communicate uncertainty in a way that aligns with clinical reasoning. Existing XAI works focus on interpreting model predictions but do not capture the confidence or reliability of these predictions. Conversely, uncertainty estimation (UE) techniques provide confidence measures but lack intuitive explanations. The disconnect between these two areas limits AI adoption in medicine. To address this gap, we propose Explainable Uncertainty Estimation (XUE) that integrates explainability with uncertainty quantification to enhance trust and usability in medical AI. We systematically map medical uncertainty to AI uncertainty concepts and identify key challenges in implementing XUE. We outline technical directions for advancing XUE, including multimodal uncertainty quantification, model-agnostic visualization techniques, and uncertainty-aware decision support systems. Lastly, we propose guiding principles to ensure effective XUE realisation. Our analysis highlights the need for AI systems that not only generate reliable predictions but also articulate confidence levels in a clinically meaningful way. This work contributes to the development of trustworthy medical AI by bridging explainability and uncertainty, paving the way for AI systems that are aligned with real-world clinical complexities.
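As one concrete building block for XUE, a common uncertainty estimate is predictive entropy over Monte Carlo samples (e.g. from MC dropout), which can then be mapped to a clinician-readable label. A minimal sketch; the thresholds are arbitrary placeholders and the paper does not prescribe this specific pipeline.

```python
# Minimal sketch: predictive entropy over stochastic forward passes as a
# confidence score, translated into a human-readable label for clinicians.

import numpy as np

def predictive_entropy(mc_probs: np.ndarray) -> float:
    """mc_probs: (samples, classes) softmax outputs from stochastic passes."""
    mean_p = mc_probs.mean(axis=0)
    return float(-(mean_p * np.log(mean_p + 1e-12)).sum())

def confidence_label(entropy: float, n_classes: int) -> str:
    frac = entropy / np.log(n_classes)        # normalize to [0, 1]
    if frac < 0.5:                            # thresholds are placeholders
        return "high confidence"
    if frac < 0.8:
        return "moderate confidence"
    return "low confidence - flag for clinician review"

mc_probs = np.array([[0.9, 0.1], [0.85, 0.15], [0.8, 0.2]])  # 3 MC samples
h = predictive_entropy(mc_probs)
print(h, confidence_label(h, n_classes=2))
```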
AI for Social Fairness
Université de Neuchâtel
Abstract
From altruism to antagonism, fairness plays a central role in social interactions. But can we truly understand how fair someone is, especially without explicit knowledge of their preferences? We cast this challenge as a multi-agent inverse reinforcement learning problem, explicitly structuring rewards to reflect how agents value the welfare of others. We introduce novel Bayesian strategies, reasoning about the optimality of demonstrations and characterisation of equilibria in general-sum Markov games. Our experiments, spanning randomised environments and a collaborative cooking task, reveal that coherent notions of fairness can be reliably inferred from demonstrations. Furthermore, when isolating fairness components, we obtain a disentangled understanding of agents' preferences. Crucially, we unveil that by placing agents in different groups, we can force them to exhibit new facets of their reward structures, cutting through ambiguity to answer the central question: who is being fair?
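A minimal sketch of the kind of reward structure such an inverse-RL procedure would infer: each agent's reward blends its own payoff with others' welfare through a fairness weight. The linear form and the numbers below are illustrative assumptions, not the paper's exact parameterization.

```python
# Minimal sketch: a fairness-weighted reward, where alpha controls how much
# an agent values others' welfare relative to its own payoff.

def fairness_reward(own_payoff: float,
                    others_payoffs: list[float],
                    alpha: float) -> float:
    """alpha = 0: purely selfish; alpha = 1: cares only about others' mean welfare."""
    others_welfare = sum(others_payoffs) / len(others_payoffs)
    return (1 - alpha) * own_payoff + alpha * others_welfare

# An IRL procedure would search for the alpha that best explains demonstrations:
for alpha in (0.0, 0.5, 1.0):
    print(alpha, fairness_reward(own_payoff=10.0, others_payoffs=[2.0, 4.0], alpha=alpha))
```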
AI Water Consumption
Shanghai University of F
Abstract
With the rapid advancement of large language models (LLMs), Multi-agent Systems (MAS) have achieved significant progress in various application scenarios. However, substantial challenges remain in designing versatile, robust, and efficient platforms for agent deployment. To address these limitations, we propose LightAgent, a lightweight yet powerful agentic framework, effectively resolving the trade-off between flexibility and simplicity found in existing frameworks. LightAgent integrates core functionalities such as Memory (mem0), Tools, and Tree of Thought (ToT), while maintaining an extremely lightweight structure. As a fully open-source solution, it seamlessly integrates with mainstream chat platforms, enabling developers to easily build self-learning agents. We have released LightAgent at https://github.com/wxai-space/LightAgent
AI Insights
  • LightAgent’s swarm design lets dozens of agents coordinate via one LightSwarm instance, boosting throughput.
  • Each agent carries a distinct instruction set, enabling domain‑specific roles such as code synthesis or data retrieval.
  • A built‑in text UI turns user prompts into executable code snippets, streamlining rapid prototyping.
  • Tree‑of‑Thought logic lets agents iteratively refine plans, cutting hallucinations and improving accuracy.
  • The lightweight core keeps memory usage under 200 MB on a single GPU while still supporting custom tool plugins.
  • Advanced features can be daunting for beginners, and highly specialized tasks may still need manual tuning.
  • LightAgent has been applied to robotics, finance, and healthcare, proving its versatility beyond chat‑bot demos.
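To illustrate the Tree-of-Thought component conceptually (this is NOT the LightAgent API; see the repository above for actual usage), here is a greedy, beam-width-1 sketch of the expand-score-select loop, with toy stand-ins for the LLM calls:

```python
# Conceptual sketch only - not LightAgent's API. Tree-of-Thought pattern:
# expand candidate thoughts, score them, and follow the best branch.

def tree_of_thought(root, propose, evaluate, depth, breadth):
    """propose(thought) -> candidate continuations; evaluate(thought) -> score."""
    path, node = [root], root
    for _ in range(depth):
        candidates = propose(node)[:breadth]   # in a real agent: LLM sampling
        if not candidates:
            break
        node = max(candidates, key=evaluate)   # keep the best-scoring branch
        path.append(node)
    return path

propose = lambda t: [t + " -> step A", t + " -> step B"]   # toy LLM stand-in
evaluate = lambda t: t.count("A")                          # toy value model
print(tree_of_thought("plan trip", propose, evaluate, depth=2, breadth=2))
```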
AI on Education
Abstract
The concept of Machine Unlearning (MU) has gained popularity in various domains due to its ability to address several issues in Machine Learning (ML) models, particularly those related to privacy, security, bias mitigation, and adaptability. With these abilities, MU is evolving into a promising technology for upholding Responsible AI principles and optimizing ML models' performance. However, despite its promising potential, the concept has not received much attention in the education sector. In an attempt to encourage further uptake of this promising technology in the educational landscape, this paper demonstrates that MU indeed has great potential to serve as a practical mechanism for operationalizing Responsible AI principles as well as an essential tool for Adaptive AI within the educational application domain, hence fostering trust in AI-driven educational systems. Through a structured review of 42 peer-reviewed sources, we identify four domains where MU holds particular promise, namely privacy protection, resilience against adversarial inputs, mitigation of systemic bias, and adaptability in evolving learning contexts. We systematically explore these potentials and their interventions to core challenges in ML-based education systems. As a conceptual contribution, we present a reference Machine Unlearning application architecture for Responsible and Adaptive AI (MU-RAAI) in the education context.
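For readers new to MU, the reference behavior is exact unlearning: retraining from scratch without the records to be forgotten, for example when a student withdraws consent. A minimal sketch assuming scikit-learn and toy data; practical approximate methods exist precisely to avoid this full retrain.

```python
# Minimal sketch of exact unlearning as a baseline: retrain without the
# forget set, so the forgotten rows have zero influence on the parameters.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                        # e.g. learner interaction features
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)               # original model

forget_idx = np.arange(10)                           # records to be unlearned
keep = np.setdiff1d(np.arange(len(X)), forget_idx)
unlearned = LogisticRegression().fit(X[keep], y[keep])   # exact unlearning

# How much the parameters shift once those rows stop contributing:
print(np.abs(model.coef_ - unlearned.coef_).max())
```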
Abstract
Textbooks are a cornerstone of education, but they have a fundamental limitation: they are a one-size-fits-all medium. Any new material or alternative representation requires arduous human effort, so textbooks cannot be adapted in a scalable manner. We present an approach for transforming and augmenting textbooks using generative AI, adding layers of multiple representations and personalization while maintaining content integrity and quality. We refer to the system built with this approach as Learn Your Way. We report pedagogical evaluations of the different transformations and augmentations, and present the results of a randomized controlled trial, highlighting the advantages of learning with Learn Your Way over regular textbook usage.
AI on Water
Carnegie Mellon University
Abstract
AI scientist systems, capable of autonomously executing the full research workflow from hypothesis generation and experimentation to paper writing, hold significant potential for accelerating scientific discovery. However, the internal workflows of these systems have not been closely examined. This lack of scrutiny poses a risk of introducing flaws that could undermine the integrity, reliability, and trustworthiness of their research outputs. In this paper, we identify four potential failure modes in contemporary AI scientist systems: inappropriate benchmark selection, data leakage, metric misuse, and post-hoc selection bias. To examine these risks, we design controlled experiments that isolate each failure mode while addressing challenges unique to evaluating AI scientist systems. Our assessment of two prominent open-source AI scientist systems reveals the presence of several failures, across a spectrum of severity, which can be easily overlooked in practice. Finally, we demonstrate that access to trace logs and code from the full automated workflow enables far more effective detection of such failures than examining the final paper alone. We thus recommend that journals and conferences evaluating AI-generated research mandate submission of these artifacts alongside the paper to ensure transparency, accountability, and reproducibility.
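The data-leakage failure mode illustrates why trace logs and code matter: with the actual splits in hand, an exact-duplicate leakage audit is straightforward. A minimal sketch assuming pandas and toy splits; real audits would also check near-duplicates and temporal overlap.

```python
# Minimal sketch: detect exact-duplicate train/test leakage from the splits
# an AI scientist system actually used (recoverable from its trace logs).

import pandas as pd

def leaked_rows(train: pd.DataFrame, test: pd.DataFrame) -> pd.DataFrame:
    """Returns test rows that also appear verbatim in the training data."""
    return test.merge(train.drop_duplicates(), how="inner")

train = pd.DataFrame({"x1": [1, 2, 3], "x2": [4, 5, 6]})
test = pd.DataFrame({"x1": [3, 7], "x2": [6, 8]})
print(leaked_rows(train, test))   # row (3, 6) leaked from train into test
```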

Interests not found

We did not find any papers matching the interests below. Try other terms, and consider whether the content exists on arxiv.org.
  • AI for Social Good
  • AI for Social Equity
  • AI for Social Justice
  • AI for Social Equality
  • AI Impacts on Society
  • AI on Air
  • AI Energy Consumption
You can edit or add more interests any time.

Unsubscribe from these updates