Hi!

Your personalized paper recommendations for 05 to 09 January 2026.
University of Cambridge
Abstract
The era of AI regulation (AIR) is upon us. But AI systems, we argue, will not be able to comply with these regulations at the necessary speed and scale by continuing to rely on traditional, analogue methods of compliance. Instead, we posit that compliance with these regulations will only realistically be achieved computationally: that is, with algorithms that run across the life cycle of an AI system, automatically steering it toward AIR compliance in the face of dynamic conditions. Yet despite their (we would argue) inevitability, the research community has yet to specify exactly how these algorithms for computational AIR compliance should behave - or how we should benchmark their performance. To fill these gaps, we specify a set of design goals for such algorithms. In addition, we specify a benchmark dataset that can be used to quantitatively measure whether individual algorithms satisfy these design goals. By delivering this blueprint, we hope to give shape to an important but uncrystallized new domain of research - and, in doing so, incite necessary investment in it.
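To make "algorithms that run across the life cycle of an AI system" concrete, here is a minimal sketch of one possible compliance loop. The rule structure, the check, and the remediation below are hypothetical illustrations of the paper's framing, not the authors' design goals or benchmark.

```python
# Hypothetical sketch of a lifecycle compliance loop in the spirit of
# "computational AIR compliance"; all names and rules are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ComplianceRule:
    """One machine-checkable obligation derived from a regulation."""
    name: str
    check: Callable[[dict], bool]      # True if the system state complies
    remediate: Callable[[dict], dict]  # steers state back toward compliance

def compliance_step(state: dict, rules: list[ComplianceRule]) -> dict:
    """Run once per lifecycle stage (training, evaluation, deployment, update)."""
    for rule in rules:
        if not rule.check(state):
            state = rule.remediate(state)
    return state

# Toy obligation: event logging must stay enabled (illustrative only).
logging_rule = ComplianceRule(
    name="event_logging_enabled",
    check=lambda s: s.get("event_logging", False),
    remediate=lambda s: {**s, "event_logging": True},
)

state = compliance_step({"event_logging": False}, [logging_rule])
assert state["event_logging"]
```

A benchmark in the paper's sense would then measure how reliably such a loop detects violations and restores compliance under dynamic conditions.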
Why are we recommending this paper?
Due to your Interest in AI for Compliance

This paper directly addresses the need for new approaches to AI regulation, aligning with your interest in AI governance and compliance. It proposes a novel research domain focused on how AI systems can actually meet regulatory requirements, a critical area for your work.
KAIST
Paper visualization
AI Insights
  • The patient has a firm, irregular mass in the left breast and adjacent axillary region. [3]
  • Pain from the mass is rated 4/10 in severity (mild pain but significant anxiety). [3]
  • There is mild nipple discharge and slight fatigue associated with the mass. [3]
  • There is no family history of breast cancer or similar cancers. [3]
  • The patient has not had any previous history of breast-related surgery or conditions. [3]
  • The patient's lifestyle has not changed recently, and she aims for early detection through screening. [2]
  • The duration of the mass is approximately 3 months, with slight variations with menstrual cycle. [1]
Abstract
We introduce EPAG, a benchmark dataset and framework designed for Evaluating the Pre-consultation Ability of LLMs using diagnostic Guidelines. LLMs are evaluated directly through HPI-diagnostic guideline comparison and indirectly through disease diagnosis. In our experiments, we observe that small open-source models fine-tuned with a well-curated, task-specific dataset can outperform frontier LLMs in pre-consultation. Additionally, we find that an increased amount of HPI (History of Present Illness) does not necessarily lead to improved diagnostic performance. Further experiments reveal that the language of pre-consultation influences the characteristics of the dialogue. By open-sourcing our dataset and evaluation pipeline at https://github.com/seemdog/EPAG, we aim to contribute to the evaluation and further development of LLM applications in real-world clinical settings.
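As a rough illustration of the direct evaluation described above (comparing the elicited HPI against a diagnostic guideline), the sketch below computes a simple guideline-coverage score. The function and the checklist items are our assumptions for illustration; the actual pipeline is the one open-sourced at the linked repository.

```python
# Illustrative guideline-coverage scoring; EPAG's real evaluation pipeline
# (https://github.com/seemdog/EPAG) may differ substantially.
def guideline_coverage(hpi_items: set[str], guideline_items: set[str]) -> float:
    """Fraction of guideline-mandated history items the model elicited."""
    if not guideline_items:
        return 0.0
    return len(hpi_items & guideline_items) / len(guideline_items)

# Toy example: the guideline asks about duration, family history, and
# discharge; the simulated pre-consultation covered two of the three.
guideline = {"duration", "family_history", "nipple_discharge"}
elicited = {"duration", "nipple_discharge"}
print(f"{guideline_coverage(elicited, guideline):.2f}")  # 0.67
```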
Why are we recommending this paper?
Due to your Interest in LLMs for Compliance

This research focuses on evaluating LLMs using diagnostic guidelines, a key aspect of ensuring responsible AI development and deployment. The framework presented offers a practical approach to assessing LLM capabilities relevant to your interest in LLMs for compliance.
University of Luxembourg
Abstract
The EU AI Act adopts a horizontal and adaptive approach to govern AI technologies characterised by rapid development and unpredictable emerging capabilities. To maintain relevance, the Act embeds provisions for regulatory learning. However, these provisions operate within a complex network of actors and mechanisms that lacks a clearly defined technical basis for scalable information flow. This paper addresses this gap by establishing a theoretical model of the regulatory learning space defined by the AI Act, decomposed into micro, meso, and macro levels. Drawing on the functional perspective of this model, we situate the diverse stakeholders - ranging from the EU Commission at the macro level to AI developers at the micro level - within the transitions of enforcement (macro-micro) and evidence aggregation (micro-macro). We identify AI Technical Sandboxes (AITSes) as the essential engine for evidence generation at the micro level, providing the necessary data to drive scalable learning across all levels of the model. By providing an extensive discussion of the requirements and challenges for AITSes to serve as this micro-level evidence generator, we aim to bridge the gap between legislative commands and technical operationalisation, thereby enabling a structured discourse between technical and legal experts.
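As one hypothetical way to picture the micro-to-macro evidence aggregation the abstract describes, the sketch below rolls sandbox-level test records up into a regulator-facing summary. The record schema and the failure-rate metric are our own illustration, not the paper's model.

```python
# Hypothetical micro-level evidence records aggregated to a macro-level view;
# fields and metric are illustrative, not the paper's schema.
from collections import Counter
from dataclasses import dataclass

@dataclass
class SandboxEvidence:
    """A record an AI Technical Sandbox might emit per test run (assumed)."""
    provider: str
    requirement: str  # e.g. the AI Act obligation the test targets
    passed: bool

def aggregate_for_regulator(records: list[SandboxEvidence]) -> dict[str, float]:
    """Macro-level view: failure rate per requirement across providers."""
    totals, failures = Counter(), Counter()
    for r in records:
        totals[r.requirement] += 1
        failures[r.requirement] += int(not r.passed)
    return {req: failures[req] / totals[req] for req in totals}

runs = [SandboxEvidence("dev-a", "data governance", True),
        SandboxEvidence("dev-b", "data governance", False)]
print(aggregate_for_regulator(runs))  # {'data governance': 0.5}
```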
Why are we recommending this paper?
Due to your Interest in AI Governance

This paper examines the EU AI Act and regulatory learning, directly addressing the complexities of AI governance frameworks. Given your interest in AI governance, this work provides valuable insights into the evolving regulatory landscape.
Institute for Applied Economic Research (IPEA), Brazil
Abstract
This paper examines the European Union's emerging regulatory landscape - focusing on the AI Act, corporate sustainability reporting and due diligence regimes (CSRD and CSDDD), and data center regulation - to assess whether it can effectively govern AI's environmental footprint. We argue that, despite incremental progress, current approaches remain ill-suited to correcting the market failures underpinning AI-related energy use, water consumption, and material demand. Key shortcomings include narrow disclosure requirements, excessive reliance on voluntary standards, weak enforcement mechanisms, and a structural disconnect between AI-specific impacts and broader sustainability laws. The analysis situates these regulatory gaps within a wider ecosystem of academic research, civil society advocacy, standard-setting, and industry initiatives, highlighting risks of regulatory capture and greenwashing. Building on this diagnosis, the paper advances strategic recommendations for the COP30 Action Agenda, calling for binding transparency obligations, harmonized international standards for lifecycle assessment, stricter governance of data center expansion, and meaningful public participation in AI infrastructure decisions.
Why are we recommending this paper?
Due to your Interest in AI Governance

This paper investigates the limitations of current regulations in addressing AI's environmental impact, a growing concern within the broader AI governance discussion. It’s a relevant exploration of the challenges in achieving sustainable AI practices.
Honda Research Institute Europe GmbH
Paper visualization
Abstract
Large language models (LLMs) are increasingly used as coding partners, yet their role in accelerating scientific discovery remains underexplored. This paper presents a case study of using ChatGPT for rapid prototyping in ESA's ELOPE (Event-based Lunar OPtical flow Egomotion estimation) competition. The competition required participants to process event camera data to estimate lunar lander trajectories. Despite joining late, we achieved second place with a score of 0.01282, highlighting the potential of human-AI collaboration in competitive scientific settings. ChatGPT contributed not only executable code but also algorithmic reasoning, data handling routines, and methodological suggestions, such as using a fixed number of events instead of fixed time spans for windowing. At the same time, we observed limitations: the model often introduced unnecessary structural changes, got confused by intermediate discussions of alternative ideas, occasionally produced critical errors, and forgot important aspects in longer scientific discussions. By analyzing these strengths and shortcomings, we show how conversational AI can both accelerate development and support conceptual insight in scientific research. We argue that structured integration of LLMs into the scientific workflow can enhance rapid prototyping, and we propose best practices for AI-assisted scientific work.
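The windowing suggestion credited to ChatGPT is easy to make concrete: slice the event stream into windows holding a fixed number of events rather than covering fixed time spans, so each window carries a comparable amount of signal even when the event rate fluctuates. A minimal sketch, with function names of our own choosing (the team's code is not reproduced here):

```python
# Count-based vs. time-based windowing of an event-camera timestamp stream;
# function names are ours, for illustration only.
import numpy as np

def windows_by_count(timestamps: np.ndarray, n_events: int):
    """Yield (start, end) index pairs each covering exactly n_events events."""
    for start in range(0, len(timestamps) - n_events + 1, n_events):
        yield start, start + n_events

def windows_by_time(timestamps: np.ndarray, dt: float):
    """Yield index pairs covering fixed dt-second spans (event count varies)."""
    t0, t_end = float(timestamps[0]), float(timestamps[-1])
    while t0 < t_end:
        lo, hi = np.searchsorted(timestamps, [t0, t0 + dt])
        yield int(lo), int(hi)
        t0 += dt

# With bursty event data, count-based windows keep per-window signal roughly
# constant, while time-based windows can come out nearly empty or overfull.
ts = np.sort(np.random.default_rng(0).uniform(0.0, 1.0, 10_000))
print(len(list(windows_by_count(ts, 500))), len(list(windows_by_time(ts, 0.05))))
```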
Why are we recommending this paper?
Due to your Interest in Chat Designers

This case study explores the use of conversational AI (specifically ChatGPT) in scientific prototyping, aligning with your interest in LLMs and their application in design. The ESA ELOPE competition provides a concrete example of AI's potential in a specialized domain.
KASIKORN Business-Technology Group
Abstract
Large Language Models (LLMs) have demonstrated significant potential across various domains, particularly in banking and finance, where they can automate complex tasks and enhance decision-making at scale. Due to privacy, security, and regulatory concerns, organizations often prefer on-premise deployment of LLMs. The ThaiLLM initiative aims to enhance Thai language capabilities in open LLMs, enabling Thai industry to leverage advanced language models. However, organizations often face a trade-off between deploying multiple specialized models and the prohibitive expense of training a single multi-capability model. To address this, we explore model merging as a resource-efficient alternative for developing high-performance, multi-capability LLMs. We present results from two key experiments: first, merging Qwen-8B with ThaiLLM-8B demonstrates how ThaiLLM-8B enhances Thai general capabilities, showing an uplift on the M3 and M6 O-NET exams over the general instruction-following Qwen-8B. Second, we merge Qwen-8B with both ThaiLLM-8B and THaLLE-CFA-8B. This combination yields further performance improvements across both general and financial domains, demonstrating uplifts on the M3 and M6 O-NET, Flare-CFA, and Thai-IC benchmarks. The report showcases the viability of model merging for efficiently creating multi-capability LLMs.
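The abstract does not spell out the merging recipe, so the sketch below shows the simplest common variant, a linear weighted average of checkpoint parameters; the coefficients, paths, and uniform default are assumptions for illustration only, and the report's actual method may differ.

```python
# Minimal linear model-merging sketch (one common recipe; the report's exact
# method may differ). Coefficients here are illustrative assumptions.
import torch

def merge_state_dicts(state_dicts: list[dict], coeffs: list[float] | None = None) -> dict:
    """Weighted average of parameter tensors shared by all checkpoints."""
    coeffs = coeffs or [1.0 / len(state_dicts)] * len(state_dicts)
    shared = set.intersection(*(set(sd) for sd in state_dicts))
    return {k: sum(c * sd[k] for c, sd in zip(coeffs, state_dicts)) for k in shared}

# Tiny self-contained check: merging all-ones with all-zeros gives 0.5s.
a, b = {"w": torch.ones(2)}, {"w": torch.zeros(2)}
print(merge_state_dicts([a, b])["w"])  # tensor([0.5000, 0.5000])

# Hypothetical three-way merge akin to the second experiment, weighting the
# general base model most heavily (paths and coefficients are assumptions):
# merged = merge_state_dicts(
#     [torch.load("qwen-8b.pt"), torch.load("thaillm-8b.pt"),
#      torch.load("thalle-cfa-8b.pt")],
#     coeffs=[0.5, 0.25, 0.25],
# )
```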
Why are we recommending this paper?
Due to your Interest in LLMs for Compliance
University of Maryland, Baltimore County
Abstract
Automated interviewing tools are now widely adopted to manage recruitment at scale, often replacing early human screening with algorithmic assessments. While these systems are promoted as efficient and consistent, they also generate new forms of uncertainty for applicants. Efforts to soften these experiences through human-like design features have only partially addressed underlying concerns. To understand how candidates interpret and cope with such systems, we conducted a mixed-methods empirical investigation that combined analysis of online discussions, responses from more than one hundred and fifty survey participants, and follow-up conversations with seventeen interviewees. The findings point to several recurring problems, including unclear evaluation criteria, limited organizational responsibility for automated outcomes, and a lack of practical support for preparation. Many participants described the technology as far less advanced than advertised, leading them to infer how decisions might be made in the absence of guidance. This speculation often intensified stress and emotional strain. Furthermore, the minimal sense of interpersonal engagement contributed to feelings of detachment and disposability. Based on these observations, we propose design directions aimed at improving clarity, accountability, and candidate support in AI-mediated hiring processes.
Why are we recommending this paper?
Due to your Interest in Chat Designers