Papers from 22 to 26 September, 2025

Here are the personalized paper recommendations, sorted by relevance.
Human in the Loop
Abstract
Human feedback can greatly accelerate robot learning, but in real-world settings, such feedback is costly and limited. Existing human-in-the-loop reinforcement learning (HiL-RL) methods often assume abundant feedback, limiting their practicality for physical robot deployment. In this work, we introduce SPARQ, a progress-aware query policy that requests feedback only when learning stagnates or worsens, thereby reducing unnecessary oracle calls. We evaluate SPARQ on a simulated UR5 cube-picking task in PyBullet, comparing against three baselines: no feedback, random querying, and always querying. Our experiments show that SPARQ achieves near-perfect task success, matching the performance of always querying while consuming about half the feedback budget. It also provides more stable and efficient learning than random querying, and significantly improves over training without feedback. These findings suggest that selective, progress-based query strategies can make HiL-RL more efficient and scalable for robots operating under realistic human effort constraints.
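The core mechanism here is the query gate: the learner asks the human oracle only when a running progress estimate stagnates or degrades. Below is a minimal, hypothetical sketch of such a progress-gated query rule in Python; the window size, threshold, and the way progress is measured are illustrative assumptions, not SPARQ's actual implementation.

    from collections import deque

    class ProgressGatedQuery:
        """Ask the human oracle only when recent learning progress stalls or drops.

        Illustrative sketch: the window and threshold values are assumptions."""

        def __init__(self, window=20, stall_eps=1e-3):
            self.returns = deque(maxlen=2 * window)
            self.window = window
            self.stall_eps = stall_eps

        def update(self, episode_return):
            self.returns.append(episode_return)

        def should_query(self):
            if len(self.returns) < 2 * self.window:
                return False  # not enough history to estimate progress yet
            recent = sum(list(self.returns)[-self.window:]) / self.window
            older = sum(list(self.returns)[:self.window]) / self.window
            # Query only when the average return has stagnated or worsened.
            return (recent - older) < self.stall_eps

In a training loop, should_query() would gate each oracle call, so the roughly halved feedback budget reported above comes from simply not asking while learning is still improving.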
Best practices for human in the loop
Abstract
Recommender systems are among the most commonly deployed systems today. Systems design approaches to AI-powered recommender systems have done well to urge recommender system developers to follow more intentional data collection, curation, and management procedures. So too has the "human-in-the-loop" paradigm been widely adopted, primarily to address the issue of accountability. However, in this paper, we take the position that human oversight in recommender system design also entails novel risks that have yet to be fully described. These risks are "codetermined" by the information context in which such systems are often deployed. Furthermore, new knowledge of the shortcomings of "human-in-the-loop" practices in delivering meaningful oversight of other AI systems suggests that they may also be inadequate for achieving socially responsible recommendations. We review how the limitations of human oversight may increase the chances of a specific kind of failure: a "cascade" or "compound" failure. We then briefly explore how the unique dynamics of three common deployment contexts can make humans in the loop more likely to fail in their oversight duties. We conclude with two recommendations.
Human in the loop platforms
University of Illinois at
Abstract
When navigating in a man-made environment they haven't visited before, like an office building, humans employ behaviors such as reading signs and asking others for directions. These behaviors help humans reach their destinations efficiently by reducing the need to search through large areas. Existing robot navigation systems lack the ability to execute such behaviors and are thus highly inefficient at navigating within large environments. We present ReasonNav, a modular navigation system which integrates these human-like navigation skills by leveraging the reasoning capabilities of a vision-language model (VLM). We design compact input and output abstractions based on navigation landmarks, allowing the VLM to focus on language understanding and reasoning. We evaluate ReasonNav on real and simulated navigation tasks and show that the agent successfully employs higher-order reasoning to navigate efficiently in large, complex buildings.
AI Insights
  • ReasonNav’s VLM, trained on multimodal data, infers spatial relations from images and text.
  • It detects elevators, presses buttons with a robotic arm, and senses door opening.
  • Landmark tokens compactly encode visual cues, cutting perception‑to‑action latency.
  • The decision module fuses visual confidence with linguistic priors for robust navigation.
  • In a 10‑floor office, ReasonNav cut path length by 35 % versus SLAM‑only agents.
  • Low‑light or cluttered scenes lower VLM confidence, raising arm‑control errors.
  • Future work fine‑tunes VLMs and trains end‑to‑end policies for unseen layouts.
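The design hinge described above is the compact landmark abstraction: the VLM reasons over a short textual list of detected landmarks rather than raw pixels. A hypothetical sketch of that interface follows; the landmark fields, prompt wording, and the query_vlm callable are assumptions for illustration, not ReasonNav's actual API.

    from dataclasses import dataclass

    @dataclass
    class Landmark:
        name: str            # e.g. "sign: Exit ->", "elevator", "receptionist"
        bearing_deg: float   # direction relative to the robot's heading
        distance_m: float

    def build_prompt(goal, landmarks):
        """Serialize detected landmarks into a compact observation for the VLM."""
        lines = [f"Goal: reach {goal}.", "Visible landmarks:"]
        lines += [f"- {l.name} at {l.bearing_deg:.0f} deg, {l.distance_m:.1f} m"
                  for l in landmarks]
        lines.append("Reply with one landmark name to move toward, "
                     "or 'ask_person' or 'explore'.")
        return "\n".join(lines)

    def choose_subgoal(goal, landmarks, query_vlm):
        """query_vlm is a placeholder for any VLM/LLM chat call returning a string."""
        return query_vlm(build_prompt(goal, landmarks)).strip()

Keeping the interface this small is what lets the VLM spend its capacity on language understanding and higher-order reasoning rather than on raw perception.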
Abstract
There has been a considerable interest in constrained training of deep neural networks (DNNs) recently for applications such as fairness and safety. Several toolkits have been proposed for this task, yet there is still no industry standard. We present humancompatible.train (https://github.com/humancompatible/train), an easily-extendable PyTorch-based Python package for training DNNs with stochastic constraints. We implement multiple previously unimplemented algorithms for stochastically constrained stochastic optimization. We demonstrate the toolkit use by comparing two algorithms on a deep learning task with fairness constraints.
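The abstract does not spell out the package's API, so the sketch below only illustrates the general idea of stochastically constrained training that humancompatible.train targets: a plain penalty/dual-ascent step on a fairness gap, written in vanilla PyTorch. The constraint form, function names, and hyperparameters are assumptions, not the toolkit's interface.

    import torch

    def fairness_gap(logits, group):
        """Illustrative constraint: gap in mean positive rate between two groups."""
        probs = torch.sigmoid(logits).squeeze(-1)
        return (probs[group == 0].mean() - probs[group == 1].mean()).abs()

    def constrained_step(model, optimizer, batch, lam, eps=0.05, lam_lr=0.01):
        """One stochastic step on loss + lam * max(0, constraint - eps), then dual ascent on lam."""
        x, y, g = batch
        logits = model(x)
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            logits.squeeze(-1), y.float())
        violation = fairness_gap(logits, g) - eps
        (loss + lam * torch.clamp(violation, min=0)).backward()
        optimizer.step()
        optimizer.zero_grad()
        lam = max(0.0, lam + lam_lr * violation.item())  # multiplier update from the sampled violation
        return lam

The algorithms actually implemented in the package may differ substantially from this simple penalty scheme.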

We did not find much content matching your interests, so we've included some additional topics that are popular. Also be aware that if a topic is not present on arXiv, we won't be able to recommend it.

AI Agents
Ixent Games
Abstract
Current Large Reasoning Models (LRMs) exhibit significant limitations in reliability and transparency, often showing a collapse in reasoning capabilities when faced with high-complexity, long-horizon tasks. This "illusion of thinking" is frequently an artifact of non-agentic, black-box evaluation paradigms that fail to cultivate robust problem-solving processes. In response, we introduce The STAR-XAI Protocol (Socratic, Transparent, Agentic, Reasoning - for eXplainable Artificial Intelligence), a novel methodology for training and operating verifiably reliable AI agents. Our method reframes the human-AI interaction as a structured, Socratic dialogue, governed by an explicit and evolving rulebook, the Consciousness Transfer Package (CTP). Through an interactive Gameplay Cycle that enforces ante-hoc strategic justification and a state-locking Checksum that prevents error accumulation, the protocol transforms a powerful but opaque LRM into a disciplined "Clear Box" agent. We demonstrate the efficacy of this method through an exhaustive 25-move case study in the complex strategic game "Caps i Caps". The agent not only solved the high-complexity puzzle but also demonstrated Second-Order Agency, identifying flaws in its own supervisor-approved plans and adapting its core integrity protocols mid-task. The STAR-XAI Protocol offers a practical pathway to creating AI agents that are not just high-performing, but also transparent, auditable, and trustworthy by design.
AI Insights
  • The Consciousness Transfer Package (CTP) is a step‑by‑step manual for mastering gear‑based board games, covering placement, rotation, and vector mechanics.
  • CTP provides concrete examples of successful moves, letting Gems learn proven strategies instead of trial‑and‑error.
  • The package is designed for seamless handoff, so one Gem can train an agent that another can use without knowledge loss.
  • Recommended literature includes reasoning classics, game‑theory treatises, and studies on gear‑placement efficiency.
  • Online forums and simulation tools are highlighted as practical resources for testing and refining gear‑game tactics.
  • A caveat: the CTP’s depth may overwhelm novices and assumes baseline familiarity with gear‑game mechanics.
  • Core definitions—gear, placement, rotation, vector, base—ensure consistent terminology across training sessions.
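Operationally, the Gameplay Cycle and the state-locking Checksum described above amount to a protocol: the agent must justify a move before acting, and the move is only accepted if the agent's declared state hash matches the supervisor's view of the board. A hedged sketch of that loop is shown below; the hashing scheme, message fields, and callables are assumptions, not the protocol's exact specification.

    import hashlib, json

    def state_checksum(state: dict) -> str:
        """Deterministic hash of the agreed-upon game state (the 'state lock')."""
        blob = json.dumps(state, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()[:12]

    def gameplay_cycle(agent, supervisor, state, apply_move):
        """One turn: ante-hoc justification, checksum check, then the move.

        agent, supervisor, and apply_move are placeholder callables/objects."""
        proposal = agent.propose(state)  # expected: {'move', 'justification', 'checksum'}
        if proposal["checksum"] != state_checksum(state):
            return state, "rejected: state drift detected"      # blocks error accumulation
        if not supervisor.approves(proposal["justification"]):
            return state, "rejected: justification not accepted"
        return apply_move(state, proposal["move"]), "accepted"

Forcing the justification and checksum before the move is what turns an opaque model into the "Clear Box" agent the paper describes.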
Abstract
Modern socio-economic systems are undergoing deep integration with artificial intelligence technologies. This paper constructs a heterogeneous agent-based modeling framework that incorporates both human workers and autonomous AI agents, to study the impact of AI collaboration under resource constraints on aggregate social output. We build five progressively extended models: Model 1 serves as the baseline of pure human collaboration; Model 2 introduces AI as collaborators; Model 3 incorporates network effects among agents; Model 4 treats agents as independent producers; and Model 5 integrates both network effects and independent agent production. Through theoretical derivation and simulation analysis, we find that the introduction of AI agents can significantly increase aggregate social output. When considering network effects among agents, this increase exhibits nonlinear growth far exceeding the simple sum of individual contributions. Under the same resource inputs, treating agents as independent producers provides higher long-term growth potential; introducing network effects further demonstrates strong characteristics of increasing returns to scale.
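To make the modeling setup concrete, here is a toy simulation in the spirit of the framework described above: human workers and AI agents each contribute output, and a network-effect term adds a superadditive bonus. All functional forms and parameter values are assumptions chosen for readability, not the paper's calibrated models.

    import random

    def simulate(steps=50, humans=80, ais=20, alpha=1.0, beta=1.5, gamma=0.05):
        """Toy aggregate output: human output + AI output + network-effect bonus."""
        totals = []
        for _ in range(steps):
            human_out = sum(alpha * random.uniform(0.8, 1.2) for _ in range(humans))
            ai_out = sum(beta * random.uniform(0.9, 1.1) for _ in range(ais))
            links = ais * (ais - 1) // 2                    # assume AI agents are fully networked
            network_bonus = gamma * links ** 0.5 * (1 + ai_out / max(human_out, 1e-9))
            totals.append(human_out + ai_out + network_bonus)
        return totals

    baseline = simulate(ais=0)   # analogue of Model 1: pure human collaboration
    with_ai = simulate()         # analogue of Models 2-3: AI collaborators plus a network term
    print(sum(with_ai) / len(with_ai) - sum(baseline) / len(baseline))

Because the network bonus grows with the number of agent links, the uplift over the baseline is more than the sum of the individual AI contributions, which is the qualitative effect the paper reports.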
AI and Society
University of Buenos Aires
Abstract
This paper develops a taxonomy of expert perspectives on the risks and likely consequences of artificial intelligence, with particular focus on Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI). Drawing from primary sources, we identify three predominant doctrines: (1) The dominance doctrine, which predicts that the first actor to create sufficiently advanced AI will attain overwhelming strategic superiority sufficient to cheaply neutralize its opponents' defenses; (2) The extinction doctrine, which anticipates that humanity will likely lose control of ASI, leading to the extinction of the human species or its permanent disempowerment; (3) The replacement doctrine, which forecasts that AI will automate a large share of tasks currently performed by humans, but will not be so transformative as to fundamentally reshape or bring an end to human civilization. We examine the assumptions and arguments underlying each doctrine, including expectations around the pace of AI progress and the feasibility of maintaining advanced AI under human control. While the boundaries between doctrines are sometimes porous and many experts hedge across them, this taxonomy clarifies the core axes of disagreement over the anticipated scale and nature of the consequences of AI development.
Research Automation with AI
Abstract
Climate data science faces persistent barriers stemming from the fragmented nature of data sources, heterogeneous formats, and the steep technical expertise required to identify, acquire, and process datasets. These challenges limit participation, slow discovery, and reduce the reproducibility of scientific workflows. In this paper, we present a proof of concept for addressing these barriers through the integration of a curated knowledge graph (KG) with AI agents designed for cloud-native scientific workflows. The KG provides a unifying layer that organizes datasets, tools, and workflows, while AI agents, powered by generative AI services, enable natural language interaction, automated data access, and streamlined analysis. Together, these components drastically lower the technical threshold for engaging in climate data science, enabling non-specialist users to identify and analyze relevant datasets. By leveraging existing cloud-ready API data portals, we demonstrate that "a knowledge graph is all you need" to unlock scalable and agentic workflows for scientific inquiry. The open-source design of our system further supports community contributions, ensuring that the KG and associated tools can evolve as a shared commons. Our results illustrate a pathway toward democratizing access to climate data and establishing a reproducible, extensible framework for human-AI collaboration in scientific research.
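The architectural claim is that a curated knowledge graph is the layer that lets an agent translate a natural-language request into dataset lookups and portal calls. A minimal, hypothetical sketch of that lookup layer follows; the triples, dataset identifiers, and portal names are invented placeholders, not entries from the project's actual KG.

    # Tiny in-memory 'knowledge graph' of (subject, predicate, object) triples.
    # All identifiers below are hypothetical placeholders.
    KG = [
        ("dataset:sst_monthly",  "measures",  "sea_surface_temperature"),
        ("dataset:sst_monthly",  "served_by", "portal:example_cloud_api"),
        ("dataset:precip_daily", "measures",  "precipitation"),
        ("dataset:precip_daily", "served_by", "portal:example_cloud_api"),
    ]

    def find_datasets(variable):
        """Resolve a climate variable to candidate datasets via the KG."""
        return [s for s, p, o in KG if p == "measures" and o == variable]

    def portal_for(dataset):
        """Look up which cloud portal serves a given dataset."""
        return [o for s, p, o in KG if s == dataset and p == "served_by"]

    # An agent would first map the user's question to a variable (e.g. with an LLM),
    # then use the KG to find datasets and the portals that serve them.
    for d in find_datasets("sea_surface_temperature"):
        print(d, "->", portal_for(d))

In the paper's system this resolution step is driven by generative AI services, which is what lets non-specialists skip the manual hunt for data sources.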
KT
Abstract
KT developed a Responsible AI (RAI) assessment methodology and risk mitigation technologies to ensure the safety and reliability of AI services. By analyzing the implementation of the Basic Act on AI and global AI governance trends, we established a unique approach to regulatory compliance and to systematically identifying and managing all potential risk factors from AI development to operation. We present a reliable assessment methodology that systematically verifies model safety and robustness based on KT's AI risk taxonomy tailored to the domestic environment. We also provide practical tools for managing and mitigating identified AI risks. With the release of this report, we also release our proprietary guardrail, SafetyGuard, which blocks harmful responses from AI models in real time, supporting the enhancement of safety in the domestic AI development ecosystem. We believe these research outcomes provide valuable insights for organizations seeking to develop Responsible AI.
AI Insights
  • The risk taxonomy categorizes threats into data, model, deployment, and societal dimensions, each with measurable indicators.
  • A multi‑stage assessment pipeline integrates static code analysis, adversarial testing, and human‑in‑the‑loop audits to quantify robustness.
  • SafetyGuard employs a lightweight transformer‑based policy network that intercepts outputs in real time, achieving <5 ms latency on edge devices.
  • Compliance mapping aligns each risk factor with specific clauses of the Basic Act on AI, enabling automated audit reports.
  • Pilot deployments in Korean telecom and finance sectors demonstrated a 30 % reduction in policy‑violating incidents after Guardrail integration.
  • The report proposes a future research agenda on explainable mitigation strategies and cross‑border data‑sharing protocols.
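Architecturally, a real-time guardrail like SafetyGuard sits between the model and the user and either passes or blocks each response. The sketch below shows only that generic wrapper pattern; the classifier, threshold, and refusal message are placeholders and do not reflect SafetyGuard's internals or interface.

    def guarded_generate(model_generate, safety_classifier, prompt, threshold=0.5):
        """Wrap a text generator with a response-side guardrail.

        model_generate(prompt) -> str and safety_classifier(text) -> risk score in [0, 1]
        are placeholder callables standing in for the deployed model and the guard."""
        response = model_generate(prompt)
        risk = safety_classifier(response)
        if risk >= threshold:
            # Block the harmful output and return a safe refusal instead.
            return "[blocked: response exceeded the configured risk threshold]"
        return response

The reported <5 ms classification latency is what makes this kind of interception viable on production chat paths.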
AGI: Artificial General Intelligence
Abstract
Safety, trust and Artificial General Intelligence (AGI) are aspirational goals in artificial intelligence (AI) systems, and there are several informal interpretations of these notions. In this paper, we propose strict, mathematical definitions of safety, trust, and AGI, and demonstrate a fundamental incompatibility between them. We define safety of a system as the property that it never makes any false claims, trust as the assumption that the system is safe, and AGI as the property of an AI system always matching or exceeding human capability. Our core finding is that, for our formal definitions of these notions, a safe and trusted AI system cannot be an AGI system: for such a safe, trusted system there are task instances which are easily and provably solvable by a human but not by the system. We note that we consider strict mathematical definitions of safety and trust, and it is possible for real-world deployments to instead rely on alternate, practical interpretations of these notions. We show our results for program verification, planning, and graph reachability. Our proofs draw parallels to Gödel's incompleteness theorems and Turing's proof of the undecidability of the halting problem, and can be regarded as interpretations of Gödel's and Turing's results.
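To convey the flavor of the incompatibility argument without reproducing the paper's construction, here is a hedged, self-referential sketch in the abstract's own terms; the specific instance below is an illustrative assumption, not the authors' proof.

    % Illustrative sketch only (assumed instance); safety = S never makes a false claim,
    % trust = every claim S makes is accepted as true.
    \[
      \varphi_S \;\equiv\; \text{``the system } S \text{ never claims that } \varphi_S \text{ is true.''}
    \]
    % If S claims \varphi_S, then \varphi_S is false, so S has made a false claim and is not safe.
    % Hence a safe S never claims \varphi_S, which makes \varphi_S true.
    % A human can follow exactly this reasoning and certify \varphi_S, while the safe, trusted S
    % cannot claim it, so S fails the ``always matching or exceeding human capability'' requirement.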
Deep Learning
University of Pennsylvania
Abstract
Given the widespread use of deep learning models in safety-critical applications, ensuring that the decisions of such models are robust against adversarial exploitation is of fundamental importance. In this thesis, we discuss recent progress toward designing algorithms that exhibit desirable robustness properties. First, we discuss the problem of adversarial examples in computer vision, for which we introduce new technical results, training paradigms, and certification algorithms. Next, we consider the problem of domain generalization, wherein the task is to train neural networks to generalize from a family of training distributions to unseen test distributions. We present new algorithms that achieve state-of-the-art generalization in medical imaging, molecular identification, and image classification. Finally, we study the setting of jailbreaking large language models (LLMs), wherein an adversarial user attempts to design prompts that elicit objectionable content from an LLM. We propose new attacks and defenses, which represent the frontier of progress toward designing robust language-based agents.
AI Insights
  • Random erasing data augmentation injects stochastic occlusions during training, boosting pixel‑level robustness.
  • Stability training enforces Lipschitz continuity across layers, yielding provable robustness margins.
  • Robust prompt optimization tailors LLM inputs to shrink jailbreak‑induced decision space.
  • Universal adversarial attacks generate a single perturbation that transfers across many inputs, breaking input‑specific defenses.
  • Randomness in SGD can amplify or dampen adversarial vulnerability, depending on learning‑rate schedules.
  • Tooling—automated augmentation pipelines and reproducibility frameworks—drives consistent robustness across labs.
  • “Essentials of Robust Control” links classical control theory to deep learning, providing a rigorous basis for safe neural systems.
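For readers new to the adversarial-examples setting studied in the first part of the thesis, a standard projected gradient descent (PGD) attack is the usual baseline. The sketch below shows that generic technique in PyTorch; it is background, not one of the thesis's own algorithms, and the step sizes shown are common illustrative defaults.

    import torch

    def pgd_attack(model, loss_fn, x, y, eps=8/255, alpha=2/255, steps=10):
        """L-infinity PGD: repeatedly ascend the loss, projecting back into the eps-ball."""
        x_adv = (x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()             # ascend the loss
                x_adv = x + (x_adv - x).clamp(-eps, eps)        # project into the eps-ball
                x_adv = x_adv.clamp(0, 1)                       # keep valid pixel range
        return x_adv.detach()

Certified defenses and the stability-training ideas listed above are typically evaluated against exactly this kind of attack.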
Istanbul Medeniyet University
Abstract
Deep learning optimizers are optimization algorithms that enable deep neural networks to learn. The effectiveness of learning is highly dependent on the optimizer employed in the training process. Alongside the rapid advancement of deep learning, a wide range of optimizers with different approaches have been developed. This study aims to provide a review of various optimizers that have been proposed and received attention in the literature. From stochastic gradient descent to the most recent ones, such as Momentum, AdamW, Sophia, and Muon, optimizers are examined individually in chronological order, and their distinctive features are highlighted in the study. The update rule of each optimizer is presented in detail, with an explanation of the associated concepts and variables. The techniques applied by these optimizers, their contributions to the optimization process, and their default hyperparameter settings are also discussed. In addition, insights are offered into the open challenges encountered in the optimization of deep learning models. Thus, a comprehensive resource is provided both for understanding the current state of optimizers and for identifying potential areas of future development.
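For concreteness, the two update rules most often contrasted in such reviews, SGD with momentum and AdamW, are summarized below in standard notation; the defaults shown are the values commonly used in practice and may differ from those reported in the paper.

    % SGD with momentum (learning rate \eta, momentum coefficient \mu, gradient g_t = \nabla_\theta L(\theta_t)):
    \[
      v_t = \mu\, v_{t-1} + g_t, \qquad \theta_{t+1} = \theta_t - \eta\, v_t.
    \]
    % AdamW (decoupled weight decay \lambda; common defaults \beta_1 = 0.9, \beta_2 = 0.999, \epsilon = 10^{-8}):
    \[
      m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad
      v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2,
    \]
    \[
      \hat m_t = \frac{m_t}{1-\beta_1^t}, \qquad
      \hat v_t = \frac{v_t}{1-\beta_2^t}, \qquad
      \theta_{t+1} = \theta_t - \eta \left( \frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon} + \lambda\, \theta_t \right).
    \]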