Distributed Systems

Decentralized Multi-Agent System with Trust-Aware Communication

Hiroshima University

Rate paper: 👍 👎 ♥ Save

Abstract
The emergence of Large Language Models (LLMs) is rapidly accelerating the development of autonomous multi-agent systems (MAS), paving the way for the Internet of Agents. However, traditional centralized MAS architectures present significant challenges, including single points of failure, vulnerability to censorship, inherent scalability limitations, and critical trust issues. We propose a novel Decentralized Multi-Agent System (DMAS) architecture designed to overcome these fundamental problems by enabling trust-aware, scalable, and censorship-resistant interactions among autonomous agents. Our DMAS features a decentralized agent runtime underpinned by a blockchain-based architecture. We formalize a trust-aware communication protocol that leverages cryptographic primitives and on-chain operations to provide security properties: verifiable interaction cycles, communication integrity, authenticity, non-repudiation, and conditional confidentiality, which we further substantiate through a comprehensive security analysis. Our performance analysis validates the DMAS as a scalable and efficient solution for building trustworthy multi-agent systems.

AI Summary

The performance analysis demonstrates the DMAS's practical scalability and efficiency, underscoring its potential for fostering a trustworthy and scalable Internet of Agents. [3]
The DMAS is a significant step towards creating a trustworthy and scalable Internet of Agents, which can have far-reaching implications for various industries and applications. [3]
The Decentralized Multi-Agent System (DMAS) is a novel architectural paradigm designed to overcome the limitations of centralized multi-agent systems. [2]
The DMAS integrates a decentralized agent runtime and a trust-aware communication protocol, ensuring message integrity, agent authenticity, non-repudiation, and confidential data exchange. [1]

A Lyapunov-based MPC for Distributed Multi Agent Systems with Time Delays and Packet Dropouts using Hidden Markov Models

German University in Ca

Rate paper: 👍 👎 ♥ Save

Abstract
We propose a SCHMM LMPC framework, integrating Semi Continuous Hidden Markov Models with Lyapunov based Model Predictive Control, for distributed optimal control of multi agent systems under network imperfections. The SCHMM captures the stochastic network behavior in real time, while LMPC ensures consensus and optimality via Linear Matrix Inequalities LMIs. The developed optimal control problem simultaneously minimizes three elements. First, the control effort is reduced to avoid aggressive inputs and second, the network induced error caused by time delays and packet dropouts. Third, the topology-induced error, as the distributed graph restricts agents access to global information. This error is inherent to the communication graph and cannot be addressed through offline learning. To overcome this, the study also introduces the incremental Expectation Maximization EM algorithm, enabling online learning of the SCHMM. This adaptation allows the framework to mitigate both network and topology errors while maintaining optimality through MPC. Simulations validate the effectiveness of the proposed SCHMM LMPC, demonstrating adaptability in multi agent systems with diverse topologies.

Resilience

Robustness and resilience of dynamical networks in biology and epidemiology

University of Trento

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
Natural systems are remarkably robust and resilient, maintaining essential functions despite variability, uncertainty, and hostile conditions. Understanding these nonlinear, dynamic behaviours is challenging because such systems involve many interacting parameters, yet it is crucial for explaining processes from cellular regulation to disease onset and epidemic spreading. Robustness and resilience describe a system's ability to preserve and recover desired behaviours in the presence of intrinsic and extrinsic fluctuations. This survey reviews how different disciplines define these concepts, examines methods for assessing whether key properties of uncertain, networked dynamical systems are structural (parameter-free) or robust (preserved for parameter variations within an uncertainty bounding set), and discusses integrated structural and probabilistic techniques for biological and epidemiological models. The text introduces formal definitions of resilience for families of systems obtained by adding stochastic perturbations to a nominal deterministic model, enabling a probabilistic characterisation of the ability to remain within or return to a prescribed attractor. These definitions generalise probabilistic robustness and shed new light on classical biological examples. In addition, the survey summarises resilience indicators and data-driven tools for detecting resilience loss and regime shifts, drawing on bifurcation analysis to anticipate qualitative changes in system behaviour. Together, these methodologies support the study and control of complex natural systems, guiding the design of biomolecular feedback architectures, the identification of therapeutic targets, the forecasting and management of epidemics, and the detection of tipping points in ecological and biological networks.

AI Summary

Graphs are versatile tools to describe complex interplay of interactions giving rise to biological and epidemiological systems. [3]
Hyper-graphs and hyper-edges represent flow graphs for chemical reactions and can be used to capture the topology of biological networks. [3]
Multi-scale models bridge the microscopic in-host scale and the macroscopic between-host scale, studying how immunological mechanisms affect epidemiological mechanisms. [2]
Epidemic models can be interpreted as chemical reaction systems using mass action kinetics. [1]

High throughput

PystachIO: Efficient Distributed GPU Query Processing with PyTorch over Fast Networks & Fast Storage

TU Darmstadt

Rate paper: 👍 👎 ♥ Save

Abstract
The AI hardware boom has led modern data centers to adopt HPC-style architectures centered on distributed, GPU-centric computation. Large GPU clusters interconnected by fast RDMA networks and backed by high-bandwidth NVMe storage enable scalable computation and rapid access to storage-resident data. Tensor computation runtimes (TCRs), such as PyTorch, originally designed for AI workloads, have recently been shown to accelerate analytical workloads. However, prior work has primarily considered settings where the data fits in aggregated GPU memory. In this paper, we systematically study how TCRs can support scalable, distributed query processing for large-scale, storage-resident OLAP workloads. Although TCRs provide abstractions for network and storage I/O, naive use often underutilizes GPU and I/O bandwidth due to insufficient overlap between computation and data movement. As a core contribution, we present PystachIO, a PyTorch-based distributed OLAP engine that combines fast network and storage I/O with key optimizations to maximize GPU, network, and storage utilization. Our evaluation shows up to 3x end-to-end speedups over existing distributed GPU-based query processing approaches.

AI Summary

The TCR is designed to handle complex queries and provide high performance, while the storage engine is optimized for GPU acceleration. [3]
Table-centric runtime (TCR): A system designed to handle complex queries and provide high performance by leveraging GPU acceleration. [3]
The system's ability to scale with increasing SSDs and dataset sizes makes it suitable for large-scale data processing applications. [3]
PystachIO's use of TCR provides high performance and efficiency in handling complex queries. [3]
The paper discusses a new system called PystachIO that integrates a table-centric runtime (TCR) with a storage engine for efficient processing of large-scale datasets. [2]

AI Agents

STRIDE: A Systematic Framework for Selecting AI Modalities -- Agentic AI, AI Assistants, or LLM Calls

IBM

Rate paper: 👍 👎 ♥ Save

Abstract
The rapid shift from stateless large language models (LLMs) to autonomous, goal-driven agents raises a central question: When is agentic AI truly necessary? While agents enable multi-step reasoning, persistent memory, and tool orchestration, deploying them indiscriminately leads to higher cost, complexity, and risk. We present STRIDE (Systematic Task Reasoning Intelligence Deployment Evaluator), a framework that provides principled recommendations for selecting between three modalities: (i) direct LLM calls, (ii) guided AI assistants, and (iii) fully autonomous agentic AI. STRIDE integrates structured task decomposition, dynamism attribution, and self-reflection requirement analysis to produce an Agentic Suitability Score, ensuring that full agentic autonomy is reserved for tasks with inherent dynamism or evolving context. Evaluated across 30 real-world tasks spanning SRE, compliance, and enterprise automation, STRIDE achieved 92% accuracy in modality selection, reduced unnecessary agent deployments by 45%, and cut resource costs by 37%. Expert validation over six months in SRE and compliance domains confirmed its practical utility, with domain specialists agreeing that STRIDE effectively distinguishes between tasks requiring simple LLM calls, guided assistants, or full agentic autonomy. This work reframes agent adoption as a necessity-driven design decision, ensuring autonomy is applied only when its benefits justify the costs.

AI Summary

The framework can be used in conjunction with existing benchmarks to evaluate the performance of agentic AI systems. [3]
Future extensions to STRIDE will include multimodal tasks, reinforcement learning for weight tuning, and validation at enterprise scale. [3]
STRIDE's scoring functions are heuristic by design, striking a balance between interpretability and generality. [3]
STRIDE (Systematic Task Reasoning Intelligence Deployment Evaluator) is a framework that determines when tasks require agentic AI, AI assistants, or simple LLM calls. [2]
STRIDE integrates five analytical dimensions: structured task decomposition, dynamic reasoning and tool-interaction scoring, dynamism attribution analysis, self-reflection requirement assessment, and agentic suitability inference. [1]

Self-Improving AI Agents through Self-Play

ulamai

Rate paper: 👍 👎 ♥ Save

Abstract
We extend the moduli-theoretic framework of psychometric batteries to the domain of dynamical systems. While previous work established the AAI capability score as a static functional on the space of agent representations, this paper formalizes the agent as a flow $ν_r$ parameterized by computational resource $r$, governed by a recursive Generator-Verifier-Updater (GVU) operator. We prove that this operator generates a vector field on the parameter manifold $Θ$, and we identify the coefficient of self-improvement $κ$ as the Lie derivative of the capability functional along this flow. The central contribution of this work is the derivation of the Variance Inequality, a spectral condition that is sufficient (under mild regularity) for the stability of self-improvement. We show that a sufficient condition for $κ> 0$ is that, up to curvature and step-size effects, the combined noise of generation and verification must be small enough. We then apply this formalism to unify the recent literature on Language Self-Play (LSP), Self-Correction, and Synthetic Data bootstrapping. We demonstrate that architectures such as STaR, SPIN, Reflexion, GANs and AlphaZero are specific topological realizations of the GVU operator that satisfy the Variance Inequality through filtration, adversarial discrimination, or grounding in formal systems.

AI Summary

The GVU framework is used to analyze the stability of self-improvement in AI systems. [3]
The Variance Inequality (Theorem 4.1) provides a sufficient condition for stable self-improvement, requiring a high Signal-to-Noise Ratio (SNR) for both the generator and the verifier. [3]
AI slop event at parameter θ AI slop mass and slop regime The paper provides a framework for understanding the stability of self-improvement in AI systems, highlighting the importance of high SNR for both generators and verifiers. [3]
The paper defines AI slop as a region where the internal Verifier ranks outputs among its top fraction, but they actually lie in the bottom fraction of the true battery score. [2]
The paper introduces the Generalized Verifier-Generator Update (GVU) framework, which models the interaction between a generator and its verifier. [1]

AI and Society

Artificial Intelligence / Human Intelligence: Who Controls Whom?

Ecole normale suprieure

Rate paper: 👍 👎 ♥ Save

Abstract
Using the example of the film 2001: A Space Odyssey, this chapter illustrates the challenges posed by an AI capable of making decisions that go against human interests. But are human decisions always rational and ethical? In reality, the cognitive decision-making process is influenced by cognitive biases that affect our behavior and choices. AI not only reproduces these biases, but can also exploit them, with the potential to shape our decisions and judgments. Behind IA algorithms, there are sometimes individuals who show little concern for fundamental rights and impose their own rules. To address the ethical and societal challenges raised by AI and its governance, the regulation of digital platforms and education are keys levers. Regulation must reflect ethical, legal, and political choices, while education must strengthen digital literacy and teach people to make informed and critical choices when facing digital technologies.

The dual footprint of artificial intelligence: environmental and social impacts across the globe

Polytechnic Institute of

Rate paper: 👍 👎 ♥ Save

Abstract
This article introduces the concept of the 'dual footprint' as a heuristic device to capture the commonalities and interdependencies between the different impacts of artificial intelligence (AI) on the natural and social surroundings that supply resources for its production and use. Two in-depth case studies, each illustrating international flows of raw materials and of data work services, portray the AI industry as a value chain that spans national boundaries and perpetuates inherited global inequalities. The countries that drive AI development generate a massive demand for inputs and trigger social costs that, through the value chain, largely fall on more peripheral actors. The arrangements in place distribute the costs and benefits of AI unequally, resulting in unsustainable practices and preventing the upward mobility of more disadvantaged countries. The dual footprint grasps how the environmental and social dimensions of the dual footprint emanate from similar underlying socioeconomic processes and geographical trajectories.

AI Summary

The carbon (and water) footprints of data centre functioning, model training, and inference mainly occur in countries that lead AI development, such as the United States and France. [3]
The supply of data work for countries like the United States and France comes from areas with lower labour costs, including middle- and lower-income countries like Argentina and Madagascar. [3]
The 'dual' nature of the footprint is illuminated by the fact that the same country exports both mining products and data work services, with imports flowing towards countries leading the worldwide AI race. [3]
AI value chain: The series of activities involved in developing and deploying artificial intelligence systems, from raw materials extraction to software development and deployment. [3]
Carbon footprint: The amount of greenhouse gas emissions associated with a particular activity or product. [3]
The analysis takes a step back from stricter interpretations of the footprint concept as an accounting method and instead focuses on a bird's eye view, revealing who is impacted by pressure on resources and related effects spread along the AI value chain. [2]

Research Automation with AI

A Hierarchical Tree-based approach for creating Configurable and Static Deep Research Agent (Static-DRA)

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
The advancement in Large Language Models has driven the creation of complex agentic systems, such as Deep Research Agents (DRAs), to overcome the limitations of static Retrieval Augmented Generation (RAG) pipelines in handling complex, multi-turn research tasks. This paper introduces the Static Deep Research Agent (Static-DRA), a novel solution built upon a configurable and hierarchical Tree-based static workflow. The core contribution is the integration of two user-tunable parameters, Depth and Breadth, which provide granular control over the research intensity. This design allows end-users to consciously balance the desired quality and comprehensiveness of the research report against the associated computational cost of Large Language Model (LLM) interactions. The agent's architecture, comprising Supervisor, Independent, and Worker agents, facilitates effective multi-hop information retrieval and parallel sub-topic investigation. We evaluate the Static-DRA against the established DeepResearch Bench using the RACE (Reference-based Adaptive Criteria-driven Evaluation) framework. Configured with a depth of 2 and a breadth of 5, and powered by the gemini-2.5-pro model, the agent achieved an overall score of 34.72. Our experiments validate that increasing the configured Depth and Breadth parameters results in a more in-depth research process and a correspondingly higher evaluation score. The Static-DRA offers a pragmatic and resource-aware solution, empowering users with transparent control over the deep research process. The entire source code, outputs and benchmark results are open-sourced at https://github.com/SauravP97/Static-Deep-Research/

The future of AI in critical mineral exploration

Stanford University

Rate paper: 👍 👎 ♥ Save

Abstract
The energy transition through increased electrification has put the worlds attention on critical mineral exploration Even with increased investments a decrease in new discoveries has taken place over the last two decades Here I propose a solution to this problem where AI is implemented as the enabler of a rigorous scientific method for mineral exploration that aims to reduce cognitive bias and false positives drive down the cost of exploration I propose a new scientific method that is based on a philosophical approach founded on the principles of Bayesianism and falsification In this approach data acquisition is in the first place seen as a means to falsify human generated hypothesis Decision of what data to acquire next is quantified with verifiable metrics and based on rational decision making A practical protocol is provided that can be used as a template in any exploration campaign However in order to make this protocol practical various form of artificial intelligence are needed I will argue that the most important form are one novel unsupervised learning methods that collaborate with domain experts to better understand data and generate multiple competing geological hypotheses and two humanintheloop AI algorithms that can optimally plan various geological geophysical geochemical and drilling data acquisition where uncertainty reduction of geological hypothesis precedes the uncertainty reduction on grade and tonnage

AI Summary

Efficacy of information (EI): a metric that quantifies how much future data will reduce uncertainty on average on some quantity of interest. [3]
The author advocates for a new scientific method for mineral exploration, focusing on decision-making rather than traditional geophysical inversion. [2]
Epistemic uncertainty: the lack of understanding we still have about the nature of orebodies. [1]

AGI: Artificial General Intelligence

The Geometry of Benchmarks: A New Path Toward AGI

ulamai

Rate paper: 👍 👎 ♥ Save

Abstract
Benchmarks are the primary tool for assessing progress in artificial intelligence (AI), yet current practice evaluates models on isolated test suites and provides little guidance for reasoning about generality or autonomous self-improvement. Here we introduce a geometric framework in which all psychometric batteries for AI agents are treated as points in a structured moduli space, and agent performance is described by capability functionals over this space. First, we define an Autonomous AI (AAI) Scale, a Kardashev-style hierarchy of autonomy grounded in measurable performance on batteries spanning families of tasks (for example reasoning, planning, tool use and long-horizon control). Second, we construct a moduli space of batteries, identifying equivalence classes of benchmarks that are indistinguishable at the level of agent orderings and capability inferences. This geometry yields determinacy results: dense families of batteries suffice to certify performance on entire regions of task space. Third, we introduce a general Generator-Verifier-Updater (GVU) operator that subsumes reinforcement learning, self-play, debate and verifier-based fine-tuning as special cases, and we define a self-improvement coefficient $κ$ as the Lie derivative of a capability functional along the induced flow. A variance inequality on the combined noise of generation and verification provides sufficient conditions for $κ> 0$. Our results suggest that progress toward artificial general intelligence (AGI) is best understood as a flow on moduli of benchmarks, driven by GVU dynamics rather than by scores on individual leaderboards.

AI Summary

GVU Dynamics: a formalism that connects static geometry to learning, showing that many contemporary training procedures are special cases of reinforcement learning on the moduli space. [3]
Self-Improvement Coefficient κ: a measure of the rate of change of an agent's capability trajectory over time. [3]
autonomous AI scale moduli space of batteries GVU dynamics self-improvement coefficient κ variance inequality Autonomous AI Scale: a framework for evaluating autonomous AI systems based on performance thresholds on families of batteries. [2]

Deep Learning

Sparse Computations in Deep Learning Inference

National Technical Univer

Rate paper: 👍 👎 ♥ Save

Abstract
The computational demands of modern Deep Neural Networks (DNNs) are immense and constantly growing. While training costs usually capture public attention, inference demands are also contributing in significant computational, energy and environmental footprints. Sparsity stands out as a critical mechanism for drastically reducing these resource demands. However, its potential remains largely untapped and is not yet fully incorporated in production AI systems. To bridge this gap, this work provides the necessary knowledge and insights for performance engineers keen to get involved in deep learning inference optimization. In particular, in this work we: a) discuss the various forms of sparsity that can be utilized in DNN inference, b) explain how the original dense computations translate to sparse kernels, c) provide an extensive bibliographic review of the state-of-the-art in the implementation of these kernels for CPUs and GPUs, d) discuss the availability of sparse datasets in support of sparsity-related research and development, e) explore the current software tools and frameworks that provide robust sparsity support, and f) present evaluation results of different implementations of the key SpMM and SDDMM kernels on CPU and GPU platforms. Ultimately, this paper aims to serve as a resource for performance engineers seeking to develop and deploy highly efficient sparse deep learning models in productions.

AI Summary

The text discusses various aspects of deep learning, including model architecture, training, optimization, and inference. [3]
Model Training: The process that makes a DNN learn to perform a specific task, much like a student learns from practice and correction. [3]
Batch Training: Instead of feeding individual data points one by one, models are trained on small groups of samples called batches. [3]
Training often requires many epochs to fully learn the data’s patterns. [3]
The text concludes that deep learning involves various steps from model architecture to inference, and optimization is crucial for efficient deployment of DNNs. [3]
The text mentions several deep learning frameworks such as PyTorch, TensorFlow, JAX, and Hugging Face Hub. [3]
Deep learning involves various steps from model architecture to inference, and optimization is crucial for efficient deployment of DNNs. [3]
But, just like how you need to practice and get better at recognizing cats, the computer needs to be trained and optimized so that it can perform well in real-world situations. [3]
Epochs: A single pass through the entire dataset is called an epoch. [2]
The text does not provide a clear explanation of the differences between various model representations such as ONNX, TorchScript, TensorFlow SavedModel / GraphDef, etc. [1]

Weight Space Representation Learning with Neural Fields

EPFL

Rate paper: 👍 👎 ♥ Save

Abstract
In this work, we investigate the potential of weights to serve as effective representations, focusing on neural fields. Our key insight is that constraining the optimization space through a pre-trained base model and low-rank adaptation (LoRA) can induce structure in weight space. Across reconstruction, generation, and analysis tasks on 2D and 3D data, we find that multiplicative LoRA weights achieve high representation quality while exhibiting distinctiveness and semantic structure. When used with latent diffusion models, multiplicative LoRA weights enable higher-quality generation than existing weight-space methods.

Interests not found

Help us improve your experience!