Hi!

Your personalized paper recommendations for 8 to 12 December 2025.
🎯 Top Personalized Recommendations
University of Manchester
AI Summary
  • The paper introduces different scenarios (1A, 1B, 2) with distinct data structures and constraints. [3]
  • $C_n$: set of admissible measures over $Y$; $C^{(N_1)}_n$: set of empirical measures with at most $N_1$ support points; $C^{(N_1,N_2)}_n$: set of empirical measures with at most $N_1$ support points and all probabilities an integer multiple of $1/N_2$. [3]
  • $\hat{V}^{(n)}_{\max}$: value obtained by sample optimization; $V^{(n)}_{\max}$: the object of interest in the population. [3]
  • The text discusses various optimization problems related to inequality indices and their asymptotic behavior. [2]
Abstract
We develop a unified, nonparametric framework for sharp partial identification and inference on inequality indices when income or wealth are only coarsely observed -- for example via grouped tables or individual interval reports -- possibly together with linear restrictions such as known means or subgroup totals. First, for a broad class of Schur-convex inequality measures, we characterize extremal allocations and show that sharp bounds are attained by distributions with simple, finite support, reducing the underlying infinite-dimensional problem to finite-dimensional optimization. Second, for indices that admit linear-fractional representations after suitable ordering of the data (including the Gini coefficient, quantile ratios, and the Hoover index), we recast the bound problems as linear or quadratic programs, yielding fast computation of numerically sharp bounds. Third, we establish $\sqrt{n}$ inference for bound endpoints using a uniform directional delta method and a bootstrap procedure for standard errors. In ELSA wealth data with mixed point and interval observations, we obtain sharp Gini bounds of 0.714--0.792 for liquid savings and 0.686--0.767 for a broad savings measure; historical U.S. income tables deliver time-series bounds for the Gini, quantile ratios, and Hoover index under grouped information.
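For readers who want to connect the abstract to computation, here is a minimal sketch of the Gini coefficient of a finite-support income distribution, the kind of distribution the paper shows attains the sharp bounds. The bracket values and shares are hypothetical, and the sketch does not implement the paper's linear/quadratic programs for the bounds themselves.

```python
# Gini coefficient of a finite-support distribution (values x_i, probabilities p_i).
# Illustrative only: the paper's sharp bounds require the LP/QP formulations it describes.
import numpy as np

def gini(values: np.ndarray, probs: np.ndarray) -> float:
    """Gini index G = sum_{i,j} p_i p_j |x_i - x_j| / (2 * mean)."""
    probs = probs / probs.sum()                       # normalize to a probability vector
    mean = float(np.dot(probs, values))
    diff = np.abs(values[:, None] - values[None, :])  # pairwise |x_i - x_j|
    return float(probs @ diff @ probs) / (2.0 * mean)

# Example: a grouped table with three income brackets represented by
# hypothetical bracket means and population shares.
x = np.array([10_000.0, 40_000.0, 150_000.0])
p = np.array([0.5, 0.4, 0.1])
print(f"Gini = {gini(x, p):.3f}")
```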
Why we think this paper is great for you:
This paper directly addresses inequality, a core interest. It provides a framework for analyzing incomplete data, which is crucial for understanding disparities.
Duke
Abstract
Hoeffding's Inequality provides the maximum probability that a series of n draws from a bounded random variable differ from the variable's true expectation u by more than given tolerance t. The random variable is typically the error rate of a classifier in machine learning applications. Here, a trading strategy is premised on the assumption of an underlying distribution of causal factors, in other words, a market regime, and the random variable is the performance of that trading strategy. A larger deviation of observed performance from the trader's expectation u can be characterized as a lower probability that the financial regime supporting that strategy remains in force, and a higher probability of financial regime change. The changing Hoeffding probabilities can be used as an early warning indicator of this change.
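A minimal sketch of the idea in the abstract, assuming i.i.d. strategy returns bounded in [a, b]: compute the two-sided Hoeffding bound on seeing the observed deviation from the expected return u, and read a small bound as an early warning of regime change. The window size, return bounds, and data below are illustrative assumptions, not the author's parameters.

```python
# Hoeffding bound as a regime-change indicator for a window of trading returns.
import math

def hoeffding_bound(n: int, t: float, a: float, b: float) -> float:
    """Two-sided Hoeffding bound: P(|mean_n - u| >= t) <= 2 exp(-2 n t^2 / (b - a)^2)."""
    if t <= 0:
        return 1.0
    return min(1.0, 2.0 * math.exp(-2.0 * n * t**2 / (b - a) ** 2))

def regime_indicator(returns, u, a=-0.01, b=0.01):
    """Bound on the probability of a deviation this large if the assumed regime still held.
    Assumes per-trade returns are bounded within [a, b] (here +/- 1%, an illustrative choice)."""
    n = len(returns)
    observed_deviation = abs(sum(returns) / n - u)
    return hoeffding_bound(n, observed_deviation, a, b)

# Example: a strategy expected to earn 0.1% per trade, but the last 100 trades
# averaged -0.2%. A small bound is an early-warning signal of regime change.
window = [-0.002] * 100
print(f"Hoeffding probability bound: {regime_indicator(window, u=0.001):.4f}")
```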
Why we think this paper is great for you:
The paper's focus on financial regime changes and Hoeffding's inequality relates to understanding economic shifts that often exacerbate inequality.
Ho Chi Minh City University
AI Summary
  • Mathematical analysis and numerical methods are used to simulate examples of the relationship between minimizing cost, increasing cooperation, and maximizing social welfare. [3]
  • Agent-based simulations are conducted on a population structured as a square-lattice network to observe and analyze differences in the θ value between global and local interference strategies. [3]
  • Well-mixed populations: Populations where individuals interact with each other randomly. [3]
  • The study concludes that optimizing intervention cost does not necessarily lead to maximizing social welfare, and there exists a gap between them. [3]
  • The findings have implications for understanding the evolution of cooperation in populations and designing effective interventions. [3]
  • The study explores the relationship between optimizing intervention cost and social welfare in well-mixed and structured populations under external institutional investment. [2]
Abstract
Research on promoting cooperation among autonomous, self-regarding agents has often focused on the bi-objective optimisation problem: minimising the total incentive cost while maximising the frequency of cooperation. However, the optimal value of social welfare under such constraints remains largely unexplored. In this work, we hypothesise that achieving maximal social welfare is not guaranteed at the minimal incentive cost required to drive agents to a desired cooperative state. To address this gap, we adopt a single-objective approach focused on maximising social welfare, building upon foundational evolutionary game theory models that examined cost efficiency in finite populations, in both well-mixed and structured population settings. Our analytical model and agent-based simulations show how different interference strategies, including rewarding local versus global behavioural patterns, affect social welfare and dynamics of cooperation. Our results reveal a significant gap in the per-individual incentive cost between optimising for pure cost efficiency or cooperation frequency and optimising for maximal social welfare. Overall, our findings indicate that incentive design, policy, and benchmarking in multi-agent systems and human societies should prioritise welfare-centric objectives over proxy targets of cost or cooperation frequency.
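A toy agent-based sketch of the trade-off the abstract describes, assuming a well-mixed donation game with an institutional per-cooperator reward θ and Fermi imitation updates; the payoff structure, welfare definition, and parameter values are illustrative assumptions rather than the paper's analytical model.

```python
# Toy well-mixed donation game with an external institution paying a per-round
# reward theta to each cooperator; tracks incentive cost and social welfare.
import random
import math

N, ROUNDS = 100, 2000          # population size, imitation steps
b, c = 2.0, 1.0                # donation-game benefit and cost
K = 0.1                        # selection noise in the Fermi imitation rule
theta = 1.2                    # institutional reward per cooperator per round (illustrative)

def payoff(strategy, coop_frac, reward):
    """Expected one-round payoff against a random co-player, plus reward if cooperating."""
    base = coop_frac * b - (c if strategy == 1 else 0.0)
    return base + (reward if strategy == 1 else 0.0)

def run(reward):
    pop = [random.randint(0, 1) for _ in range(N)]   # 1 = cooperator, 0 = defector
    total_cost = total_welfare = 0.0
    for _ in range(ROUNDS):
        coop_frac = sum(pop) / N
        total_cost += reward * sum(pop)
        # Social welfare here counts only game payoffs; the institution's cost is tracked separately.
        total_welfare += sum(coop_frac * b - (c if s == 1 else 0.0) for s in pop)
        # Fermi imitation: a random focal agent copies a random model with
        # probability 1 / (1 + exp(-(pi_model - pi_focal) / K)).
        i, j = random.sample(range(N), 2)
        pi_i = payoff(pop[i], coop_frac, reward)
        pi_j = payoff(pop[j], coop_frac, reward)
        if random.random() < 1.0 / (1.0 + math.exp(-(pi_j - pi_i) / K)):
            pop[i] = pop[j]
    return sum(pop) / N, total_cost, total_welfare

coop, cost, welfare = run(theta)
print(f"final cooperation {coop:.2f}, incentive cost {cost:.0f}, social welfare {welfare:.0f}")
```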
Why we think this paper is great for you:
This research explores social welfare optimization, a key area for examining the impact of inequality and promoting cooperative solutions.
University of California
Abstract
We prove a quantitative version of Zhang's fundamental inequality for heights attached to polarizable endomorphisms. As an application, we obtain a gap principle for the Néron-Tate height on abelian varieties over function fields of arbitrary transcendence degree and characteristic zero, extending the result of Gao-Ge-Kühne. We also establish instances of effective gap principles for regular polynomial endomorphisms of $\mathbb{P}^2$, in the sense that all constants are explicit. These yield effective instances of uniformity in the dynamical Bogomolov conjecture in both the arithmetic and geometric settings, including examples in prime characteristic.
Why we think this paper is great for you:
The paper's focus on height and inequalities is relevant to understanding the distribution of resources and wealth, a central concern for your interests.
Halmstad University
AI Summary
  • Multi-agent systems exchange a single 'do-everything' agent for a team of specialised agents that co-operate (or compete) under explicit protocols. [3]
  • Planning- and self-improvement agents: A class of AI systems that use search and optimization techniques to solve complex problems. [3]
  • Embodied and web agents: AI systems that act in the world, either physically (embodied) or through interactions with untrusted websites and enterprise systems (web). [3]
  • Planning- and self-improvement agents can be prone to state explosion, speculative arithmetic errors, and over-confident selection. [3]
  • Planning- and self-improvement agents deliver substantial reliability dividends when their power is channelled through explicit controllers, trustworthy verifiers, and disciplined governance of cost and side-effects. [2]
Abstract
This chapter argues that the reliability of agentic and generative AI is chiefly an architectural property. We define agentic systems as goal-directed, tool-using decision makers operating in closed loops, and show how reliability emerges from principled componentisation (goal manager, planner, tool-router, executor, memory, verifiers, safety monitor, telemetry), disciplined interfaces (schema-constrained, validated, least-privilege tool calls), and explicit control and assurance loops. Building on classical foundations, we propose a practical taxonomy -- tool-using agents, memory-augmented agents, planning and self-improvement agents, multi-agent systems, and embodied or web agents -- and analyse how each pattern reshapes the reliability envelope and failure modes. We distil design guidance on typed schemas, idempotency, permissioning, transactional semantics, memory provenance and hygiene, runtime governance (budgets, termination conditions), and simulate-before-actuate safeguards.
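A minimal, standard-library sketch of two of the chapter's design points: schema-constrained, validated tool calls and a hard runtime budget. The tool, schema format, and budget here are illustrative assumptions, not the chapter's implementation.

```python
# Schema-constrained tool dispatch with a call budget (illustrative sketch).
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass
class Tool:
    name: str
    schema: Dict[str, type]        # argument name -> expected type (least-privilege surface)
    fn: Callable[..., Any]

class Agent:
    def __init__(self, tools: Dict[str, Tool], max_calls: int = 5):
        self.tools = tools
        self.max_calls = max_calls  # runtime governance: hard budget on tool calls
        self.calls = 0

    def call(self, name: str, args: Dict[str, Any]) -> Any:
        if self.calls >= self.max_calls:
            raise RuntimeError("budget exhausted: refusing further tool calls")
        tool = self.tools.get(name)
        if tool is None:
            raise ValueError(f"unknown tool {name!r}")
        # Schema validation before execution: reject unexpected or mistyped arguments.
        for key, value in args.items():
            expected = tool.schema.get(key)
            if expected is None:
                raise ValueError(f"argument {key!r} not allowed for tool {name!r}")
            if not isinstance(value, expected):
                raise TypeError(f"argument {key!r} must be {expected.__name__}")
        missing = set(tool.schema) - set(args)
        if missing:
            raise ValueError(f"missing arguments: {sorted(missing)}")
        self.calls += 1
        return tool.fn(**args)

# Hypothetical read-only tool: the agent may look up a value but not mutate state.
lookup = Tool("lookup", {"key": str}, lambda key: {"status": "ok", "key": key})
agent = Agent({"lookup": lookup}, max_calls=3)
print(agent.call("lookup", {"key": "reliability"}))
```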
Why we think this paper is great for you:
The exploration of agentic AI architectures aligns with an interest in systems that can be designed to address and potentially mitigate inequality.
Perplexity
AI Summary
  • The agent is used primarily for productivity-related tasks (36% of all queries), followed by learning, media, and shopping. [3]
  • Research, document editing, and shopping-related tasks appear consistently across occupation clusters. [3]
  • Knowledge-intensive sectors like digital technology, entrepreneurship, finance, and academia tend to use the agent for research and learning-related tasks. [3]
  • Productivity and learning topics are the most sticky, while travel is the least sticky. [2]
  • Users' first queries often fall into productivity, learning, or media topics, but over time, there's a shift towards more cognitively oriented use cases. [1]
Abstract
This paper presents the first large-scale field study of the adoption, usage intensity, and use cases of general-purpose AI agents operating in open-world web environments. Our analysis centers on Comet, an AI-powered browser developed by Perplexity, and its integrated agent, Comet Assistant. Drawing on hundreds of millions of anonymized user interactions, we address three fundamental questions: Who is using AI agents? How intensively are they using them? And what are they using them for? Our findings reveal substantial heterogeneity in adoption and usage across user segments. Earlier adopters, users in countries with higher GDP per capita and educational attainment, and individuals working in digital or knowledge-intensive sectors -- such as digital technology, academia, finance, marketing, and entrepreneurship -- are more likely to adopt or actively use the agent. To systematically characterize the substance of agent usage, we introduce a hierarchical agentic taxonomy that organizes use cases across three levels: topic, subtopic, and task. The two largest topics, Productivity & Workflow and Learning & Research, account for 57% of all agentic queries, while the two largest subtopics, Courses and Shopping for Goods, make up 22%. The top 10 out of 90 tasks represent 55% of queries. Personal use constitutes 55% of queries, while professional and educational contexts comprise 30% and 16%, respectively. In the short term, use cases exhibit strong stickiness, but over time users tend to shift toward more cognitively oriented topics. The diffusion of increasingly capable AI agents carries important implications for researchers, businesses, policymakers, and educators, inviting new lines of inquiry into this rapidly emerging class of AI capabilities.
Why we think this paper is great for you:
This paper investigates AI agents in real-world environments, offering insights into how these systems might interact with and potentially influence economic dynamics.
Northeastern University
AI Summary
  • The RAMTN system is a meta-interaction-based paradigm for human-machine collaborative cognitive enhancement that aims to provide intelligent assistance and knowledge sharing by extracting expert decision-making frameworks. Its core idea is to combine the cognitive processes of human experts with the information-processing capabilities of computer systems to support efficient decision-making and knowledge reasoning, with application areas including investment, healthcare, and education. Meta-interaction: a technique that couples human cognitive processes with computational information processing to enable efficient decision support and knowledge reasoning. Human-machine collaborative cognitive enhancement paradigm: a meta-interaction-based framework for intelligent assistance and knowledge sharing via extracted expert decision frameworks. The system is presented as an innovative solution with broad potential, but its development and application depend on large volumes of data and information resources, raising possible data-quality and reliability issues, and its security and privacy protections require further research. Meta-interaction techniques are widely applied and studied in decision support and knowledge reasoning. [3]
Abstract
Currently, there exists a fundamental divide between the "cognitive black box" (implicit intuition) of human experts and the "computational black box" (untrustworthy decision-making) of artificial intelligence (AI). This paper proposes a new paradigm of "human-AI collaborative cognitive enhancement," aiming to transform the dual black boxes into a composable, auditable, and extensible "functional white-box" system through structured "meta-interaction." The core breakthrough lies in the "plug-and-play cognitive framework"--a computable knowledge package that can be extracted from expert dialogues and loaded into the Recursive Adversarial Meta-Thinking Network (RAMTN). This enables expert thinking, such as medical diagnostic logic and teaching intuition, to be converted into reusable and scalable public assets, realizing a paradigm shift from "AI as a tool" to "AI as a thinking partner." This work not only provides the first engineering proof for "cognitive equity" but also opens up a new path for AI governance: constructing a verifiable and intervenable governance paradigm through "transparency of interaction protocols" rather than prying into the internal mechanisms of models. The framework is open-sourced to promote technology for good and cognitive inclusion. This paper is an independent exploratory research conducted by the author. All content presented, including the theoretical framework (RAMTN), methodology (meta-interaction), system implementation, and case validation, constitutes the author's individual research achievements.
Why we think this paper is great for you:
The focus on human-AI collaboration and addressing the 'black box' problem is pertinent to understanding how to build more transparent and accountable systems related to inequality.

We did not find much content matching your interests, so we've included some additional topics that are popular. Also be aware that if a topic is not present on arXiv, we won't be able to recommend it.

AI Agents
Halmstad University
Abstract
This chapter argues that the reliability of agentic and generative AI is chiefly an architectural property. We define agentic systems as goal-directed, tool-using decision makers operating in closed loops, and show how reliability emerges from principled componentisation (goal manager, planner, tool-router, executor, memory, verifiers, safety monitor, telemetry), disciplined interfaces (schema-constrained, validated, least-privilege tool calls), and explicit control and assurance loops. Building on classical foundations, we propose a practical taxonomy -- tool-using agents, memory-augmented agents, planning and self-improvement agents, multi-agent systems, and embodied or web agents -- and analyse how each pattern reshapes the reliability envelope and failure modes. We distil design guidance on typed schemas, idempotency, permissioning, transactional semantics, memory provenance and hygiene, runtime governance (budgets, termination conditions), and simulate-before-actuate safeguards.
AI Summary
  • Multi-agent systems exchange a single 'do-everything' agent for a team of specialised agents that co-operate (or compete) under explicit protocols. [3]
  • Planning- and self-improvement agents: A class of AI systems that use search and optimization techniques to solve complex problems. [3]
  • Embodied and web agents: AI systems that act in the world, either physically (embodied) or through interactions with untrusted websites and enterprise systems (web). [3]
  • Planning- and self-improvement agents can be prone to state explosion, speculative arithmetic errors, and over-confident selection. [3]
  • Planning- and self-improvement agents deliver substantial reliability dividends when their power is channelled through explicit controllers, trustworthy verifiers, and disciplined governance of cost and side-effects. [2]
Perplexity
Abstract
This paper presents the first large-scale field study of the adoption, usage intensity, and use cases of general-purpose AI agents operating in open-world web environments. Our analysis centers on Comet, an AI-powered browser developed by Perplexity, and its integrated agent, Comet Assistant. Drawing on hundreds of millions of anonymized user interactions, we address three fundamental questions: Who is using AI agents? How intensively are they using them? And what are they using them for? Our findings reveal substantial heterogeneity in adoption and usage across user segments. Earlier adopters, users in countries with higher GDP per capita and educational attainment, and individuals working in digital or knowledge-intensive sectors -- such as digital technology, academia, finance, marketing, and entrepreneurship -- are more likely to adopt or actively use the agent. To systematically characterize the substance of agent usage, we introduce a hierarchical agentic taxonomy that organizes use cases across three levels: topic, subtopic, and task. The two largest topics, Productivity & Workflow and Learning & Research, account for 57% of all agentic queries, while the two largest subtopics, Courses and Shopping for Goods, make up 22%. The top 10 out of 90 tasks represent 55% of queries. Personal use constitutes 55% of queries, while professional and educational contexts comprise 30% and 16%, respectively. In the short term, use cases exhibit strong stickiness, but over time users tend to shift toward more cognitively oriented topics. The diffusion of increasingly capable AI agents carries important implications for researchers, businesses, policymakers, and educators, inviting new lines of inquiry into this rapidly emerging class of AI capabilities.
AI Summary
  • The agent is used primarily for productivity-related tasks (36% of all queries), followed by learning, media, and shopping. [3]
  • Research, document editing, and shopping-related tasks appear consistently across occupation clusters. [3]
  • Knowledge-intensive sectors like digital technology, entrepreneurship, finance, and academia tend to use the agent for research and learning-related tasks. [3]
  • Productivity and learning topics are the most sticky, while travel is the least sticky. [2]
  • Users' first queries often fall into productivity, learning, or media topics, but over time, there's a shift towards more cognitively oriented use cases. [1]
AI and Society
Northeastern University
Abstract
Currently, there exists a fundamental divide between the "cognitive black box" (implicit intuition) of human experts and the "computational black box" (untrustworthy decision-making) of artificial intelligence (AI). This paper proposes a new paradigm of "human-AI collaborative cognitive enhancement," aiming to transform the dual black boxes into a composable, auditable, and extensible "functional white-box" system through structured "meta-interaction." The core breakthrough lies in the "plug-and-play cognitive framework"--a computable knowledge package that can be extracted from expert dialogues and loaded into the Recursive Adversarial Meta-Thinking Network (RAMTN). This enables expert thinking, such as medical diagnostic logic and teaching intuition, to be converted into reusable and scalable public assets, realizing a paradigm shift from "AI as a tool" to "AI as a thinking partner." This work not only provides the first engineering proof for "cognitive equity" but also opens up a new path for AI governance: constructing a verifiable and intervenable governance paradigm through "transparency of interaction protocols" rather than prying into the internal mechanisms of models. The framework is open-sourced to promote technology for good and cognitive inclusion. This paper is an independent exploratory research conducted by the author. All content presented, including the theoretical framework (RAMTN), methodology (meta-interaction), system implementation, and case validation, constitutes the author's individual research achievements.
AI Summary
  • The RAMTN system is a meta-interaction-based paradigm for human-machine collaborative cognitive enhancement that aims to provide intelligent assistance and knowledge sharing by extracting expert decision-making frameworks. Its core idea is to combine the cognitive processes of human experts with the information-processing capabilities of computer systems to support efficient decision-making and knowledge reasoning, with application areas including investment, healthcare, and education. Meta-interaction: a technique that couples human cognitive processes with computational information processing to enable efficient decision support and knowledge reasoning. Human-machine collaborative cognitive enhancement paradigm: a meta-interaction-based framework for intelligent assistance and knowledge sharing via extracted expert decision frameworks. The system is presented as an innovative solution with broad potential, but its development and application depend on large volumes of data and information resources, raising possible data-quality and reliability issues, and its security and privacy protections require further research. Meta-interaction techniques are widely applied and studied in decision support and knowledge reasoning. [3]
Research Automation with AI
Peking University
Abstract
We investigate how large language models can be used as research tools in scientific computing while preserving mathematical rigor. We propose a human-in-the-loop workflow for interactive theorem proving and discovery with LLMs. Human experts retain control over problem formulation and admissible assumptions, while the model searches for proofs or contradictions, proposes candidate properties and theorems, and helps construct structures and parameters that satisfy explicit constraints, supported by numerical experiments and simple verification checks. Experts treat these outputs as raw material, further refine them, and organize the results into precise statements and rigorous proofs. We instantiate this workflow in a case study on the connection between manifold optimization and Grover's quantum search algorithm, where the pipeline helps identify invariant subspaces, explore Grover-compatible retractions, and obtain convergence guarantees for the retraction-based gradient method. The framework provides a practical template for integrating large language models into frontier mathematical research, enabling faster exploration of proof space and algorithm design while maintaining transparent reasoning responsibilities. Although illustrated on manifold optimization problems in quantum computing, the principles extend to other core areas of scientific computing.
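A minimal sketch of the retraction-based Riemannian gradient method mentioned in the abstract, run on the unit sphere with a toy quadratic objective; this illustrates the general technique, not the paper's Grover-specific construction.

```python
# Retraction-based gradient descent on the unit sphere for f(x) = x^T A x,
# whose minimiser is the eigenvector of A's smallest eigenvalue (toy example).
import numpy as np

def riemannian_grad(A, x):
    """Project the Euclidean gradient 2Ax onto the tangent space at x."""
    g = 2.0 * A @ x
    return g - (x @ g) * x

def retract(x, v):
    """Normalisation retraction on the sphere: R_x(v) = (x + v) / ||x + v||."""
    y = x + v
    return y / np.linalg.norm(y)

def sphere_gradient_descent(A, x0, step=0.2, iters=500):
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        x = retract(x, -step * riemannian_grad(A, x))
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A = A @ A.T
A /= np.linalg.norm(A, 2)        # scale to unit spectral norm so the fixed step is safe
x = sphere_gradient_descent(A, rng.standard_normal(4))
print("f(x) =", float(x @ A @ x))
print("smallest eigenvalue =", float(np.linalg.eigvalsh(A).min()))
```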
AI Summary
  • Previous research has shown that human-AI collaboration can improve performance in various tasks, including theorem discovery and proof verification. [3]
  • The collaboration between human experts and an LLM is organized into three stages, starting from an informal conjecture and ending with a precise theorem and proof. [2]
  • Human-AI collaboration can significantly improve mathematical proof and theorem discovery. [1]
German Cancer Research
Abstract
Developing generalizable AI for medical imaging requires both access to large, multi-center datasets and standardized, reproducible tooling within research environments. However, leveraging real-world imaging data in clinical research environments is still hampered by strict regulatory constraints, fragmented software infrastructure, and the challenges inherent in conducting large-cohort multicentre studies. This leads to projects that rely on ad-hoc toolchains that are hard to reproduce, difficult to scale beyond single institutions and poorly suited for collaboration between clinicians and data scientists. We present Kaapana, a comprehensive open-source platform for medical imaging research that is designed to bridge this gap. Rather than building single-use, site-specific tooling, Kaapana provides a modular, extensible framework that unifies data ingestion, cohort curation, processing workflows and result inspection under a common user interface. By bringing the algorithm to the data, it enables institutions to keep control over their sensitive data while still participating in distributed experimentation and model development. By integrating flexible workflow orchestration with user-facing applications for researchers, Kaapana reduces technical overhead, improves reproducibility and enables conducting large-scale, collaborative, multi-centre imaging studies. We describe the core concepts of the platform and illustrate how they can support diverse use cases, from local prototyping to nation-wide research networks. The open-source codebase is available at https://github.com/kaapana/kaapana
AGI: Artificial General Intelligence
Meta
Abstract
Real-world AI software engineering demands coding agents that can reason over massive repositories, maintain durable memory across and within long sessions, and robustly coordinate complex toolchains at test time. Existing open-source coding agents provide transparency but frequently fall short when pushed to these industrial-scale workloads, while proprietary coding agents offer strong practical performance but limited extensibility, interpretability, and controllability. We present the Confucius Code Agent (CCA), an open-sourced AI software engineer that can operate at an industrial scale. CCA is built atop the Confucius SDK, an open-sourced agent development platform designed around three complementary perspectives: Agent Experience (AX), User Experience (UX), and Developer Experience (DX). The SDK introduces a unified orchestrator with hierarchical working memory for long-context reasoning, a persistent note-taking system for cross-session continual learning, and a modular extension module for robust tool use. Moreover, a meta-agent automates the synthesis, evaluation, and refinement of agent configurations through a build-test-improve loop, enabling rapid agent development on new tasks, environments, and tool stacks. Instantiated on Confucius SDK with these mechanisms, CCA delivers strong performance on real-world software engineering tasks. On SWE-Bench-Pro, CCA achieves a state-of-the-art Resolve@1 performance of 54.3%, substantially improving over prior coding agents. Together, the Confucius SDK and CCA provide a transparent, extensible, and reproducible foundation for AI agents, bridge gaps between research prototypes and production-grade systems, and support agent development and deployment at industrial scale.
AI Summary
  • SWE-Bench: a comprehensive benchmark to evaluate autonomous code-writing and code-fixing agents on realistic tasks. [3]
  • The combination of monorepo development and LLM-based tools like ECO underscores a trend toward holistic scale: treating an entire organization’s code as a single evolvable system, with AI agents providing the intelligence to manage global changes, dependency analysis, and performance tuning in ways humans alone could not easily scale. [2]
  • Large-scale software engineering has driven interest in AI assistance for code discovery, understanding, and consistent changes at scale. [1]
Abstract
Foundation models (FMs) are increasingly assuming the role of the "brain" of AI agents. While recent efforts have begun to equip FMs with native single-agent abilities -- such as GUI interaction or integrated tool use -- we argue that the next frontier is endowing FMs with native multi-agent intelligence. We identify four core capabilities of FMs in multi-agent contexts: understanding, planning, efficient communication, and adaptation. Contrary to assumptions about the spontaneous emergence of such abilities, we provide extensive empirical evidence across 41 large language models showing that strong single-agent performance alone does not automatically yield robust multi-agent intelligence. To address this gap, we outline key research directions -- spanning dataset construction, evaluation, training paradigms, and safety considerations -- for building FMs with native multi-agent intelligence.
Deep Learning
Universidad de Guanajuato
Abstract
This document reports the sequence of practices and methodologies implemented during the Big Data course. It details the workflow beginning with the processing of the Epsilon dataset through group and individual strategies, followed by text analysis and classification with RestMex and movie feature analysis with IMDb. Finally, it describes the technical implementation of a distributed computing cluster with Apache Spark on Linux using Scala.
AI Summary
  • In the big data era, data completeness can be as important as algorithm sophistication. [3]
  • Key themes include big data analytics, distributed computing, scalability, algorithm sophistication, and data completeness; the chronological progression demonstrates that mastering big data requires a systematic approach. [3]
  • The choice between local and distributed architectures is not merely about computational resources, but about the quality and completeness of the data available to the model. [2]
National University of
Abstract
Accurate forecasting of urban air pollution is essential for protecting public health and guiding mitigation policies. While Deep Learning (DL) and hybrid pipelines dominate recent research, their complexity and limited interpretability hinder operational use. This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter (PM$_{2.5}$, PM$_{10}$) in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Both models were trained with pollutant and precursor regressors, with NP additionally leveraging lagged dependencies. For context, two machine learning baselines (LSTM, LightGBM) and one traditional statistical model (SARIMAX) were also implemented. Performance was evaluated on a 7-day holdout using MAE, RMSE, and $R^2$. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants. These findings demonstrate that interpretable additive models remain competitive with both traditional and complex approaches, offering a practical balance of accuracy, transparency, and ease of deployment.
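A minimal sketch of the kind of additive-model setup the abstract describes, assuming the prophet package and a single synthetic meteorological regressor; the data, column names, and horizon are illustrative, not the authors' Beijing dataset or feature-selection pipeline.

```python
# Prophet with one extra regressor, fit on a chronological split with a 7-day holdout.
import numpy as np
import pandas as pd
from prophet import Prophet   # pip install prophet

# Synthetic daily PM2.5-like series with a seasonal cycle plus a wind regressor.
rng = np.random.default_rng(0)
days = pd.date_range("2022-01-01", periods=730, freq="D")
wind = rng.gamma(2.0, 1.5, size=len(days))
y = 60 + 10 * np.sin(2 * np.pi * days.dayofyear / 365) - 5 * wind + rng.normal(0, 4, len(days))
df = pd.DataFrame({"ds": days, "y": y, "wind": wind})

train, test = df.iloc[:-7], df.iloc[-7:]          # chronological split, 7-day holdout

m = Prophet(weekly_seasonality=True, yearly_seasonality=True)
m.add_regressor("wind")                           # meteorological/pollutant regressor
m.fit(train)

forecast = m.predict(test[["ds", "wind"]])        # future regressor values must be supplied
mae = np.mean(np.abs(forecast["yhat"].values - test["y"].values))
print(f"7-day holdout MAE: {mae:.2f}")
```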
AI Summary
  • The study also explores the impact of different input features on the performance of the models and finds that using both air quality index and weather data improves the predictive power of the models. [3]
  • AQI: Air Quality Index; MAE: Mean Absolute Error. The study demonstrates the effectiveness of machine learning models in predicting AQI and highlights the importance of using both air-quality and weather data for improved predictive power. [3]
  • The results of this study can be used to inform policy decisions related to air pollution control and mitigation strategies. [3]
  • The study only evaluates the performance of different models on a single dataset and does not explore the generalizability of the results to other locations or datasets. [3]
  • The authors do not provide any discussion on the limitations of the study, such as the potential impact of data quality issues or the lack of consideration for non-linear relationships between input features. [3]
  • The paper presents a comparative study of various machine learning models for predicting air quality indices (AQIs) in Beijing, China. [2]
  • The results show that the Prophet model outperforms other models in terms of accuracy, with a mean absolute error (MAE) of 4.35 μg/m³. [1]

Interests not found

We did not find any papers matching the interests below. Try other terms, and also consider whether the content exists on arXiv.org.
  • Social Inequality
You can edit or add more interests any time.