Hi!

Your personalized paper recommendations for 8–12 December 2025.
🎯 Top Personalized Recommendations
IT University of Denmark
AI Summary
  • Garbage collection is particularly important for Rizzo since unused signals can accumulate in the heap. [3]
  • Advance Semantics: The operational semantics that performs the delayed computation of a signal, producing a new value. [3]
  • In practice, a single global heap can be used to store signals, with a linked list structure providing the necessary ordering of signals required for the update semantics. [2]
  • The operational semantics of Rizzo provides a precise account of the operational behavior of the language, allowing for formal statements of the operational guarantees provided by the type system. [1]
Abstract
Functional reactive programming (FRP) is a declarative programming paradigm for implementing reactive programs at a high level of abstraction. It applies functional programming principles to construct and manipulate time-varying values, also known as signals. However, for this programming paradigm to work in practice, an FRP language must ensure that programs are causal, productive, and free from space leaks. Over the past fifteen years, several modal type systems to enforce these operational properties have been developed. We present a new FRP language with a significantly simplified modal type system that imposes fewer restrictions than previous modal FRP languages while still guaranteeing the central operational properties of causality, productivity, and absence of space leaks. The key enabling idea is to alter the semantics of signals so that the type system can safely allow more programs to type-check, which also makes the language more expressive. With this new semantics, signals are modelled as mutable references whose mutability is tightly controlled by the 'later' type modality. This disciplined form of mutability also enables more efficient in-place updates of signals, all while preserving a functional programming style.
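To make the abstract's key idea concrete, here is a minimal Python sketch of signals as mutable references with a delayed tail; the thunk stands in for the 'later' modality. All names (Signal, advance, nats_from) are illustrative assumptions, not the paper's formalism.

```python
# A minimal sketch (illustrative assumptions, not the paper's formalism) of
# the semantics the abstract describes: a signal is a mutable reference
# holding its current value, with the next step hidden behind a thunk that
# stands in for the 'later' modality.
from __future__ import annotations
from typing import Callable, Generic, TypeVar

A = TypeVar("A")

class Signal(Generic[A]):
    def __init__(self, head: A, later: Callable[[], "Signal[A]"]):
        self.head = head      # the value now
        self._later = later   # delayed computation of the next step

    def advance(self) -> None:
        # In-place update: overwrite this reference with the next step rather
        # than allocating a fresh signal, which is what avoids a space leak.
        # The paper's modal type system controls when this is allowed;
        # Python cannot enforce that discipline.
        nxt = self._later()
        self.head, self._later = nxt.head, nxt._later

def nats_from(n: int) -> Signal[int]:
    # The signal 0, 1, 2, ...; each tail is delayed until time advances.
    return Signal(n, lambda: nats_from(n + 1))

s = nats_from(0)
s.advance()
print(s.head)  # 1
```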
Why we think this paper is great for you:
This paper directly addresses functional programming, a core interest, by exploring modal types within functional reactive programming, a paradigm closely aligned with your stated preferences.
Independent Researcher
AI Summary
  • Communicative relations: The relationships between elements that involve data flow and shared resources. [3]
  • The framework consists of two dimensions: Causal-Temporal relations (Sequential, Branch, or Event) and Communicative relations (what flows between elements and what they share). [2]
  • The paper presents a new framework for understanding program comprehension called FLARE v2. [1]
Abstract
Building on the classroom framework reported in Heath et al. (2025), this paper proposes FLARE v2 as a recursive, semiotically informed account of how program meaning is constructed. It reinterprets the descriptive tiers of FLARE v1 as instances of a single generative operation: identify elements (characterised by the four properties Receives, Sends, Effects, Shares); analyse their bindings along two dimensions (Causal-Temporal and Communicative); and recognise the new element that emerges. The Causal-Temporal dimension encompasses three subtypes - Sequential, Branch, and Event - that together account for control flow in both procedural and event-driven environments. A Compositional Ladder provides a visual parallel between literacy progressions and programming structures, illustrating how recursive composition operates from blocks and statements through segments, systems, and services. The framework aims to address conceptual and cognitive-load limitations reported in FLARE v1 and is situated within semiotic and program-comprehension theory. FLARE v2 is presented as a conceptual lens with potential implications for pedagogy and curriculum design; implementation and empirical evaluation are left for future work.
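For readers who want the vocabulary pinned down, a small encoding of the framework's schema as the abstract states it: elements carrying the four properties (Receives, Sends, Effects, Shares) and bindings typed along the two dimensions. Only the terms come from the abstract; the encoding itself is an illustrative assumption.

```python
# A purely illustrative encoding of the FLARE v2 vocabulary from the
# abstract: elements with the four properties and bindings typed along the
# Causal-Temporal and Communicative dimensions.
from dataclasses import dataclass, field
from enum import Enum

class CausalTemporal(Enum):
    SEQUENTIAL = "Sequential"
    BRANCH = "Branch"
    EVENT = "Event"

@dataclass
class Element:
    name: str
    receives: list[str] = field(default_factory=list)  # what it takes in
    sends: list[str] = field(default_factory=list)     # what it passes on
    effects: list[str] = field(default_factory=list)   # side effects it causes
    shares: list[str] = field(default_factory=list)    # resources it shares

@dataclass
class Binding:
    source: Element
    target: Element
    causal_temporal: CausalTemporal  # Sequential, Branch, or Event
    communicative: list[str]         # what flows between / is shared by them

# Example: an event binding from a button element to a logger element.
button = Element("button", sends=["click"])
logger = Element("logger", receives=["click"], effects=["writes log file"])
binding = Binding(button, logger, CausalTemporal.EVENT, ["click event"])
```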
Why we think this paper is great for you:
The focus on program comprehension across languages is highly relevant to your interest in programming language design and paradigms, offering insight into how programs are understood.
Beijing University of A
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance in code intelligence tasks such as code generation, summarization, and translation. However, their reliance on linearized token sequences limits their ability to understand the structural semantics of programs. While prior studies have explored graph-augmented prompting and structure-aware pretraining, they either suffer from prompt length constraints or require task-specific architectural changes that are incompatible with large-scale instruction-following LLMs. To address these limitations, this paper proposes CGBridge, a novel plug-and-play method that enhances LLMs with Code Graph information through an external, trainable Bridge module. CGBridge first pre-trains a code graph encoder via self-supervised learning on a large-scale dataset of 270K code graphs to learn structural code semantics. It then trains an external module to bridge the modality gap among code, graph, and text by aligning their semantics through cross-modal attention mechanisms. Finally, the bridge module generates structure-informed prompts, which are injected into a frozen LLM, and is fine-tuned for downstream code intelligence tasks. Experiments show that CGBridge achieves notable improvements over both the original model and the graph-augmented prompting method. Specifically, it yields a 16.19% and 9.12% relative gain in LLM-as-a-Judge on code summarization, and a 9.84% and 38.87% relative gain in Execution Accuracy on code translation. Moreover, CGBridge achieves over 4x faster inference than LoRA-tuned models, demonstrating both effectiveness and efficiency in structure-aware code understanding.
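A hedged sketch of the bridge idea as the abstract describes it: learned query tokens cross-attend to code-graph node encodings and are projected into the embedding space of a frozen LLM as structure-informed prompts. Module names, dimensions, and this exact layout are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class BridgeModule(nn.Module):
    def __init__(self, graph_dim: int = 256, llm_dim: int = 1024,
                 n_prompt_tokens: int = 16, n_heads: int = 8):
        super().__init__()
        # Learned queries that will become the soft-prompt tokens.
        self.queries = nn.Parameter(torch.randn(n_prompt_tokens, graph_dim))
        self.cross_attn = nn.MultiheadAttention(graph_dim, n_heads,
                                                batch_first=True)
        self.to_llm = nn.Linear(graph_dim, llm_dim)  # into LLM embedding space

    def forward(self, graph_nodes: torch.Tensor) -> torch.Tensor:
        # graph_nodes: (batch, n_nodes, graph_dim), produced by a pretrained,
        # self-supervised code-graph encoder (not shown here).
        batch = graph_nodes.size(0)
        q = self.queries.unsqueeze(0).expand(batch, -1, -1)
        fused, _ = self.cross_attn(q, graph_nodes, graph_nodes)
        # The returned prompts would be prepended to the frozen LLM's inputs.
        return self.to_llm(fused)  # (batch, n_prompt_tokens, llm_dim)
```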
Why we think this paper is great for you:
Given your interest in programming languages and design patterns, this paper's exploration of how LLMs can understand program structure is a promising area of investigation.
Northeastern University
AI Summary
  • The RAMTN system is a meta-interaction-based paradigm for human-machine collaborative cognitive enhancement, aiming to provide intelligent assistance and knowledge sharing by extracting expert decision-making frameworks. Its core idea is to combine the cognitive processes of human experts with the information-processing capabilities of computer systems to achieve efficient decision support and knowledge reasoning. Intended application domains include investment, healthcare, and education, where extracted expert frameworks are meant to improve decision accuracy and efficiency. Meta-interaction here denotes the technique that couples human cognition with machine information processing. The system's development and deployment depend on large volumes of data and information resources, so data quality and reliability may be concerns, and its security and privacy protections require further research. [3]
Abstract
Currently, there exists a fundamental divide between the "cognitive black box" (implicit intuition) of human experts and the "computational black box" (untrustworthy decision-making) of artificial intelligence (AI). This paper proposes a new paradigm of "human-AI collaborative cognitive enhancement," aiming to transform the dual black boxes into a composable, auditable, and extensible "functional white-box" system through structured "meta-interaction." The core breakthrough lies in the "plug-and-play cognitive framework"--a computable knowledge package that can be extracted from expert dialogues and loaded into the Recursive Adversarial Meta-Thinking Network (RAMTN). This enables expert thinking, such as medical diagnostic logic and teaching intuition, to be converted into reusable and scalable public assets, realizing a paradigm shift from "AI as a tool" to "AI as a thinking partner." This work not only provides the first engineering proof for "cognitive equity" but also opens up a new path for AI governance: constructing a verifiable and intervenable governance paradigm through "transparency of interaction protocols" rather than prying into the internal mechanisms of models. The framework is open-sourced to promote technology for good and cognitive inclusion. This paper is an independent exploratory research conducted by the author. All content presented, including the theoretical framework (RAMTN), methodology (meta-interaction), system implementation, and case validation, constitutes the author's individual research achievements.
Why we think this paper is great for you:
The concept of human-AI collaboration and cognitive enhancement resonates with your interest in programming paradigms and, potentially, the design of intelligent systems.
Halmstad University
AI Summary
  • Multi-agent systems exchange a single 'do-everything' agent for a team of specialised agents that co-operate (or compete) under explicit protocols. [3]
  • Planning- and self-improvement agents: A class of AI systems that use search and optimization techniques to solve complex problems. [3]
  • Embodied and web agents: AI systems that act in the world, either physically (embodied) or through interactions with untrusted websites and enterprise systems (web). [3]
  • Planning- and self-improvement agents can be prone to state explosion, speculative arithmetic errors, and over-confident selection. [3]
  • Planning- and self-improvement agents deliver substantial reliability dividends when their power is channelled through explicit controllers, trustworthy verifiers, and disciplined governance of cost and side-effects. [2]
Abstract
This chapter argues that the reliability of agentic and generative AI is chiefly an architectural property. We define agentic systems as goal-directed, tool-using decision makers operating in closed loops, and show how reliability emerges from principled componentisation (goal manager, planner, tool-router, executor, memory, verifiers, safety monitor, telemetry), disciplined interfaces (schema-constrained, validated, least-privilege tool calls), and explicit control and assurance loops. Building on classical foundations, we propose a practical taxonomy (tool-using agents, memory-augmented agents, planning and self-improvement agents, multi-agent systems, and embodied or web agents) and analyse how each pattern reshapes the reliability envelope and failure modes. We distil design guidance on typed schemas, idempotency, permissioning, transactional semantics, memory provenance and hygiene, runtime governance (budgets, termination conditions), and simulate-before-actuate safeguards.
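Two of the chapter's interface disciplines, schema-constrained tool calls and idempotency, are easy to illustrate. The sketch below is a minimal reading of those ideas; the tool schema, cache, and function names are illustrative assumptions.

```python
from typing import Any, Callable

TOOL_SCHEMAS: dict[str, dict[str, type]] = {
    "create_ticket": {"title": str, "priority": int},  # assumed example tool
}
_idempotency_cache: dict[str, Any] = {}

def call_tool(name: str, args: dict[str, Any], idempotency_key: str,
              impl: Callable[..., Any]) -> Any:
    if name not in TOOL_SCHEMAS:                     # least privilege: allowlist
        raise PermissionError(f"tool {name!r} is not on the allowlist")
    schema = TOOL_SCHEMAS[name]
    if set(args) != set(schema):                     # schema-constrained call
        raise ValueError(f"{name}: expected exactly fields {sorted(schema)}")
    for fld, ty in schema.items():
        if not isinstance(args[fld], ty):
            raise TypeError(f"{name}.{fld} must be {ty.__name__}")
    if idempotency_key in _idempotency_cache:        # retry returns first result,
        return _idempotency_cache[idempotency_key]   # never repeats side effects
    result = impl(**args)
    _idempotency_cache[idempotency_key] = result
    return result
```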
Why we think this paper is great for you:
The exploration of agentic AI, defined here as goal-directed, tool-using systems, aligns with your interest in designing intelligent systems and, potentially, developing autonomous tools.
Abstract
Foundation models (FMs) are increasingly assuming the role of the "brain" of AI agents. While recent efforts have begun to equip FMs with native single-agent abilities -- such as GUI interaction or integrated tool use -- we argue that the next frontier is endowing FMs with native multi-agent intelligence. We identify four core capabilities of FMs in multi-agent contexts: understanding, planning, efficient communication, and adaptation. Contrary to assumptions about the spontaneous emergence of such abilities, we provide extensive empirical evidence across 41 large language models showing that strong single-agent performance alone does not automatically yield robust multi-agent intelligence. To address this gap, we outline key research directions -- spanning dataset construction, evaluation, training paradigms, and safety considerations -- for building FMs with native multi-agent intelligence.
Why we think this paper is great for you:
This paper's focus on equipping foundation models with multi-agent intelligence directly addresses the design of sophisticated, adaptable systems, a key area of interest for you.
Meta
AI Summary
  • SWE-Bench: a comprehensive benchmark to evaluate autonomous code-writing and code-fixing agents on realistic tasks. [3]
  • The combination of monorepo development and LLM-based tools like ECO underscores a trend toward holistic scale: treating an entire organization’s code as a single evolvable system, with AI agents providing the intelligence to manage global changes, dependency analysis, and performance tuning in ways humans alone could not easily scale. [2]
  • Large-scale software engineering has driven interest in AI assistance for code discovery, understanding, and consistent changes at scale. [1]
Abstract
Real-world AI software engineering demands coding agents that can reason over massive repositories, maintain durable memory across and within long sessions, and robustly coordinate complex toolchains at test time. Existing open-source coding agents provide transparency but frequently fall short when pushed to these industrial-scale workloads, while proprietary coding agents offer strong practical performance but limited extensibility, interpretability, and controllability. We present the Confucius Code Agent (CCA), an open-sourced AI software engineer that can operate at an industrial scale. CCA is built atop the Confucius SDK, an open-sourced agent development platform designed around three complementary perspectives: Agent Experience (AX), User Experience (UX), and Developer Experience (DX). The SDK introduces a unified orchestrator with hierarchical working memory for long-context reasoning, a persistent note-taking system for cross-session continual learning, and a modular extension module for robust tool use. Moreover, a meta-agent automates the synthesis, evaluation, and refinement of agent configurations through a build-test-improve loop, enabling rapid agent development on new tasks, environments, and tool stacks. Instantiated on Confucius SDK with these mechanisms, CCA delivers strong performance on real-world software engineering tasks. On SWE-Bench-Pro, CCA achieves a state-of-the-art Resolve@1 performance of 54.3%, substantially improving over prior coding agents. Together, the Confucius SDK and CCA provide a transparent, extensible, and reproducible foundation for AI agents, bridge gaps between research prototypes and production-grade systems, and support agent development and deployment at industrial scale.
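The abstract's "hierarchical working memory" suggests a two-tier store: recent turns kept verbatim, older turns folded into a persistent running note. Below is a speculative Python sketch of that pattern; the summarise hook and these names are assumptions, not the Confucius SDK's API.

```python
from collections import deque
from typing import Callable

class WorkingMemory:
    def __init__(self, summarise: Callable[[str], str], max_recent: int = 8):
        self.recent: deque[str] = deque(maxlen=max_recent)  # verbatim tier
        self.notes = ""              # compressed tier, persisted across sessions
        self._summarise = summarise  # e.g. an LLM summarisation call

    def add(self, turn: str) -> None:
        if len(self.recent) == self.recent.maxlen:
            evicted = self.recent[0]  # about to fall out of the verbatim tier
            self.notes = self._summarise(self.notes + "\n" + evicted)
        self.recent.append(turn)

    def context(self) -> str:
        # What the agent sees each step: durable notes, then recent turns.
        return self.notes + "\n" + "\n".join(self.recent)
```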
Why we think this paper is great for you:
The development of coding agents capable of complex toolchain coordination aligns with your interest in programming language design and, potentially, the automation of software development tasks.
Programming Paradigms
IEEE
Abstract
Industry demands are growing for hyper-distributed applications that span from the cloud to the edge in domains such as smart manufacturing, transportation, and agriculture. Yet today's solutions struggle to meet these demands due to inherent limitations in scalability, interoperability, and trust. In this article, we introduce HERMES (Heterogeneous Computing Continuum with Resource Monetization, Orchestration, and Semantic) - a novel framework designed to transform connectivity and data utilization across the computing continuum. HERMES establishes an open, seamless, and secure environment where resources, from cloud servers to tiny edge devices, can be orchestrated intelligently, data and services can be monetized in a distributed marketplace, and knowledge is shared through semantic interoperability. By bridging these key facets, HERMES lays a foundation for a new generation of distributed applications that are more efficient, trustworthy, and autonomous.
AI Agents
Perplexity
Abstract
This paper presents the first large-scale field study of the adoption, usage intensity, and use cases of general-purpose AI agents operating in open-world web environments. Our analysis centers on Comet, an AI-powered browser developed by Perplexity, and its integrated agent, Comet Assistant. Drawing on hundreds of millions of anonymized user interactions, we address three fundamental questions: Who is using AI agents? How intensively are they using them? And what are they using them for? Our findings reveal substantial heterogeneity in adoption and usage across user segments. Earlier adopters, users in countries with higher GDP per capita and educational attainment, and individuals working in digital or knowledge-intensive sectors -- such as digital technology, academia, finance, marketing, and entrepreneurship -- are more likely to adopt or actively use the agent. To systematically characterize the substance of agent usage, we introduce a hierarchical agentic taxonomy that organizes use cases across three levels: topic, subtopic, and task. The two largest topics, Productivity & Workflow and Learning & Research, account for 57% of all agentic queries, while the two largest subtopics, Courses and Shopping for Goods, make up 22%. The top 10 out of 90 tasks represent 55% of queries. Personal use constitutes 55% of queries, while professional and educational contexts comprise 30% and 16%, respectively. In the short term, use cases exhibit strong stickiness, but over time users tend to shift toward more cognitively oriented topics. The diffusion of increasingly capable AI agents carries important implications for researchers, businesses, policymakers, and educators, inviting new lines of inquiry into this rapidly emerging class of AI capabilities.
AI Summary
  • The agent is used primarily for productivity-related tasks (36% of all queries), followed by learning, media, and shopping. [3]
  • Research, document editing, and shopping-related tasks appear consistently across occupation clusters. [3]
  • Knowledge-intensive sectors like digital technology, entrepreneurship, finance, and academia tend to use the agent for research and learning-related tasks. [3]
  • Productivity and learning topics are the most sticky, while travel is the least sticky. [2]
  • Users' first queries often fall into productivity, learning, or media topics, but over time, there's a shift towards more cognitively oriented use cases. [1]
Research Automation with AI
Peking University
Abstract
We investigate how large language models can be used as research tools in scientific computing while preserving mathematical rigor. We propose a human-in-the-loop workflow for interactive theorem proving and discovery with LLMs. Human experts retain control over problem formulation and admissible assumptions, while the model searches for proofs or contradictions, proposes candidate properties and theorems, and helps construct structures and parameters that satisfy explicit constraints, supported by numerical experiments and simple verification checks. Experts treat these outputs as raw material, further refine them, and organize the results into precise statements and rigorous proofs. We instantiate this workflow in a case study on the connection between manifold optimization and Grover's quantum search algorithm, where the pipeline helps identify invariant subspaces, explore Grover-compatible retractions, and obtain convergence guarantees for the retraction-based gradient method. The framework provides a practical template for integrating large language models into frontier mathematical research, enabling faster exploration of proof space and algorithm design while maintaining transparent reasoning responsibilities. Although illustrated on manifold optimization problems in quantum computing, the principles extend to other core areas of scientific computing.
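One step of the workflow, cheap numerical screening of model-proposed candidates before the expert refines survivors into precise statements and proofs, can be sketched directly. The invariant-subspace check below echoes the case study; the function names and the stand-in for the LLM proposal step are assumptions.

```python
import numpy as np

def maps_subspace_into_itself(op: np.ndarray, basis: np.ndarray,
                              tol: float = 1e-8) -> bool:
    # basis: (n, k) matrix whose columns span a candidate subspace.
    image = op @ basis
    # Project the image back onto span(basis); a small residual means the
    # operator (approximately) preserves the subspace.
    coeffs, *_ = np.linalg.lstsq(basis, image, rcond=None)
    residual = image - basis @ coeffs
    return bool(np.linalg.norm(residual) < tol)

def screen(candidate_ops: list[np.ndarray], basis: np.ndarray):
    # Cheap verification filter: only survivors reach the human expert.
    return [op for op in candidate_ops
            if maps_subspace_into_itself(op, basis)]
```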
AI Summary
  • Previous research has shown that human-AI collaboration can improve performance in various tasks, including theorem discovery and proof verification. [3]
  • The collaboration between human experts and an LLM is organized into three stages, starting from an informal conjecture and ending with a precise theorem and proof. [2]
  • Human-AI collaboration can significantly improve mathematical proof and theorem discovery. [1]
German Cancer Research
Abstract
Developing generalizable AI for medical imaging requires both access to large, multi-center datasets and standardized, reproducible tooling within research environments. However, leveraging real-world imaging data in clinical research environments is still hampered by strict regulatory constraints, fragmented software infrastructure, and the challenges inherent in conducting large-cohort multicentre studies. This leads to projects that rely on ad-hoc toolchains that are hard to reproduce, difficult to scale beyond single institutions and poorly suited for collaboration between clinicians and data scientists. We present Kaapana, a comprehensive open-source platform for medical imaging research that is designed to bridge this gap. Rather than building single-use, site-specific tooling, Kaapana provides a modular, extensible framework that unifies data ingestion, cohort curation, processing workflows and result inspection under a common user interface. By bringing the algorithm to the data, it enables institutions to keep control over their sensitive data while still participating in distributed experimentation and model development. By integrating flexible workflow orchestration with user-facing applications for researchers, Kaapana reduces technical overhead, improves reproducibility and enables conducting large-scale, collaborative, multi-centre imaging studies. We describe the core concepts of the platform and illustrate how they can support diverse use cases, from local prototyping to nation-wide research networks. The open-source codebase is available at https://github.com/kaapana/kaapana
Deep Learning
Universidad de Guanajuato
Abstract
This document reports the sequence of practices and methodologies implemented during the Big Data course. It details the workflow beginning with the processing of the Epsilon dataset through group and individual strategies, followed by text analysis and classification with RestMex and movie feature analysis with IMDb. Finally, it describes the technical implementation of a distributed computing cluster with Apache Spark on Linux using Scala.
AI Summary
  • In the big data era, data completeness can be as important as algorithm sophistication. [3]
  • Key concepts: Big Data Analytics, Distributed Computing, Scalability, Algorithm Sophistication, Data Completeness. The chronological progression demonstrates that mastering big data requires a systematic approach. [3]
  • The choice between local and distributed architectures is not merely about computational resources, but about the quality and completeness of the data available to the model. [2]
National University of
Abstract
Accurate forecasting of urban air pollution is essential for protecting public health and guiding mitigation policies. While Deep Learning (DL) and hybrid pipelines dominate recent research, their complexity and limited interpretability hinder operational use. This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter (PM$_{2.5}$, PM$_{10}$) in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Both models were trained with pollutant and precursor regressors, with NP additionally leveraging lagged dependencies. For context, two machine learning baselines (LSTM, LightGBM) and one traditional statistical model (SARIMAX) were also implemented. Performance was evaluated on a 7-day holdout using MAE, RMSE, and $R^2$. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants. These findings demonstrate that interpretable additive models remain competitive with both traditional and complex approaches, offering a practical balance of accuracy, transparency, and ease of deployment.
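The abstract's leakage-safe setup (chronological split, scaling fit on the training window only, Prophet with extra pollutant and meteorological regressors) can be sketched as follows. Column names, the hourly frequency, and this exact pipeline are assumptions; only the general recipe comes from the abstract.

```python
import pandas as pd
from prophet import Prophet
from sklearn.preprocessing import StandardScaler

def fit_and_forecast(df: pd.DataFrame, regressors: list[str],
                     horizon_days: int = 7):
    df = df.sort_values("ds")          # Prophet expects 'ds' (time) and 'y'
    cut = len(df) - horizon_days * 24  # hourly data assumed; last 7 days held out
    train, test = df.iloc[:cut].copy(), df.iloc[cut:].copy()

    scaler = StandardScaler().fit(train[regressors])  # fit on train only: no leakage
    train[regressors] = scaler.transform(train[regressors])
    test[regressors] = scaler.transform(test[regressors])

    model = Prophet()
    for name in regressors:
        model.add_regressor(name)
    model.fit(train[["ds", "y", *regressors]])
    forecast = model.predict(test[["ds", *regressors]])
    return forecast["yhat"], test["y"].reset_index(drop=True)
```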
AI Summary
  • The study also explores the impact of different input features on the performance of the models and finds that using both air quality index and weather data improves the predictive power of the models. [3]
  • AQI: Air Quality Index; MAE: Mean Absolute Error. The study demonstrates the effectiveness of machine learning models in predicting AQIs and highlights the importance of using both air quality index and weather data for improved predictive power. [3]
  • The results of this study can be used to inform policy decisions related to air pollution control and mitigation strategies. [3]
  • The study only evaluates the performance of different models on a single dataset and does not explore the generalizability of the results to other locations or datasets. [3]
  • The authors do not provide any discussion on the limitations of the study, such as the potential impact of data quality issues or the lack of consideration for non-linear relationships between input features. [3]
  • The paper presents a comparative study of various machine learning models for predicting air quality indices (AQIs) in Beijing, China. [2]
  • The results show that the Prophet model outperforms other models in terms of accuracy, with a mean absolute error (MAE) of 4.35 μg/m³. [1]

We did not find much content matching your interests, so we have included some additional popular topics. Also be aware that if a topic is not present on arXiv, we will not be able to recommend papers for it.

AI Agents
  • Halmstad University (paper shown above)
  • Perplexity (paper shown above)
AI and Society
  • Northeastern University (paper shown above)
Research Automation with AI
  • Peking University (paper shown above)
  • German Cancer Research (paper shown above)
AGI: Artificial General Intelligence
  • Meta (paper shown above)
  • Native multi-agent intelligence for foundation models (paper shown above)
Deep Learning
  • Universidad de Guanajuato (paper shown above)
  • National University of (paper shown above)

Interests not found

We did not find any papers matching the interests below. Try other terms, and consider whether the content exists on arxiv.org.
  • Design Patterns
  • Object Oriented Programming
You can edit or add more interests any time.