🎯 Top Personalized Recommendations
National University of
AI Summary - NeurIDA outperforms all baselines across various tasks and datasets. [3]
- NeurIDA achieves state-of-the-art results in classification (AUC-ROC) and regression (MAE) metrics. [3]
- The effectiveness of NeurIDA is demonstrated through its ability to improve upon existing models, including those with automated learning approaches. [3]
- AUC-ROC: Area Under the Receiver Operating Characteristic Curve, a metric used for classification tasks to evaluate model performance. [3]
- MAE: Mean Absolute Error, a metric used for regression tasks to evaluate model performance. [3]
- Early Stopping: A technique used in training neural networks where the training process is stopped when the model's performance on the validation set starts to degrade. [3]
- NeurIDA demonstrates superior performance across various classification and regression tasks, outperforming existing models with or without automated learning approaches. [3]
- The effectiveness of NeurIDA can be attributed to its ability to incorporate relational structure into tuple representations, leading to improved model performance. [3]
- The paper does not provide an extensive analysis of the computational complexity of NeurIDA. [3]
- The evaluation metrics used are limited to AUC-ROC and MAE, which may not capture other aspects of model performance. [3]
Abstract
Relational Database Management Systems (RDBMS) manage complex, interrelated data and support a broad spectrum of analytical tasks. With the growing demand for predictive analytics, the deep integration of machine learning (ML) into RDBMS has become critical. However, a fundamental challenge hinders this evolution: conventional ML models are static and task-specific, whereas RDBMS environments are dynamic and must support diverse analytical queries. Each analytical task entails constructing a bespoke pipeline from scratch, which incurs significant development overhead and hence limits wide adoption of ML in analytics.
We present NeurIDA, an autonomous end-to-end system for in-database analytics that dynamically "tweaks" the best available base model to better serve a given analytical task. In particular, we propose a novel paradigm of dynamic in-database modeling to pre-train a composable base model architecture over the relational data. Upon receiving a task, NeurIDA formulates the task and data profile to dynamically select and configure relevant components from the pool of base models and shared model components for prediction. For friendly user experience, NeurIDA supports natural language queries; it interprets user intent to construct structured task profiles, and generates analytical reports with dedicated LLM agents. By design, NeurIDA enables ease-of-use and yet effective and efficient in-database AI analytics. Extensive experiment study shows that NeurIDA consistently delivers up to 12% improvement in AUC-ROC and 25% relative reduction in MAE across ten tasks on five real-world datasets. The source code is available at https://github.com/Zrealshadow/NeurIDA
Why we think this paper is great for you:
This paper directly addresses data warehousing and analytics, aligning with the user’s stated interest in database design and data warehousing. The focus on integrating machine learning within RDBMS is a key area of interest for the user.
Kiel University
AI Summary - The problem of determining whether a given word belongs to the language generated by a relational pattern is in NP. [2]
Abstract
Patterns are words with terminals and variables. The language of a pattern is the set of words obtained by uniformly substituting all variables with words that contain only terminals. In their original definition, patterns only allow for multiple distinct occurrences of some variables to be related by the equality relation, represented by using the same variable multiple times. In an extended notion, called relational patterns and relational pattern languages, variables may be related by arbitrary other relations. We extend the ongoing investigation of the main decision problems for patterns (namely, the membership problem, the inclusion problem, and the equivalence problem) to relational pattern languages under a wide range of individual relations. It is shown show that - even for many much simpler or less restrictive relations - the complexity and (un)decidability characteristics of these problems do not change compared to the classical case where variables are related only by equality.
Why we think this paper is great for you:
The paper’s exploration of pattern languages within relational databases directly relates to the user’s interest in SQL and relational database design. Understanding patterns is fundamental to effective database modeling.
Adobe Research
Abstract
Humans do not just see attribute similarity -- we also see relational similarity. An apple is like a peach because both are reddish fruit, but the Earth is also like a peach: its crust, mantle, and core correspond to the peach's skin, flesh, and pit. This ability to perceive and recognize relational similarity, is arguable by cognitive scientist to be what distinguishes humans from other species. Yet, all widely used visual similarity metrics today (e.g., LPIPS, CLIP, DINO) focus solely on perceptual attribute similarity and fail to capture the rich, often surprising relational similarities that humans perceive. How can we go beyond the visible content of an image to capture its relational properties? How can we bring images with the same relational logic closer together in representation space? To answer these questions, we first formulate relational image similarity as a measurable problem: two images are relationally similar when their internal relations or functions among visual elements correspond, even if their visual attributes differ. We then curate 114k image-caption dataset in which the captions are anonymized -- describing the underlying relational logic of the scene rather than its surface content. Using this dataset, we finetune a Vision-Language model to measure the relational similarity between images. This model serves as the first step toward connecting images by their underlying relational structure rather than their visible appearance. Our study shows that while relational similarity has a lot of real-world applications, existing image similarity models fail to capture it -- revealing a critical gap in visual computing.
Why we think this paper is great for you:
The concept of relational similarity is crucial for understanding how data is structured and related, aligning with the user’s interest in database design and the ability to recognize relationships within data.
German Cancer Research
Abstract
Developing generalizable AI for medical imaging requires both access to large, multi-center datasets and standardized, reproducible tooling within research environments. However, leveraging real-world imaging data in clinical research environments is still hampered by strict regulatory constraints, fragmented software infrastructure, and the challenges inherent in conducting large-cohort multicentre studies. This leads to projects that rely on ad-hoc toolchains that are hard to reproduce, difficult to scale beyond single institutions and poorly suited for collaboration between clinicians and data scientists. We present Kaapana, a comprehensive open-source platform for medical imaging research that is designed to bridge this gap. Rather than building single-use, site-specific tooling, Kaapana provides a modular, extensible framework that unifies data ingestion, cohort curation, processing workflows and result inspection under a common user interface. By bringing the algorithm to the data, it enables institutions to keep control over their sensitive data while still participating in distributed experimentation and model development. By integrating flexible workflow orchestration with user-facing applications for researchers, Kaapana reduces technical overhead, improves reproducibility and enables conducting large-scale, collaborative, multi-centre imaging studies. We describe the core concepts of the platform and illustrate how they can support diverse use cases, from local prototyping to nation-wide research networks. The open-source codebase is available at https://github.com/kaapana/kaapana
Why we think this paper is great for you:
This paper focuses on AI in medical imaging, a field that can leverage the user’s database knowledge for data analysis and research applications.
Halmstad University
AI Summary - Multi-agent systems exchange a single 'do-everything' agent for a team of specialised agents that co-operate (or compete) under explicit protocols. [3]
- Planning- and self-improvement agents: A class of AI systems that use search and optimization techniques to solve complex problems. [3]
- Embodied and web agents: AI systems that act in the world, either physically (embodied) or through interactions with untrusted websites and enterprise systems (web). [3]
- Planning- and self-improvement agents can be prone to state explosion, speculative arithmetic errors, and over-confident selection. [3]
- Planning- and self-improvement agents deliver substantial reliability dividends when their power is channelled through explicit controllers, trustworthy verifiers, and disciplined governance of cost and side-effects. [2]
Abstract
This chapter argues that the reliability of agentic and generative AI is chiefly an architectural property. We define agentic systems as goal-directed, tool-using decision makers operating in closed loops, and show how reliability emerges from principled componentisation (goal manager, planner, tool-router, executor, memory, verifiers, safety monitor, telemetry), disciplined interfaces (schema-constrained, validated, least-privilege tool calls), and explicit control and assurance loops. Building on classical foundations, we propose a practical taxonomy-tool-using agents, memory-augmented agents, planning and self-improvement agents, multi-agent systems, and embodied or web agents - and analyse how each pattern reshapes the reliability envelope and failure modes. We distil design guidance on typed schemas, idempotency, permissioning, transactional semantics, memory provenance and hygiene, runtime governance (budgets, termination conditions), and simulate-before-actuate safeguards.
Why we think this paper is great for you:
The exploration of agentic AI architectures aligns with the user's interest in intelligent systems and how they can be built using database-like principles.
Abstract
Foundation models (FMs) are increasingly assuming the role of the "brain" of AI agents. While recent efforts have begun to equip FMs with native single-agent abilities -- such as GUI interaction or integrated tool use -- we argue that the next frontier is endowing FMs with native multi-agent intelligence. We identify four core capabilities of FMs in multi-agent contexts: understanding, planning, efficient communication, and adaptation. Contrary to assumptions about the spontaneous emergence of such abilities, we provide extensive empirical evidence across 41 large language models showing that strong single-agent performance alone does not automatically yield robust multi-agent intelligence. To address this gap, we outline key research directions -- spanning dataset construction, evaluation, training paradigms, and safety considerations -- for building FMs with native multi-agent intelligence.
Why we think this paper is great for you:
This paper’s focus on multi-agent intelligence within foundation models is a forward-looking area that could be highly relevant to the user’s interest in building intelligent systems.
Peking University
AI Summary - Previous research has shown that human-AI collaboration can improve performance in various tasks, including theorem discovery and proof verification. [3]
- The collaboration between human experts and an LLM is organized into three stages, starting from an informal conjecture and ending with a precise theorem and proof. [2]
- Human-AI collaboration can significantly improve mathematical proof and theorem discovery. [1]
Abstract
We investigate how large language models can be used as research tools in scientific computing while preserving mathematical rigor. We propose a human-in-the-loop workflow for interactive theorem proving and discovery with LLMs. Human experts retain control over problem formulation and admissible assumptions, while the model searches for proofs or contradictions, proposes candidate properties and theorems, and helps construct structures and parameters that satisfy explicit constraints, supported by numerical experiments and simple verification checks. Experts treat these outputs as raw material, further refine them, and organize the results into precise statements and rigorous proofs. We instantiate this workflow in a case study on the connection between manifold optimization and Grover's quantum search algorithm, where the pipeline helps identify invariant subspaces, explore Grover-compatible retractions, and obtain convergence guarantees for the retraction-based gradient method. The framework provides a practical template for integrating large language models into frontier mathematical research, enabling faster exploration of proof space and algorithm design while maintaining transparent reasoning responsibilities. Although illustrated on manifold optimization problems in quantum computing, the principles extend to other core areas of scientific computing.
Why we think this paper is great for you:
The use of large language models for interactive theorem proving represents a sophisticated application of AI that can be combined with the user’s database knowledge for rigorous problem-solving.
AI Agents
Perplexity
Abstract
This paper presents the first large-scale field study of the adoption, usage intensity, and use cases of general-purpose AI agents operating in open-world web environments. Our analysis centers on Comet, an AI-powered browser developed by Perplexity, and its integrated agent, Comet Assistant. Drawing on hundreds of millions of anonymized user interactions, we address three fundamental questions: Who is using AI agents? How intensively are they using them? And what are they using them for? Our findings reveal substantial heterogeneity in adoption and usage across user segments. Earlier adopters, users in countries with higher GDP per capita and educational attainment, and individuals working in digital or knowledge-intensive sectors -- such as digital technology, academia, finance, marketing, and entrepreneurship -- are more likely to adopt or actively use the agent. To systematically characterize the substance of agent usage, we introduce a hierarchical agentic taxonomy that organizes use cases across three levels: topic, subtopic, and task. The two largest topics, Productivity & Workflow and Learning & Research, account for 57% of all agentic queries, while the two largest subtopics, Courses and Shopping for Goods, make up 22%. The top 10 out of 90 tasks represent 55% of queries. Personal use constitutes 55% of queries, while professional and educational contexts comprise 30% and 16%, respectively. In the short term, use cases exhibit strong stickiness, but over time users tend to shift toward more cognitively oriented topics. The diffusion of increasingly capable AI agents carries important implications for researchers, businesses, policymakers, and educators, inviting new lines of inquiry into this rapidly emerging class of AI capabilities.
AI Summary - The agent is used primarily for productivity-related tasks (36% of all queries), followed by learning, media, and shopping. [3]
- Research, document editing, and shopping-related tasks appear consistently across occupation clusters. [3]
- Knowledge-intensive sectors like digital technology, entrepreneurship, finance, and academia tend to use the agent for research and learning-related tasks. [3]
- Productivity and learning topics are the most sticky, while travel is the least sticky. [2]
- Users' first queries often fall into productivity, learning, or media topics, but over time, there's a shift towards more cognitively oriented use cases. [1]
AGI: Artificial General Intelligence
Meta
Abstract
Real-world AI software engineering demands coding agents that can reason over massive repositories, maintain durable memory across and within long sessions, and robustly coordinate complex toolchains at test time. Existing open-source coding agents provide transparency but frequently fall short when pushed to these industrial-scale workloads, while proprietary coding agents offer strong practical performance but limited extensibility, interpretability, and controllability. We present the Confucius Code Agent (CCA), an open-sourced AI software engineer that can operate at an industrial scale. CCA is built atop the Confucius SDK, an open-sourced agent development platform designed around three complementary perspectives: Agent Experience (AX), User Experience (UX), and Developer Experience (DX). The SDK introduces a unified orchestrator with hierarchical working memory for long-context reasoning, a persistent note-taking system for cross-session continual learning, and a modular extension module for robust tool use. Moreover, a meta-agent automates the synthesis, evaluation, and refinement of agent configurations through a build-test-improve loop, enabling rapid agent development on new tasks, environments, and tool stacks. Instantiated on Confucius SDK with these mechanisms, CCA delivers strong performance on real-world software engineering tasks. On SWE-Bench-Pro, CCA achieves a state-of-the-art Resolve@1 performance of 54.3%, substantially improving over prior coding agents. Together, the Confucius SDK and CCA provide a transparent, extensible, and reproducible foundation for AI agents, bridge gaps between research prototypes and production-grade systems, and support agent development and deployment at industrial scale.
AI Summary - SWE-Bench: a comprehensive benchmark to evaluate autonomous code-writing and code-fixing agents on realistic tasks. [3]
- The combination of monorepo development and LLM-based tools like ECO underscores a trend toward holistic scale: treating an entire organization’s code as a single evolvable system, with AI agents providing the intelligence to manage global changes, dependency analysis, and performance tuning in ways humans alone could not easily scale. [2]
- Large-scale software engineering has driven interest in AI assistance for code discovery, understanding, and consistent changes at scale. [1]
Deep Learning
Universidad de Guanajuato
Abstract
This document reports the sequence of practices and methodologies implemented during the Big Data course. It details the workflow beginning with the processing of the Epsilon dataset through group and individual strategies, followed by text analysis and classification with RestMex and movie feature analysis with IMDb. Finally, it describes the technical implementation of a distributed computing cluster with Apache Spark on Linux using Scala.
AI Summary - In the big data era, data completeness can be as important as algorithm sophistication. [3]
- Big Data Analytics Distributed Computing Scalability Algorithm Sophistication Data Completeness The chronological progression demonstrates that mastering big data requires a systematic approach. [3]
- The choice between local and distributed architectures is not merely about computational resources, but about the quality and completeness of the data available to the model. [2]
National University of
Abstract
Accurate forecasting of urban air pollution is essential for protecting public health and guiding mitigation policies. While Deep Learning (DL) and hybrid pipelines dominate recent research, their complexity and limited interpretability hinder operational use. This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter (PM$_{2.5}$, PM$_{10}$) in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Both models were trained with pollutant and precursor regressors, with NP additionally leveraging lagged dependencies. For context, two machine learning baselines (LSTM, LightGBM) and one traditional statistical model (SARIMAX) were also implemented. Performance was evaluated on a 7-day holdout using MAE, RMSE, and $R^2$. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants. These findings demonstrate that interpretable additive models remain competitive with both traditional and complex approaches, offering a practical balance of accuracy, transparency, and ease of deployment.
AI Summary - The study also explores the impact of different input features on the performance of the models and finds that using both air quality index and weather data improves the predictive power of the models. [3]
- AQI: Air Quality Index MAE: Mean Absolute Error The study demonstrates the effectiveness of machine learning models in predicting AQIs and highlights the importance of using both air quality index and weather data for improved predictive power. [3]
- The results of this study can be used to inform policy decisions related to air pollution control and mitigation strategies. [3]
- The study only evaluates the performance of different models on a single dataset and does not explore the generalizability of the results to other locations or datasets. [3]
- The authors do not provide any discussion on the limitations of the study, such as the potential impact of data quality issues or the lack of consideration for non-linear relationships between input features. [3]
- The paper presents a comparative study of various machine learning models for predicting air quality indices (AQIs) in Beijing, China. [2]
- The results show that the Prophet model outperforms other models in terms of accuracy, with a mean absolute error (MAE) of 4.35 μg/m³. [1]
We did not find tons of content matching your interests we've included some additional topics that are popular.
Also be aware that if the topics is not present in arxiv we wont be able to recommend it.
AI Agents
Halmstad University
Abstract
This chapter argues that the reliability of agentic and generative AI is chiefly an architectural property. We define agentic systems as goal-directed, tool-using decision makers operating in closed loops, and show how reliability emerges from principled componentisation (goal manager, planner, tool-router, executor, memory, verifiers, safety monitor, telemetry), disciplined interfaces (schema-constrained, validated, least-privilege tool calls), and explicit control and assurance loops. Building on classical foundations, we propose a practical taxonomy-tool-using agents, memory-augmented agents, planning and self-improvement agents, multi-agent systems, and embodied or web agents - and analyse how each pattern reshapes the reliability envelope and failure modes. We distil design guidance on typed schemas, idempotency, permissioning, transactional semantics, memory provenance and hygiene, runtime governance (budgets, termination conditions), and simulate-before-actuate safeguards.
AI Summary - Multi-agent systems exchange a single 'do-everything' agent for a team of specialised agents that co-operate (or compete) under explicit protocols. [3]
- Planning- and self-improvement agents: A class of AI systems that use search and optimization techniques to solve complex problems. [3]
- Embodied and web agents: AI systems that act in the world, either physically (embodied) or through interactions with untrusted websites and enterprise systems (web). [3]
- Planning- and self-improvement agents can be prone to state explosion, speculative arithmetic errors, and over-confident selection. [3]
- Planning- and self-improvement agents deliver substantial reliability dividends when their power is channelled through explicit controllers, trustworthy verifiers, and disciplined governance of cost and side-effects. [2]
Perplexity
Abstract
This paper presents the first large-scale field study of the adoption, usage intensity, and use cases of general-purpose AI agents operating in open-world web environments. Our analysis centers on Comet, an AI-powered browser developed by Perplexity, and its integrated agent, Comet Assistant. Drawing on hundreds of millions of anonymized user interactions, we address three fundamental questions: Who is using AI agents? How intensively are they using them? And what are they using them for? Our findings reveal substantial heterogeneity in adoption and usage across user segments. Earlier adopters, users in countries with higher GDP per capita and educational attainment, and individuals working in digital or knowledge-intensive sectors -- such as digital technology, academia, finance, marketing, and entrepreneurship -- are more likely to adopt or actively use the agent. To systematically characterize the substance of agent usage, we introduce a hierarchical agentic taxonomy that organizes use cases across three levels: topic, subtopic, and task. The two largest topics, Productivity & Workflow and Learning & Research, account for 57% of all agentic queries, while the two largest subtopics, Courses and Shopping for Goods, make up 22%. The top 10 out of 90 tasks represent 55% of queries. Personal use constitutes 55% of queries, while professional and educational contexts comprise 30% and 16%, respectively. In the short term, use cases exhibit strong stickiness, but over time users tend to shift toward more cognitively oriented topics. The diffusion of increasingly capable AI agents carries important implications for researchers, businesses, policymakers, and educators, inviting new lines of inquiry into this rapidly emerging class of AI capabilities.
AI Summary - The agent is used primarily for productivity-related tasks (36% of all queries), followed by learning, media, and shopping. [3]
- Research, document editing, and shopping-related tasks appear consistently across occupation clusters. [3]
- Knowledge-intensive sectors like digital technology, entrepreneurship, finance, and academia tend to use the agent for research and learning-related tasks. [3]
- Productivity and learning topics are the most sticky, while travel is the least sticky. [2]
- Users' first queries often fall into productivity, learning, or media topics, but over time, there's a shift towards more cognitively oriented use cases. [1]
Research Automation with AI
Peking University
Abstract
We investigate how large language models can be used as research tools in scientific computing while preserving mathematical rigor. We propose a human-in-the-loop workflow for interactive theorem proving and discovery with LLMs. Human experts retain control over problem formulation and admissible assumptions, while the model searches for proofs or contradictions, proposes candidate properties and theorems, and helps construct structures and parameters that satisfy explicit constraints, supported by numerical experiments and simple verification checks. Experts treat these outputs as raw material, further refine them, and organize the results into precise statements and rigorous proofs. We instantiate this workflow in a case study on the connection between manifold optimization and Grover's quantum search algorithm, where the pipeline helps identify invariant subspaces, explore Grover-compatible retractions, and obtain convergence guarantees for the retraction-based gradient method. The framework provides a practical template for integrating large language models into frontier mathematical research, enabling faster exploration of proof space and algorithm design while maintaining transparent reasoning responsibilities. Although illustrated on manifold optimization problems in quantum computing, the principles extend to other core areas of scientific computing.
AI Summary - Previous research has shown that human-AI collaboration can improve performance in various tasks, including theorem discovery and proof verification. [3]
- The collaboration between human experts and an LLM is organized into three stages, starting from an informal conjecture and ending with a precise theorem and proof. [2]
- Human-AI collaboration can significantly improve mathematical proof and theorem discovery. [1]
German Cancer Research
Abstract
Developing generalizable AI for medical imaging requires both access to large, multi-center datasets and standardized, reproducible tooling within research environments. However, leveraging real-world imaging data in clinical research environments is still hampered by strict regulatory constraints, fragmented software infrastructure, and the challenges inherent in conducting large-cohort multicentre studies. This leads to projects that rely on ad-hoc toolchains that are hard to reproduce, difficult to scale beyond single institutions and poorly suited for collaboration between clinicians and data scientists. We present Kaapana, a comprehensive open-source platform for medical imaging research that is designed to bridge this gap. Rather than building single-use, site-specific tooling, Kaapana provides a modular, extensible framework that unifies data ingestion, cohort curation, processing workflows and result inspection under a common user interface. By bringing the algorithm to the data, it enables institutions to keep control over their sensitive data while still participating in distributed experimentation and model development. By integrating flexible workflow orchestration with user-facing applications for researchers, Kaapana reduces technical overhead, improves reproducibility and enables conducting large-scale, collaborative, multi-centre imaging studies. We describe the core concepts of the platform and illustrate how they can support diverse use cases, from local prototyping to nation-wide research networks. The open-source codebase is available at https://github.com/kaapana/kaapana
AGI: Artificial General Intelligence
Meta
Abstract
Real-world AI software engineering demands coding agents that can reason over massive repositories, maintain durable memory across and within long sessions, and robustly coordinate complex toolchains at test time. Existing open-source coding agents provide transparency but frequently fall short when pushed to these industrial-scale workloads, while proprietary coding agents offer strong practical performance but limited extensibility, interpretability, and controllability. We present the Confucius Code Agent (CCA), an open-sourced AI software engineer that can operate at an industrial scale. CCA is built atop the Confucius SDK, an open-sourced agent development platform designed around three complementary perspectives: Agent Experience (AX), User Experience (UX), and Developer Experience (DX). The SDK introduces a unified orchestrator with hierarchical working memory for long-context reasoning, a persistent note-taking system for cross-session continual learning, and a modular extension module for robust tool use. Moreover, a meta-agent automates the synthesis, evaluation, and refinement of agent configurations through a build-test-improve loop, enabling rapid agent development on new tasks, environments, and tool stacks. Instantiated on Confucius SDK with these mechanisms, CCA delivers strong performance on real-world software engineering tasks. On SWE-Bench-Pro, CCA achieves a state-of-the-art Resolve@1 performance of 54.3%, substantially improving over prior coding agents. Together, the Confucius SDK and CCA provide a transparent, extensible, and reproducible foundation for AI agents, bridge gaps between research prototypes and production-grade systems, and support agent development and deployment at industrial scale.
AI Summary - SWE-Bench: a comprehensive benchmark to evaluate autonomous code-writing and code-fixing agents on realistic tasks. [3]
- The combination of monorepo development and LLM-based tools like ECO underscores a trend toward holistic scale: treating an entire organization’s code as a single evolvable system, with AI agents providing the intelligence to manage global changes, dependency analysis, and performance tuning in ways humans alone could not easily scale. [2]
- Large-scale software engineering has driven interest in AI assistance for code discovery, understanding, and consistent changes at scale. [1]
Abstract
Foundation models (FMs) are increasingly assuming the role of the "brain" of AI agents. While recent efforts have begun to equip FMs with native single-agent abilities -- such as GUI interaction or integrated tool use -- we argue that the next frontier is endowing FMs with native multi-agent intelligence. We identify four core capabilities of FMs in multi-agent contexts: understanding, planning, efficient communication, and adaptation. Contrary to assumptions about the spontaneous emergence of such abilities, we provide extensive empirical evidence across 41 large language models showing that strong single-agent performance alone does not automatically yield robust multi-agent intelligence. To address this gap, we outline key research directions -- spanning dataset construction, evaluation, training paradigms, and safety considerations -- for building FMs with native multi-agent intelligence.
Deep Learning
Universidad de Guanajuato
Abstract
This document reports the sequence of practices and methodologies implemented during the Big Data course. It details the workflow beginning with the processing of the Epsilon dataset through group and individual strategies, followed by text analysis and classification with RestMex and movie feature analysis with IMDb. Finally, it describes the technical implementation of a distributed computing cluster with Apache Spark on Linux using Scala.
AI Summary - In the big data era, data completeness can be as important as algorithm sophistication. [3]
- Big Data Analytics Distributed Computing Scalability Algorithm Sophistication Data Completeness The chronological progression demonstrates that mastering big data requires a systematic approach. [3]
- The choice between local and distributed architectures is not merely about computational resources, but about the quality and completeness of the data available to the model. [2]
National University of
Abstract
Accurate forecasting of urban air pollution is essential for protecting public health and guiding mitigation policies. While Deep Learning (DL) and hybrid pipelines dominate recent research, their complexity and limited interpretability hinder operational use. This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter (PM$_{2.5}$, PM$_{10}$) in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Both models were trained with pollutant and precursor regressors, with NP additionally leveraging lagged dependencies. For context, two machine learning baselines (LSTM, LightGBM) and one traditional statistical model (SARIMAX) were also implemented. Performance was evaluated on a 7-day holdout using MAE, RMSE, and $R^2$. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants. These findings demonstrate that interpretable additive models remain competitive with both traditional and complex approaches, offering a practical balance of accuracy, transparency, and ease of deployment.
AI Summary - The study also explores the impact of different input features on the performance of the models and finds that using both air quality index and weather data improves the predictive power of the models. [3]
- AQI: Air Quality Index MAE: Mean Absolute Error The study demonstrates the effectiveness of machine learning models in predicting AQIs and highlights the importance of using both air quality index and weather data for improved predictive power. [3]
- The results of this study can be used to inform policy decisions related to air pollution control and mitigation strategies. [3]
- The study only evaluates the performance of different models on a single dataset and does not explore the generalizability of the results to other locations or datasets. [3]
- The authors do not provide any discussion on the limitations of the study, such as the potential impact of data quality issues or the lack of consideration for non-linear relationships between input features. [3]
- The paper presents a comparative study of various machine learning models for predicting air quality indices (AQIs) in Beijing, China. [2]
- The results show that the Prophet model outperforms other models in terms of accuracy, with a mean absolute error (MAE) of 4.35 μg/m³. [1]