Hi!

Your personalized paper recommendations for 12 to 16 January 2026.
University of Alberta
AI Insights
  • The proposed controller prevents both under-utilization (too many idle GPUs) and overload (insufficient capacity). [3]
  • The proposed hierarchical control framework integrates workload utilization, cluster-level provisioning, MILP-based frequency tuning, computing power consumption, and cooling regulation. [2]
Abstract
AI datacenters are currently being deployed at large scale to support the training and deployment of power-intensive large language models (LLMs). The extensive computation and cooling required in datacenters raise concerns about their energy use and carbon emissions. Although state-of-the-art research has examined the energy efficiency of LLM inference, most prior work focused on optimizing compute-side scheduling without considering thermal objectives or constraints. Since GPU-intensive inference generates substantial heat that can degrade datacenter performance, ignoring thermal effects can increase total energy consumption and reduce the efficiency of LLM serving. To fill this gap, we profile the characteristics of GPU servers under varying cooling and AI jobs, and develop a joint cooling and computing modeling approach for AI datacenters. Built upon these workload and thermal dynamics models, a novel hierarchical control framework is proposed to co-optimize computing and thermal management by identifying the optimal GPU parallelism, frequency (DVFS), and cooling control knobs. Using real Azure inference traces and detailed GPU profiling, our model balances serving latency and thermal constraints in AI datacenters while significantly improving their energy efficiency.
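The DVFS frequency-tuning idea in this abstract can be illustrated with a minimal sketch. All numbers and the inverse latency-frequency model below are hypothetical stand-ins, not the paper's profiled models or its MILP formulation: the controller simply picks the lowest GPU frequency whose projected latency still meets the serving SLO, trading energy against latency.

```python
# Minimal sketch of latency-constrained DVFS selection (hypothetical model).
# Assumes latency scales inversely with frequency; a real controller would
# use profiled data and solve an MILP jointly with cooling setpoints.

def pick_frequency(freqs_ghz, base_latency_ms, slo_ms, base_freq=1.0):
    """Return the lowest frequency meeting the latency SLO, or None."""
    for f in sorted(freqs_ghz):
        latency = base_latency_ms * base_freq / f  # inverse-scaling assumption
        if latency <= slo_ms:
            return f
    return None

freqs = [0.8, 1.0, 1.2, 1.4]
print(pick_frequency(freqs, base_latency_ms=120.0, slo_ms=110.0))  # → 1.2
```

Lower frequencies cut dynamic power but stretch latency, so the lowest feasible point is the energy-optimal choice under this toy model.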
Why we recommend this paper
Due to your Interest in AI for Data Science Management

This paper directly addresses the management of AI infrastructure, a key concern given your interest in managing data science teams and AI systems. The focus on energy efficiency and datacenter operations aligns strongly with your broader interests in optimizing tech teams and their impact.
PingCAP
AI Insights
  • HDC generation involves extracting representative entities for each database to facilitate efficient data exploration across multiple databases. [3]
  • It also includes a self-refinement chain to correct errors in generated SQL statements. [3]
  • The system demonstrates its capabilities through two real-world scenarios: the Financial dataset and the Bird dataset, showcasing its ability to provide insights and facilitate user-system interaction. [3]
  • HDC: Hierarchical Data Context - a summary of the data that includes a description, keywords, table information, and more. [3]
  • TiChart: Chart Selection - a component that selects the most suitable chart type to present analysis results by visualization. [3]
  • Exploration Efficiency: The ability of the system to efficiently explore data across multiple databases. [3]
  • TiInsight is a SQL-based automated cross-domain exploratory data analysis system that utilizes large language models to facilitate user-system interaction and provide powerful hierarchical data context (HDC) generation, text-to-SQL (TiSQL), chart selection (TiChart), and exploration efficiency. [2]
  • TiSQL is a schema filtering framework based on the map-reduce paradigm that filters tables and columns using clarified questions and cosine similarity. [1]
Abstract
SQL-based exploratory data analysis has garnered significant attention within the data analysis community. The emergence of large language models (LLMs) has facilitated the paradigm shift from manual to automated data exploration. However, existing methods generally lack the ability to perform cross-domain analysis, and the exploration of LLMs' capabilities remains insufficient. This paper presents TiInsight, an SQL-based automated cross-domain exploratory data analysis system. First, TiInsight offers a user-friendly GUI enabling users to explore data using natural language queries. Second, TiInsight offers a robust cross-domain exploratory data analysis pipeline: hierarchical data context (i.e., HDC) generation, question clarification and decomposition, text-to-SQL (i.e., TiSQL), and data visualization (i.e., TiChart). Third, we have implemented and deployed TiInsight in the production environment of PingCAP and demonstrated its capabilities using representative datasets. The demo video is available at https://youtu.be/JzYFyYd-emI.
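The insights mention that TiSQL filters tables and columns using cosine similarity between the clarified question and the schema. A toy sketch of that filtering step, using a bag-of-words embedding as a stand-in for the system's actual embedding model (the table names and threshold are illustrative):

```python
# Illustrative cosine-similarity schema filtering, in the spirit of TiSQL's
# map-reduce filtering step. The embedding is a toy bag-of-words counter.
import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def filter_tables(question, table_descriptions, threshold=0.1):
    """Keep tables whose description is similar enough to the question."""
    q = embed(question)
    return [t for t, desc in table_descriptions.items()
            if cosine(q, embed(desc)) >= threshold]

tables = {
    "loans": "loan amount duration status account",
    "weather": "daily temperature and rainfall records",
}
print(filter_tables("what is the average loan amount", tables))  # → ['loans']
```

Pruning irrelevant tables before text-to-SQL generation keeps the prompt small, which matters when one system serves many databases.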
Why we recommend this paper
Due to your Interest in Data Science Management

This research explores the use of LLMs for automated data analysis, a technique relevant to streamlining data science workflows and improving team efficiency. The SQL-based approach is particularly valuable for managing and understanding complex datasets, aligning with your interest in data science engineering management.
Technical University of Munich
AI Insights
  • Limited generalizability due to the small sample size. [3]
  • Organizational readiness has become the binding constraint for enterprise XR adoption, surpassing technology barriers. [2]
Abstract
Extended Reality (XR) offers transformative potential for industrial support, training, and maintenance; yet, widespread adoption lags despite demonstrated occupational value and hardware maturity. Organizations successfully implement XR in isolated pilots, yet struggle to scale these into sustained operational deployment, a phenomenon we characterize as the ``Pilot Trap.'' This study examines this phenomenon through a qualitative ecosystem analysis of 17 expert interviews across technology providers, solution integrators, and industrial adopters. We identify a ``Great Inversion'' in adoption barriers: critical constraints have shifted from technological maturity to organizational readiness (e.g., change management, key performance indicator alignment, and political resistance). While hardware ergonomics and usability remain relevant, our findings indicate that systemic misalignments between stakeholder incentives are the primary cause of friction preventing enterprise integration. We conclude that successful industrial XR adoption requires a shift from technology-centric piloting to a problem-first, organizational transformation approach, necessitating explicit ecosystem-level coordination.
Why we recommend this paper
Due to your Interest in Managing tech teams

The paper’s focus on scaling XR adoption within industrial settings is pertinent to managing tech teams implementing innovative solutions. Understanding the organizational challenges in deploying XR aligns with your interest in managing tech teams and their projects.
Universitas Negeri Padang
AI Insights
  • The results showed that PBL with technology is still a trend in research. [3]
  • The analysis used bibliometric mapping and VOSviewer software to visualize the relationships between variables studied by researchers. [3]
  • The study suggests that PBL with technology is still a popular area of research, but there is room for improvement in determining student success and motivation. [3]
  • The study analyzed the development of project-based learning (PBL) research by integrating technology from 2015 to 2024. [2]
Abstract
This study aims to conduct a bibliometric analysis of scientific publications discussing technology integration in the Project-Based Learning (PjBL) model during the 2015-2024 period. With the evolving 21st-century educational paradigm, PjBL has emerged as one of the most promising pedagogical approaches. In this context, the integration of technology in PjBL becomes an important catalyst that expands the scope and depth of learning experiences while preparing students to face challenges in the digital era. This research applies a bibliometric analysis methodology and systematic literature review, utilizing the Publish or Perish (POP) application for data collection and VOSviewer for data analysis and visualization. The results reveal emerging research trends, collaboration patterns, and focus areas in the scientific literature related to technology integration in PjBL. The bibliometric analysis uncovers research dynamics and knowledge gaps that need to be addressed. These findings provide valuable insights for researchers, practitioners, and policymakers in optimizing the implementation of technology-based PjBL and formulating educational policies responsive to technological developments and 21st-century learning needs. The results of this study are expected to contribute significantly to the academic community and educational practitioners in developing more effective technology-based PjBL implementation strategies and fostering interdisciplinary collaboration to advance education in the digital age.
Why we recommend this paper
Due to your Interest in Managing tech teams

This paper investigates technology integration within a specific educational model, offering insights into evolving pedagogical approaches. Given your interest in managing data science teams, this could inform strategies for integrating new technologies into training and development programs.
Virginia Tech
AI Insights
  • The paper discusses the challenges of dataset licensing and attribution in AI research, highlighting the need for more transparent and equitable practices. [3]
  • Attribution: The act of acknowledging the source of a dataset or model used in AI research. [3]
  • The paper assumes that all datasets are available for use, which may not be the case in practice. [3]
  • The authors propose a framework for optimal data selection from multiple sources, which can improve performance scaling and reduce computational costs. [2]
Abstract
We argue that the machine learning value chain is structurally unsustainable due to an economic data processing inequality: each stage in the data cycle from inputs to model weights to synthetic outputs refines technical signal but strips economic equity from data generators. We show, by analyzing seventy-three public data deals, that the majority of value accrues to aggregators, with documented creator royalties rounding to zero and widespread opacity of deal terms. This is not just an economic welfare concern: as data and its derivatives become economic assets, the feedback loop that sustains current learning algorithms is at risk. We identify three structural faults - missing provenance, asymmetric bargaining power, and non-dynamic pricing - as the operational machinery of this inequality. In our analysis, we trace these problems along the machine learning value chain and propose an Equitable Data-Value Exchange (EDVEX) Framework to enable a minimal market that benefits all participants. Finally, we outline research directions where our community can make concrete contributions to data deals and contextualize our position with related and orthogonal viewpoints.
Why we recommend this paper
Due to your Interest in AI for Data Science Management

This paper addresses the critical issue of data equity within the AI ecosystem, a factor increasingly relevant to responsible data science management. Understanding the economic implications of data access aligns with your broader interest in AI for Data Science Management.
Ruhr-Universität Bochum
AI Insights
  • The RDMS is designed to be highly usable, with a focus on reducing the time and effort required for researchers to manage their data. [2]
  • Usability testing is conducted regularly through online meetings, with participants completing tasks using the RDMS platform and providing feedback. [1]
Abstract
The goal of the Collaborative Research Center 1625 is the establishment of a scientific basis for the atomic-scale understanding and design of multifunctional compositionally complex solid solution surfaces. Next to materials synthesis in form of thin-film materials libraries, various materials characterization and simulations techniques are used to explore the materials data space of the problem. Machine learning and artificial intelligence techniques guide its exploration and navigation. The effective use of the combined heterogeneous data requires more than just a simple research data management plan. Consequently, our research data management system maps different data modalities in different formats and resolutions from different labs to the correct spatial locations on physical samples. Besides a graphical user interface, the system can also be accessed through an application programming interface for reproducible data-driven workflows. It is implemented by a combination of a custom research data management system designed around a relational database, an ontology which builds upon materials science-specific ontologies, and the construction of a Knowledge Graph. Along with the technical solutions of research data management system and lessons learned, first use cases are shown which were not possible (or at least much harder to achieve) without it.
Why we recommend this paper
Due to your Interest in Managing teams of data scientists
Nanyang Technological University
AI Insights
  • Tool intelligence has the potential to empower agentic AI for next-generation communication networks. [3]
  • Agentic AI: Artificial intelligence that can make decisions and take actions based on its own goals and intentions. [3]
  • The transformative potential of tool intelligence in empowering agentic AI for next-generation communication networks was explored. [3]
  • A case study on tool-assisted Unmanned Aerial Vehicle (UAV) trajectory planning was conducted, optimizing the agent's ability to activate tools while adhering to strict energy constraints. [2]
  • The lifecycle of tool engineering includes creation and discovery, selection, learning, and benchmarking. [1]
Abstract
Nowadays, agentic AI is emerging as a transformative paradigm for next-generation communication networks, promising to evolve large language models (LLMs) from passive chatbots into autonomous operators. However, unleashing this potential requires bridging the critical gap between abstract reasoning and physical actuation, a capability we term tool intelligence. In this article, we explore the landscape of tool engineering to empower agentic AI in communications. We first analyze the functionalities of tool intelligence and its effects on communications. We then propose a systematic review for tool engineering, covering the entire lifecycle from tool creation and discovery to selection, learning, and benchmarking. Furthermore, we present a case study on tool-assisted uncrewed aerial vehicles (UAV) trajectory planning to demonstrate the realization of tool intelligence in communications. By introducing a teacher-guided reinforcement learning approach with a feasibility shield, we enable agents to intelligently operate tools. They utilize external tools to eliminate navigational uncertainty while mastering cost-aware scheduling under strict energy constraints. This article aims to provide a roadmap for building the tool-augmented intelligent agents of the 6G era.
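The "feasibility shield" the abstract describes can be sketched as action masking: before the agent's chosen action executes, infeasible actions are filtered out and the best feasible alternative is substituted. The action names, scores, costs, and fallback below are hypothetical, not from the paper:

```python
# Minimal sketch of a feasibility shield for an energy-constrained agent:
# actions whose cost exceeds the remaining energy budget are masked out,
# and the highest-scoring feasible action is executed instead.

def shield(action_scores, action_costs, energy_left):
    """Pick the highest-scoring action whose cost fits the energy budget."""
    feasible = [a for a, c in action_costs.items() if c <= energy_left]
    if not feasible:
        return "hover"  # assumed safe fallback action
    return max(feasible, key=lambda a: action_scores[a])

scores = {"fly_far": 0.9, "fly_near": 0.6, "use_tool": 0.4}
costs = {"fly_far": 50.0, "fly_near": 20.0, "use_tool": 5.0}
print(shield(scores, costs, energy_left=25.0))  # → fly_near (fly_far is masked)
```

Because the shield only intervenes on infeasible choices, the policy can still be trained normally while hard constraints are never violated at execution time.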
Why we recommend this paper
Due to your Interest in AI for Data Science Engineering
Carl von Ossietzky Universität Oldenburg
AI Insights
  • The evaluation was conducted using a questionnaire, which may not be representative of all users. [3]
  • The paper discusses the evaluation of a metadata schema for energy research software, with a focus on its usability and effectiveness in improving findability. [2]
  • The paper discusses several existing metadata schemes for research software, including CodeMeta, OntoSoft, and BiotoolsSchema. [1]
Abstract
Domain-specific metadata schemas are essential to improve the findability and reusability of research software and to follow the FAIR4RS principles. However, many domains, including energy research, lack established metadata schemas. To address this gap, we developed a metadata schema for energy research software based on a requirement analysis and evaluated it through user testing. Our results show that the schema balances the need for formalization and interoperability, while also meeting the specific needs of energy researchers. Meanwhile, the testing showed that a good presentation of the required information is key to enable researchers to create the required metadata. This paper provides insights into the challenges and opportunities of designing a metadata schema for energy research software.
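A domain-specific schema like the one described here typically adds required fields beyond generic software metadata. A toy sketch of validating a record against such a schema, loosely in the style of CodeMeta-like records (the field names, including the domain-specific one, are illustrative, not the schema the paper proposes):

```python
# Toy metadata validation: report which required fields a record is missing.
# Field names are hypothetical illustrations of a domain-specific schema.

REQUIRED = {"name", "version", "license", "energyDomain"}

def validate(record):
    """Return the set of required fields missing from a metadata record."""
    return REQUIRED - record.keys()

record = {
    "name": "grid-sim",
    "version": "1.2.0",
    "license": "MIT",
}
print(validate(record))  # → {'energyDomain'}
```

Machine-checkable requirements like this are what make domain metadata findable and interoperable rather than free-text documentation.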
Why we recommend this paper
Due to your Interest in Data Science Engineering

Interests not found

We did not find any papers matching the interests below. Try other terms, and consider whether the content exists on arxiv.org.
  • Data Science Engineering Management
  • Engineering Management
You can edit or add more interests any time.