Gran Sasso Science Institute
Abstract
LLMs are increasingly used to boost productivity and support software engineering tasks. However, when applied to socially sensitive decisions such as team composition and task allocation, they raise concerns of fairness. Prior studies have revealed that LLMs may reproduce stereotypes; however, these analyses remain exploratory and examine sensitive attributes in isolation. This study investigates whether LLMs exhibit bias in team composition and task assignment by analyzing the combined effects of candidates' country and pronouns. Using three LLMs and 3,000 simulated decisions, we find systematic disparities: demographic attributes significantly shaped both selection likelihood and task allocation, even when accounting for expertise-related factors. Task distributions further reflected stereotypes, with technical and leadership roles unevenly assigned across groups. Our findings indicate that LLMs exacerbate demographic inequities in software engineering contexts, underscoring the need for fairness-aware assessment.
Why we are recommending this paper?
Due to your Interest in Managing tech teams
This paper directly addresses the management of teams, particularly within the context of AI-driven systems, aligning with your interest in managing data science teams. Understanding potential biases in LLM-generated team structures is crucial for effective oversight and decision-making.
Honda Research Institute Europe GmbH
Abstract
Large language models (LLMs) are increasingly used as coding partners, yet their role in accelerating scientific discovery remains underexplored. This paper presents a case study of using ChatGPT for rapid prototyping in ESA's ELOPE (Event-based Lunar OPtical flow Egomotion estimation) competition. The competition required participants to process event camera data to estimate lunar lander trajectories. Despite joining late, we achieved second place with a score of 0.01282, highlighting the potential of human-AI collaboration in competitive scientific settings. ChatGPT contributed not only executable code but also algorithmic reasoning, data handling routines, and methodological suggestions, such as using fixed number of events instead of fixed time spans for windowing. At the same time, we observed limitations: the model often introduced unnecessary structural changes, gets confused by intermediate discussions about alternative ideas, occasionally produced critical errors and forgets important aspects in longer scientific discussions. By analyzing these strengths and shortcomings, we show how conversational AI can both accelerate development and support conceptual insight in scientific research. We argue that structured integration of LLMs into the scientific workflow can enhance rapid prototyping by proposing best practices for AI-assisted scientific work.
Why we are recommending this paper?
Due to your Interest in AI for Data Science Management
This case study explores the application of LLMs in accelerating scientific discovery, a key area of interest given your focus on AI for data science. The use of ChatGPT for prototyping aligns with the need to rapidly test and validate AI-powered solutions.
Prince Sultan University
AI Insights - The paper discusses the concept of agentic AI, which refers to AI systems that can make decisions and take actions on their own without human intervention. [3]
- Autonomous agent: An AI system that can operate independently and make decisions based on its internal state and external environment. [3]
- Agentic AI is a rapidly growing field with various applications, including scientific discovery, language translation, and task management. [2]
Abstract
The evolution of Large Language Models (LLMs) from passive text generators to autonomous, goal-driven systems represents a fundamental shift in artificial intelligence. This chapter examines the emergence of agentic AI systems that integrate planning, memory, tool use, and iterative reasoning to operate autonomously in complex environments. We trace the architectural progression from statistical models to transformer-based systems, identifying capabilities that enable agentic behavior: long-range reasoning, contextual awareness, and adaptive decision-making. The chapter provides three contributions: (1) a synthesis of how LLM capabilities extend toward agency through reasoning-action-reflection loops; (2) an integrative framework describing core components perception, memory, planning, and tool execution that bridge LLMs with autonomous behavior; (3) a critical assessment of applications and persistent challenges in safety, alignment, reliability, and sustainability. Unlike existing surveys, we focus on the architectural transition from language understanding to autonomous action, emphasizing the technical gaps that must be resolved before deployment. We identify critical research priorities, including verifiable planning, scalable multi-agent coordination, persistent memory architectures, and governance frameworks. Responsible advancement requires simultaneous progress in technical robustness, interpretability, and ethical safeguards to realize potential while mitigating risks of misalignment and unintended consequences.
Why we are recommending this paper?
Due to your Interest in AI for Data Science Engineering
This paper examines the evolution of autonomous AI systems, directly relating to the development of intelligent teams and the management of complex, goal-oriented AI systems. It’s a foundational exploration of the future of AI in team environments.
University of Wrzburg
Abstract
Large language models (LLMs) have achieved notable performance in code synthesis; however, data-aware augmentation remains a limiting factor, handled via heuristic design or brute-force approaches. We introduce a performance-aware, closed-loop solution in the NNGPT ecosystem of projects that enables LLMs to autonomously engineer optimal transformations by internalizing empirical performance cues. We fine-tune LLMs with Low-Rank Adaptation on a novel repository of more than 6,000 empirically evaluated PyTorch augmentation functions, each annotated solely by downstream model accuracy. Training uses pairwise performance ordering (better-worse transformations), enabling alignment through empirical feedback without reinforcement learning, reward models, or symbolic objectives. This reduces the need for exhaustive search, achieving up to 600x times fewer evaluated candidates than brute-force discovery while maintaining competitive peak accuracy and shifting generation from random synthesis to task-aligned design. Ablation studies show that structured Chain-of-Thought prompting introduces syntactic noise and degrades performance, whereas direct prompting ensures stable optimization in performance-critical code tasks. Qualitative and quantitative analyses demonstrate that the model internalizes semantic performance cues rather than memorizing syntax. These results show that LLMs can exhibit task-level reasoning through non-textual feedback loops, bypassing explicit symbolic rewards.
Why we are recommending this paper?
Due to your Interest in Data Science Engineering Management
This research focuses on leveraging LLMs for data transformation, a critical task within data science engineering management. The performance-guided approach suggests a structured method for optimizing data workflows, which is relevant to your interests.
Abstract
This work presents DCIM 3.0, a unified framework integrating semantic reasoning, predictive analytics, autonomous orchestration, and unified connectivity for next-generation AI data center management. The framework addresses critical challenges in infrastructure automation, sustainability, and digital-twin design through knowledge graph-based intelligence, thermal modeling, and the Unified Device Connectivity Protocol (UDCP).Keywords-Data Center Infrastructure Management, DCIM, AI Data Centers, Knowledge Graphs, Digital Twin, Thermal Management, Infrastructure Automation, Sustainability, GPU Computing, Data Center
Why we are recommending this paper?
Due to your Interest in AI for Data Science Management
This paper presents a framework for managing AI data centers, addressing the infrastructural challenges associated with scaling AI systems – a vital consideration for effective team management and engineering.
Michigan State University
Abstract
Designing scientific instrumentation often requires exploring large, highly constrained design spaces using computationally expensive physics simulations. These simulators pose substantial challenges for integrating evolutionary computation (EC) into scientific design workflows. Evolutionary computation typically requires numerous design evaluations, making the integration of slow, low-throughput simulators particularly challenging, as they are optimized for accuracy and ease of use rather than throughput. We present ECLIPSE, an evolutionary computation framework built to interface directly with complex, domain-specific simulation tools while supporting flexible geometric and parametric representations of scientific hardware. ECLIPSE provides a modular architecture consisting of (1) Individuals, which encode hardware designs using domain-aware, physically constrained representations; (2) Evaluators, which prepare simulation inputs, invoke external simulators, and translate the simulator's outputs into fitness measures; and (3) Evolvers, which implement EC algorithms suitable for high-cost, limited-throughput environments. We demonstrate the utility of ECLIPSE across several active space-science applications, including evolved 3D antennas and spacecraft geometries optimized for drag reduction in very low Earth orbit. We further discuss the practical challenges encountered when coupling EC with scientific simulation workflows, including interoperability constraints, parallelization limits, and extreme evaluation costs, and outline ongoing efforts to combat these challenges. ECLIPSE enables interdisciplinary teams of physicists, engineers, and EC researchers to collaboratively explore unconventional designs for scientific hardware while leveraging existing domain-specific simulation software.
Why we are recommending this paper?
Due to your Interest in Data Science Engineering