German Aerospace Center (DLR)
AI Insights
- Machine Learning: A type of artificial intelligence that enables systems to learn from data without being explicitly programmed. (ML: 0.98)
- The decision-making algorithm may not be robust enough to handle complex scenarios or unexpected events. (ML: 0.96)
- The method relies on accurate sensor data and environment models, which may not always be available or reliable. (ML: 0.95)
- Further research is needed to address the challenges of integrating machine learning with formal methods and ensuring the robustness of the decision-making algorithm. (ML: 0.93)
- The paper discusses the challenges of defining operational conditions for safety-critical AI-based systems from data. (ML: 0.93)
- Safety-Critical AI-Based Systems: Systems that require high levels of safety and reliability, such as autonomous vehicles. (ML: 0.92)
- The method uses a probabilistic model to represent the uncertainty in sensor data and environment, and then applies a decision-making algorithm to determine safe operating conditions. (ML: 0.91)
- A novel approach is proposed, which combines machine learning and formal methods to define operational design domains (ODDs) for autonomous vehicles. (ML: 0.89)
- Operational Design Domain (ODD): A description of the physical space where an automated vehicle is intended to operate. (ML: 0.86)
- The proposed method has the potential to improve the safety and efficiency of autonomous vehicles by providing a more accurate representation of operational design domains. (ML: 0.80)
Abstract
Artificial Intelligence (AI) has been on the rise in many domains, including numerous safety-critical applications. However, for complex systems found in the real world, or when data already exist, defining the underlying environmental conditions is extremely challenging. This often results in an incomplete description of the environment in which the AI-based system must operate. Nevertheless, this description, called the Operational Design Domain (ODD), is required in many domains for the certification of AI-based systems. Traditionally, the ODD is created in the early stages of the development process, drawing on sophisticated expert knowledge and related standards. This paper presents a novel Safety-by-Design method to a posteriori define the ODD from previously collected data using a multi-dimensional kernel-based representation. This approach is validated through both Monte Carlo methods and a real-world aviation use case for a future safety-critical collision-avoidance system. Moreover, by defining under what conditions two ODDs are equal, the paper shows that the data-driven ODD can equal the original, underlying hidden ODD of the data. Utilizing the novel Safety-by-Design kernel-based ODD enables future certification of data-driven, safety-critical AI-based systems.
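The paper's exact kernel construction is not reproduced in this digest, but the core idea, thresholding a multi-dimensional kernel density estimate fitted to collected operating data, can be sketched as follows. The function names, the bandwidth, and the density floor below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def kde_density(samples, points, bandwidth=0.5):
    """Multi-dimensional Gaussian kernel density estimate.

    samples: (n, d) array of observed operating conditions.
    points:  (m, d) array of query conditions.
    """
    n, d = samples.shape
    diffs = points[:, None, :] - samples[None, :, :]     # (m, n, d) pairwise offsets
    sq = np.sum(diffs ** 2, axis=-1) / bandwidth ** 2    # squared scaled distances
    norm = (2 * np.pi * bandwidth ** 2) ** (d / 2)       # Gaussian normalizer
    return np.exp(-0.5 * sq).sum(axis=1) / (n * norm)

def in_odd(samples, points, density_floor=1e-3, bandwidth=0.5):
    """Classify query points as inside/outside the data-driven ODD."""
    return kde_density(samples, points, bandwidth) >= density_floor

# Monte Carlo style check: sample a known box, then query one inside
# point and one far-away point.
rng = np.random.default_rng(0)
obs = rng.uniform(-1.0, 1.0, size=(2000, 2))   # collected data (hidden ODD = unit box)
queries = np.array([[0.0, 0.0], [5.0, 5.0]])
print(in_odd(obs, queries))   # centre of the data is in the ODD; the far point is not
```

Thresholding the density, rather than taking the convex hull of the data, lets the recovered ODD exclude empty regions between clusters of observed conditions.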
Why are we recommending this paper?
Due to your Interest in AI for Data Science Management
Given your interest in managing AI systems, this paper directly addresses the critical challenge of operationalizing AI, particularly in safety-critical contexts. Understanding the environmental conditions surrounding AI deployments is essential for effective team management and risk mitigation.
ACM (Association for Computing Machinery)
AI Insights
- The role of SE research is becoming more important due to the complexity, scale, and pervasiveness of modern software systems, including AI-intensive ones. (ML: 0.96)
- Glossary: SIGSOFT (Special Interest Group on Software Engineering); CARES (Community for Advancing Research in Software Engineering); AI/ML (Artificial Intelligence/Machine Learning); SE (Software Engineering); CAPS (Conference Attendance and Participation Support). (ML: 0.91)
- The SE conference ecosystem is evolving, with larger conferences expanding and smaller ones contracting. (ML: 0.91)
- SIGSOFT aims to increase its outreach (money permitting) to countries in the Global South, including keeping the SIGSOFT Africa initiative active and attempting to replicate similar programs in other parts of the world. (ML: 0.88)
- SIGSOFT is committed to supporting initiatives such as CARES and anti-harassment policies to ensure a welcoming environment at SE conferences. (ML: 0.86)
- Hybrid poster presentations are being explored as a way to improve remote participation in conferences. (ML: 0.85)
Abstract
As software engineering conferences grow in size, rising costs and outdated formats are creating barriers to participation for many researchers. These barriers threaten the inclusivity and global diversity that have contributed to the success of the SE community. Based on survey data, we identify concrete actions the ACM Special Interest Group on Software Engineering (SIGSOFT) can take to address these challenges, including improving transparency around conference funding, experimenting with hybrid poster presentations, and expanding outreach to underrepresented regions. By implementing these changes, SIGSOFT can help ensure the software engineering community remains accessible and welcoming.
Why are we recommending this paper?
Due to your Interest in Data Science Engineering Management
As a manager of data science teams, you will find this paper's focus on inclusivity and barriers to research within the SE community highly relevant to fostering a thriving engineering environment.
University of Colorado Colorado Springs (UCCS)
AI Insights
- CodeCarbon: A library used for estimating and tracking carbon emissions from machine learning computing. (ML: 0.93)
- The impacts of RAG pipelines varied across the studied LLMs: CodeLlama achieved 25% faster inference times and substantial quality improvements, while smaller models like GPT-2 showed mixed efficiency results despite modest energy savings. (ML: 0.92)
- Prompt Engineering Techniques (PETs): The process of designing and optimizing prompts to improve the performance and energy efficiency of LLMs. (ML: 0.90)
- Large Language Models (LLMs): Deep learning models that can understand, generate, and translate human language. (ML: 0.90)
- The use of Retrieval-Augmented Generation (RAG) pipelines can reduce energy consumption in LLM-based code generation, but the impact varies across different LLM architectures. (ML: 0.88)
- The study highlights the importance of well-designed prompts in reducing LLMs' energy consumption, confirming Rubei et al.'s finding that optimal prompt configurations can reduce energy usage by up to 99%. (ML: 0.87)
- RAG can help smaller, more efficient models achieve competitive code generation quality, as demonstrated by GPT-2 on the Kaggle dataset matching DeepSeek Coder's performance while using approximately 3.5x less energy. (ML: 0.86)
- Retrieval-Augmented Generation (RAG): A pipeline that combines retrieval and generation mechanisms to enhance the quality and efficiency of LLM-based code generation. (ML: 0.85)
- There is no clear relationship between model size and RAG-based energy efficiency benefits: only GPT-2 (the smallest) and CodeLlama showed energy reduction with RAG. (ML: 0.82)
Abstract
The discussion around AI-Engineering, that is, Software Engineering (SE) for AI-enabled Systems, cannot ignore a crucial class of software systems that are increasingly becoming AI-enhanced: Those used to enable or support the SE process, such as Computer-Aided SE (CASE) tools and Integrated Development Environments (IDEs). In this paper, we study the energy efficiency of these systems. As AI becomes seamlessly available in these tools and, in many cases, is active by default, we are entering a new era with significant implications for energy consumption patterns throughout the Software Development Lifecycle (SDLC). We focus on advanced Machine Learning (ML) capabilities provided by Large Language Models (LLMs). Our proposed approach combines Retrieval-Augmented Generation (RAG) with Prompt Engineering Techniques (PETs) to enhance both the quality and energy efficiency of LLM-based code generation. We present a comprehensive framework that measures real-time energy consumption and inference time across diverse model architectures ranging from 125M to 7B parameters, including GPT-2, CodeLlama, Qwen 2.5, and DeepSeek Coder. These LLMs, chosen for practical reasons, are sufficient to validate the core ideas and provide a proof of concept for more in-depth future analysis.
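The paper's measurement framework is not shown here. As a toy illustration of the idea of pairing inference timing with an energy estimate, the sketch below multiplies wall-clock time by an assumed average power draw; real tools such as CodeCarbon instead sample hardware counters. The stub generate functions, the 250 W figure, and the retrieval heuristic are all invented assumptions.

```python
import time

AVG_GPU_POWER_W = 250.0  # assumed average board power; real tools sample hardware counters

def measure(fn, *args):
    """Return (result, seconds, estimated joules) for one call."""
    t0 = time.perf_counter()
    out = fn(*args)
    dt = time.perf_counter() - t0
    return out, dt, dt * AVG_GPU_POWER_W

def generate_plain(prompt):
    time.sleep(0.02)             # stand-in for LLM decoding latency
    return f"code for: {prompt}"

def generate_with_rag(prompt, corpus):
    # Crude retrieval: pick the corpus document with the largest word overlap.
    context = max(corpus, key=lambda doc: len(set(prompt.split()) & set(doc.split())))
    time.sleep(0.01)             # assumption: grounding context shortens decoding
    return f"code for: {prompt} // using: {context}"

corpus = ["binary search in python", "quick sort in c"]
out1, t1, e1 = measure(generate_plain, "binary search")
out2, t2, e2 = measure(generate_with_rag, "binary search", corpus)
print(f"plain: {e1:.2f} J, rag: {e2:.2f} J")
```

The point of the sketch is only the accounting pattern: any comparison of prompting strategies needs the same (time, energy) pair captured per call before RAG's costs and savings can be weighed per architecture.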
Why are we recommending this paper?
Due to your Interest in Data Science Engineering Management
This paper explores the intersection of AI and software engineering tools, aligning with your interest in managing teams utilizing AI-enhanced systems and optimizing development processes.
Dalian University of Technology
Abstract
Generative artificial intelligence (GAI) plays a fundamental role in high-impact AI-based systems such as SORA and AlphaFold. Currently, GAI shows limited capability in specialized domains due to data scarcity. In this paper, we develop a continuum mechanics-based theoretical framework that generalizes optimal transport theory from pure mathematics, which can be used to describe the dynamics of data and realize generative tasks with a small amount of data. The developed theory is used to solve three typical problems involved in many mechanical designs and engineering applications: at the material level, how to generate the stress-strain response outside the range of experimental conditions based on experimentally measured stress-strain data; at the structure level, how to generate temperature-dependent stress fields under thermal loading; and at the system level, how to generate plastic strain fields under transient dynamic loading. Our results show the proposed theory can complete these generation tasks successfully, demonstrating its potential to solve many difficult problems in engineering applications, not limited to mechanics, such as image generation. The present work shows that mechanics can provide new tools for computer science. The limitation of the proposed theory is also discussed.
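The continuum-mechanics generalization in the paper goes well beyond classical optimal transport, but the baseline it builds on is easy to sketch: in one dimension with quadratic cost, the optimal map between two equal-size empirical samples is simply the rank pairing. This is a textbook illustration, not the authors' method.

```python
import numpy as np

def monge_map_1d(source, target):
    """Empirical 1D optimal transport: pair samples by rank.

    For quadratic cost in one dimension, sending the k-th smallest
    source sample to the k-th smallest target sample is optimal.
    """
    src_order = np.argsort(source)
    mapped = np.empty(len(source), dtype=float)
    mapped[src_order] = np.sort(target)   # rank k of source -> rank k of target
    return mapped

rng = np.random.default_rng(1)
src = rng.normal(0.0, 1.0, 5)     # "source" data distribution
tgt = rng.normal(10.0, 1.0, 5)    # "target" data distribution
out = monge_map_1d(src, tgt)
# The transported samples are exactly the target samples, reordered so
# that source ordering is preserved (the map is monotone).
print(out)
```

Generative use of this idea treats the map (or the flow interpolating it) as a way to push a data-rich distribution toward a data-scarce one.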
Why are we recommending this paper?
Due to your Interest in AI for Data Science Management
The development of theoretical frameworks for GAI, particularly concerning data dynamics, could provide valuable insights into managing and understanding the capabilities of AI-driven data science teams.
ETH Zürich
AI Insights
- Technical complexity and integration into existing systems may pose challenges to the adoption of INC. (ML: 0.95)
- The development of INC standards will be crucial for widespread adoption and interoperability among different products and approaches. (ML: 0.91)
- The development of INC standards must be lean and straightforward to justify the necessary investments. (ML: 0.90)
- The INC working group is developing standards that will facilitate widespread adoption and interoperability among different products and approaches. (ML: 0.90)
- In-Network Collective Operations (INC): A technology that enables efficient and scalable collective operations for distributed AI workloads by leveraging the network's processing capabilities. (ML: 0.88)
- INC technologies have the potential to significantly benefit the adoption of distributed AI workloads by addressing communication challenges faced by local networks. (ML: 0.88)
- Related reading: "Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis"; "Scalable Hierarchical Aggregation Protocol (SHArP): A hardware architecture for efficient data reduction". (ML: 0.86)
- Ultra Ethernet: An initiative led by a consortium working group to advance Ethernet for AI & HPC, with a focus on developing standards for INC. (ML: 0.68)
Abstract
This paper summarizes the opportunities of in-network collective operations (INC) for accelerating collective operations in AI workloads. We provide sufficient detail to make this important field accessible to non-experts in AI or networking, fostering a connection between these communities. We consider two types of INC: Edge-INC, where the system is implemented at the node level, and Core-INC, where the system is embedded within network switches. We outline the potential performance benefits as well as six key obstacles, in the context of both Edge-INC and Core-INC, that may hinder their adoption. Finally, we present a set of predictions for the future development and application of INC.
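As a toy contrast to host-based allreduce, the sketch below simulates the Core-INC idea: a switch reduces the nodes' contributions once and returns the same sum to every node, so each node exchanges a single vector instead of one per peer. This illustrates the collective's semantics only, not any design from the paper.

```python
def switch_allreduce(node_vectors):
    """Core-INC style allreduce: the switch sums the contributions once
    and broadcasts the result, so each node sends and receives exactly
    one vector instead of n-1."""
    length = len(node_vectors[0])
    reduced = [sum(v[i] for v in node_vectors) for i in range(length)]
    # Every node receives an identical copy of the reduced vector.
    return [list(reduced) for _ in node_vectors]

# Three nodes contribute gradient shards; all end up with the global sum.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(switch_allreduce(grads))   # each of the 3 nodes holds [9.0, 12.0]
```

In real deployments the win is bandwidth: per-node traffic stays constant as the job scales, instead of growing with the number of peers.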
Why are we recommending this paper?
Due to your Interest in Managing teams of data scientists
Exploring collective operations within AI workloads aligns with your interest in optimizing team collaboration and leveraging distributed AI resources for enhanced productivity.
University of Washington
AI Insights
- Transfer learning: A technique where a pre-trained model is fine-tuned on a new task or dataset, rather than trained from scratch. (ML: 0.97)
- Model collaboration: The process of combining multiple language models to improve their performance and efficiency. (ML: 0.97)
- The use of large-scale datasets and benchmarks will remain crucial for evaluating and comparing the performance of different models. (ML: 0.96)
- Multi-task learning: A technique where a single model is trained on multiple tasks simultaneously, allowing it to learn shared representations and improve performance on each task. (ML: 0.96)
- Researchers are exploring various approaches to improve the performance and efficiency of language models, including ensemble methods, transfer learning, and multi-task learning. (ML: 0.94)
- Ensemble methods: Techniques used to combine the predictions or outputs of multiple models to produce a more accurate result. (ML: 0.94)
- The field of model collaboration is evolving rapidly, with significant recent progress in new techniques and tools. (ML: 0.91)
Abstract
Advancing beyond single monolithic language models (LMs), recent research increasingly recognizes the importance of model collaboration, where multiple LMs collaborate, compose, and complement each other. Existing research on this topic has mostly been disparate and disconnected, from different research communities, and lacks rigorous comparison. To consolidate existing research and establish model collaboration as a school of thought, we present MoCo: a one-stop Python library for executing, benchmarking, and comparing model collaboration algorithms at scale. MoCo features 26 model collaboration methods, spanning diverse levels of cross-model information exchange such as routing, text, logit, and model parameters. MoCo integrates 25 evaluation datasets spanning reasoning, QA, code, safety, and more, and users can flexibly bring their own data. Extensive experiments with MoCo demonstrate that most collaboration strategies outperform models without collaboration in 61.0% of (model, data) settings on average, with the most effective methods outperforming by up to 25.8%. We further analyze the scaling of model collaboration strategies and the training/inference efficiency of diverse methods, highlight that the collaborative system solves problems where single LMs struggle, and discuss future work in model collaboration, all made possible by MoCo. We envision MoCo as a valuable toolkit to facilitate and turbocharge the quest for an open, modular, decentralized, and collaborative AI future.
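MoCo's actual API is not reproduced in this digest. As a minimal sketch of one of the exchange levels the abstract names, the code below implements logit-level collaboration: averaging two models' per-token logits before picking the next token. The toy vocabularies and weights are assumptions for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def logit_ensemble(model_logits, weights=None):
    """Logit-level collaboration: form a weighted average of each model's
    logits for the same vocabulary, then decode greedily from the mix."""
    n_models = len(model_logits)
    weights = weights or [1.0 / n_models] * n_models
    vocab = len(model_logits[0])
    combined = [sum(w * m[i] for w, m in zip(weights, model_logits))
                for i in range(vocab)]
    probs = softmax(combined)
    return max(range(vocab), key=lambda i: probs[i])

# Two toy models over a 3-token vocabulary disagree; the average favours token 2.
a = [2.0, 0.5, 1.9]
b = [0.1, 0.4, 2.1]
print(logit_ensemble([a, b]))   # → 2
```

Logit-level exchange requires a shared vocabulary across the collaborating models; routing- or text-level exchange relaxes that constraint at the cost of coarser information.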
Why are we recommending this paper?
Due to your Interest in Managing teams of data scientists
University of Notre Dame
AI Insights
- Participants may have had prior experience with computer science concepts, which could influence their performance in the study. (ML: 0.99)
- The study suggests that providing multiple representations can support the learning of data structures by BVI individuals, but it is essential to consider individual differences in learning styles and preferences. (ML: 0.98)
- The study found that BVI individuals use multiple representations to understand and reason about data structures in a way that is consistent with sighted individuals, but they may require more time and practice to develop their skills. (ML: 0.98)
- The study aimed to investigate how blind or visually impaired (BVI) individuals use multiple representations to understand and reason about data structures, specifically arrays and binary trees. (ML: 0.96)
- The participants used the tabular representation most frequently for arrays, while the navigable representation was preferred for binary trees. (ML: 0.95)
- Data Structure: A way of organizing and storing data in a computer so that it can be efficiently accessed and modified. (ML: 0.94)
- Limited sample size of 8 participants. (ML: 0.93)
- Binary Tree: A hierarchical structure where each node has at most two children (left child and right child). (ML: 0.85)
- Array: A linear collection of elements, each identified by an index. (ML: 0.82)
- Blind or Visually Impaired (BVI): Individuals who are unable to see or have limited vision. (ML: 0.77)
Abstract
Blind and visually impaired (BVI) computer science students face systematic barriers when learning data structures: current accessibility approaches typically translate diagrams into alternative text, focusing on visual appearance rather than preserving the underlying structure essential for conceptual understanding. More accessible alternatives often do not scale in complexity, cost to produce, or both. Motivated by a recent shift to tools for creating visual diagrams from code, we propose a solution that automatically creates accessible representations from structural information about diagrams. Based on a Wizard-of-Oz study, we derive design requirements for an automated system, Arboretum, that compiles text-based diagram specifications into three synchronized nonvisual formats: tabular, navigable, and tactile. Our evaluation with BVI users highlights the strength of tactile graphics for complex tasks such as binary search; the benefits of offering multiple, complementary nonvisual representations; and limitations of existing digital navigation patterns for structural reasoning. This work reframes access to data structures by preserving their structural properties. The solution is a practical system to advance accessible CS education.
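Arboretum's specification language is not described in enough detail here to reproduce; as a hypothetical sketch of the compile-to-tabular step, the code below turns a simple parent-to-children mapping into rows a screen reader can step through one at a time. The spec format and all names are invented for illustration.

```python
def tree_to_table(spec):
    """Compile a {node: (left, right)} diagram spec into rows of
    (node, left child, right child) - a tabular nonvisual format
    that preserves the tree's structure rather than its appearance."""
    rows = [("node", "left", "right")]
    for node in sorted(spec):                 # stable, predictable row order
        left, right = spec[node]
        rows.append((node, left or "-", right or "-"))
    return rows

# A small binary search tree: 8 at the root, 3 and 10 as children, etc.
spec = {"8": ("3", "10"), "3": ("1", "6"), "10": (None, "14")}
for row in tree_to_table(spec):
    print("{:>4} {:>5} {:>5}".format(*row))
```

A navigable format would instead expose cursor moves (parent, left child, right child) over the same spec, which is why synchronizing the representations from one source is attractive.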
Why are we recommending this paper?
Due to your Interest in Data Science Engineering