Papers from 22 to 26 September 2025

Here are your personalized paper recommendations, sorted by relevance.
Data Science Engineering
Sage Bionetworks, OregonHe
Abstract
Continuous and reliable access to curated biological data repositories is indispensable for accelerating rigorous scientific inquiry and fostering reproducible research. Centralized repositories, though widely used, are vulnerable to single points of failure arising from cyberattacks, technical faults, natural disasters, or funding and political uncertainties. This can lead to widespread data unavailability, data loss, integrity compromises, and substantial delays in critical research, ultimately impeding scientific progress. Centralizing essential scientific resources in a single geopolitical or institutional hub is inherently dangerous, as any disruption can paralyze diverse ongoing research. The rapid acceleration of data generation, combined with an increasingly volatile global landscape, necessitates a critical re-evaluation of the sustainability of centralized models. Implementing federated and decentralized architectures presents a compelling and future-oriented pathway to substantially strengthen the resilience of scientific data infrastructures, thereby mitigating vulnerabilities and ensuring the long-term integrity of data. Here, we examine the structural limitations of centralized repositories, evaluate federated and decentralized models, and propose a hybrid framework for resilient, FAIR, and sustainable scientific data stewardship. Such an approach offers a significant reduction in exposure to governance instability, infrastructural fragility, and funding volatility, and also fosters fairness and global accessibility. The future of open science depends on integrating these complementary approaches to establish a globally distributed, economically sustainable, and institutionally robust infrastructure that safeguards scientific data as a public good, further ensuring continued accessibility, interoperability, and preservation for generations to come.
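To make the resilience argument concrete, here is a minimal sketch of federated retrieval with integrity checking: a client walks a list of mirrors and accepts the first copy that matches a published checksum, so no single node's outage blocks access. The mirror URLs and checksum are hypothetical placeholders, not infrastructure described in the paper.

```python
import hashlib
import urllib.request

# Hypothetical mirrors: one curated dataset replicated across
# independently operated federated nodes (URLs invented here).
MIRRORS = [
    "https://node-eu.example.org/datasets/genomes-v3.tar.gz",
    "https://node-us.example.org/datasets/genomes-v3.tar.gz",
    "https://node-ap.example.org/datasets/genomes-v3.tar.gz",
]
# Published content hash (placeholder value), so any copy can be
# verified no matter which node serves it.
EXPECTED_SHA256 = "..."  # placeholder

def fetch_verified(mirrors, expected_sha256):
    """Return dataset bytes from the first mirror whose content
    matches the published checksum, tolerating individual outages."""
    for url in mirrors:
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                data = resp.read()
        except OSError:
            continue  # node unreachable: try the next one
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    raise RuntimeError("no mirror served a verifiable copy")
```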
AI Insights
  • EOSC’s federated nodes already host 1 million genomes, a living model of distributed stewardship.
  • ELIXIR’s COVID‑19 response proved community pipelines can scale to pandemic‑grade data volumes.
  • The Global Biodata Coalition’s roadmap envisions a cross‑border mesh that outpaces single‑point failure risks.
  • DeSci employs blockchain provenance to give researchers immutable audit trails for every dataset (a minimal sketch follows this list).
  • NIH’s Final Data Policy now mandates FAIR compliance, nudging institutions toward hybrid decentralized architectures.
  • DeSci still struggles with interoperability, as heterogeneous metadata schemas block seamless cross‑platform queries.
  • Privacy‑by‑design in distributed repositories remains a top research gap, inviting novel cryptographic solutions.
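Following up on the DeSci provenance point above, here is a minimal, hypothetical sketch of a hash-chained audit log, where each entry commits to the previous entry's digest. This illustrates the generic mechanism behind immutable provenance trails, not code from any system the paper names.

```python
import hashlib
import json
import time

def _digest(record: dict) -> str:
    # Canonical JSON so the hash is stable across runs.
    return hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()

class ProvenanceLog:
    """Append-only log where each entry commits to its predecessor's
    digest, so silently rewriting history breaks every later hash."""

    def __init__(self):
        self.entries = []

    def append(self, actor: str, action: str, dataset_id: str):
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        record = {"actor": actor, "action": action,
                  "dataset": dataset_id, "ts": time.time(), "prev": prev}
        record["hash"] = _digest(record)
        self.entries.append(record)

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            if e["prev"] != prev or _digest(body) != e["hash"]:
                return False
            prev = e["hash"]
        return True

log = ProvenanceLog()
log.append("alice", "upload", "genomes-v3")
log.append("bob", "reprocess", "genomes-v3")
assert log.verify()  # tamper with log.entries[0] and this fails
```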
Abstract
LLMs promise to democratize technical work in complex domains like programmatic data analysis, but not everyone benefits equally. We study how students with varied expertise use LLMs to complete Python-based data analysis in computational notebooks in a non-major course. Drawing on homework logs, recordings, and surveys from 36 students, we ask: Which expertise matters most, and how does it shape AI use? Our mixed-methods analysis shows that technical expertise -- not AI familiarity or communication skills -- remains a significant predictor of success. Students also vary widely in how they leverage LLMs, struggling at stages of forming intent, expressing inputs, interpreting outputs, and assessing results. We identify success and failure behaviors, such as providing context or decomposing prompts, that distinguish effective use. These findings inform AI literacy interventions, highlighting that lightweight demonstrations improve surface fluency but are insufficient; deeper training and scaffolds are needed to cultivate resilient AI use skills.
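The success behaviors the authors single out, providing context and decomposing prompts, can be illustrated with a small hypothetical sketch: one staged sub-prompt per analysis step, each restating the schema the model needs. `ask_llm` is a stand-in for whatever chat client a student would use; nothing here is from the paper's materials.

```python
# Hypothetical stand-in for any chat-completion client a student
# might use; replace with a real API call in practice.
def ask_llm(prompt: str) -> str:
    return f"<model reply to: {prompt[:48]}...>"

# Context the model needs, restated in every sub-prompt.
SCHEMA = "columns: country (str), year (int), co2_mt (float)"

# One staged sub-prompt per analysis step instead of one big ask.
steps = [
    f"Given a pandas DataFrame df with {SCHEMA}, write code that "
    "drops rows where co2_mt is missing.",
    f"Given the cleaned df ({SCHEMA}), write code that computes "
    "mean co2_mt per country for years after 2000.",
    "Given that per-country summary, write matplotlib code for a "
    "bar chart of the ten highest-emitting countries.",
]

for i, prompt in enumerate(steps, 1):
    print(f"--- step {i} ---")
    print(ask_llm(prompt))  # interpret and verify before moving on
```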
Engineering Management
Abstract
Practical lab education in computer science often faces challenges such as plagiarism, lack of proper lab records, unstructured lab conduction, inadequate execution and assessment, limited practical learning, low student engagement, and absence of progress tracking for both students and faculties, resulting in graduates with insufficient hands-on skills. In this paper, we introduce AsseslyAI, which addresses these challenges through online lab allocation, a unique lab problem for each student, AI-proctored viva evaluations, and gamified simulators to enhance engagement and conceptual mastery. While existing platforms generate questions based on topics, our framework fine-tunes on a 10k+ question-answer dataset built from AI/ML lab questions to dynamically generate diverse, code-rich assessments. Validation metrics show high question-answer similarity, ensuring accurate answers and non-repetitive questions. By unifying dataset-driven question generation, adaptive difficulty, plagiarism resistance, and evaluation in a single pipeline, our framework advances beyond traditional automated grading tools and offers a scalable path to produce genuinely skilled graduates.
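As a rough illustration of the validation idea, here is a hedged sketch of screening generated questions for near-duplicates with cosine similarity. TF-IDF vectors and the 0.8 threshold are illustrative stand-ins, since the abstract does not specify the similarity model used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy generated lab questions; a real pipeline would likely use
# learned embeddings rather than TF-IDF.
questions = [
    "Implement k-means clustering on the iris dataset.",
    "Write k-means from scratch and run it on iris.",
    "Train a decision tree on the titanic dataset.",
]

sims = cosine_similarity(TfidfVectorizer().fit_transform(questions))

THRESHOLD = 0.8  # illustrative cut-off, not a value from the paper
for i in range(len(questions)):
    for j in range(i + 1, len(questions)):
        if sims[i, j] >= THRESHOLD:
            print(f"possible repeat: Q{i} ~ Q{j} (sim={sims[i, j]:.2f})")
```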
Nanyang Technological University
Abstract
Large language models (LLMs) have shown strong performance on mathematical reasoning under well-posed conditions. However, real-world engineering problems require more than mathematical symbolic computation -- they need to deal with uncertainty, context, and open-ended scenarios. Existing benchmarks fail to capture these complexities. We introduce EngiBench, a hierarchical benchmark designed to evaluate LLMs on solving engineering problems. It spans three levels of increasing difficulty (foundational knowledge retrieval, multi-step contextual reasoning, and open-ended modeling) and covers diverse engineering subfields. To facilitate a deeper understanding of model performance, we systematically rewrite each problem into three controlled variants (perturbed, knowledge-enhanced, and math abstraction), enabling us to separately evaluate the model's robustness, domain-specific knowledge, and mathematical reasoning abilities. Experiment results reveal a clear performance gap across levels: models struggle more as tasks get harder, perform worse when problems are slightly changed, and fall far behind human experts on the high-level engineering tasks. These findings reveal that current LLMs still lack the high-level reasoning needed for real-world engineering, highlighting the need for future models with deeper and more reliable problem-solving capabilities. Our source code and data are available at https://github.com/EngiBench/EngiBench.
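Here is a minimal sketch of what a level-by-variant evaluation harness could look like, using the level and variant names from the abstract. Exact-match scoring is a simplification; the released code at the linked repository is authoritative.

```python
from collections import defaultdict

LEVELS = ["knowledge_retrieval", "contextual_reasoning", "open_ended_modeling"]
VARIANTS = ["original", "perturbed", "knowledge_enhanced", "math_abstraction"]

def evaluate(model, problems):
    """Accuracy per (level, variant) cell. `model` is any callable
    from prompt string to answer string; exact match is a
    simplification of real grading."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p in problems:  # p: dict with level, variant, prompt, answer
        key = (p["level"], p["variant"])
        totals[key] += 1
        hits[key] += model(p["prompt"]).strip() == p["answer"].strip()
    return {k: hits[k] / totals[k] for k in totals}

# Tiny smoke test with a constant "model".
demo = [{"level": LEVELS[0], "variant": VARIANTS[0],
         "prompt": "2 + 2 = ?", "answer": "4"}]
print(evaluate(lambda prompt: "4", demo))
# {('knowledge_retrieval', 'original'): 1.0}
```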
AI Insights
  • Mixtral‑8x7B achieves 94.1% mean accuracy on EngiBench, beating GPT‑4 and Claude‑2.
  • EngiBench’s perturbation, knowledge‑enhancement, and math‑abstraction variants isolate robustness, domain knowledge, and symbolic reasoning.
  • Models that excel at foundational retrieval collapse on open‑ended modeling tasks, revealing a steep performance cliff.
  • Civil‑engineering problems score 5% lower than electrical‑engineering ones, highlighting subfield‑specific gaps.
  • The GitHub repo (github.com/EngiBench/EngiBench) hosts full problem sets, variant scripts, and evaluation code.
  • Suggested reading: “A Survey of Deep Learning for Engineering Applications” and “Robustness and Generalization in Deep Learning: A Review.”
  • EngiBench encourages adding domain‑specific sub‑levels to enrich the evaluation ecosystem.
Managing teams of data scientists
Rochester Institute of Technology
Abstract
Qualitative research offers deep insights into human experiences, but its processes, such as coding and thematic analysis, are time-intensive and laborious. Recent advancements in qualitative data analysis (QDA) tools have introduced AI capabilities, allowing researchers to handle large datasets and automate labor-intensive tasks. However, qualitative researchers have expressed concerns about AI's lack of contextual understanding and its potential to overshadow the collaborative and interpretive nature of their work. This study investigates researchers' preferences among three degrees of delegation of AI in QDA (human-only, human-initiated, and AI-initiated coding) and explores factors influencing these preferences. Through interviews with 16 qualitative researchers, we identified efficiency, ownership, and trust as essential factors in determining the desired degree of delegation. Our findings highlight researchers' openness to AI as a supportive tool while emphasizing the importance of human oversight and transparency in automation. Based on the results, we discuss three factors of trust in AI for QDA and potential ways to strengthen collaborative efforts in QDA and decrease bias during analysis.
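The three degrees of delegation map naturally onto a small routing function. The sketch below is a hypothetical illustration of the design space, not the study's apparatus, and it keeps the researcher as the final arbiter in every mode.

```python
from enum import Enum

class Delegation(Enum):
    HUMAN_ONLY = "human-only"            # researcher assigns every code
    HUMAN_INITIATED = "human-initiated"  # AI suggests only when asked
    AI_INITIATED = "ai-initiated"        # AI drafts first, human reviews

def propose_codes(segment, mode, human_code, ai_suggest):
    """Candidate codes for one transcript segment under a given
    degree of delegation; the researcher always reviews the result."""
    if mode is Delegation.HUMAN_ONLY:
        return [human_code(segment)]
    if mode is Delegation.HUMAN_INITIATED:
        # Human passes first; AI is consulted as a second opinion.
        return [human_code(segment)] + ai_suggest(segment)
    return ai_suggest(segment)  # AI drafts, human curates afterwards

# Usage with trivial stand-ins for the two coders:
print(propose_codes(
    "I worry the tool misses the interview's context.",
    Delegation.AI_INITIATED,
    human_code=lambda s: "frustration",
    ai_suggest=lambda s: ["trust", "context-loss"],
))
```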
AI for Data Science Management
Abstract
Climate data science faces persistent barriers stemming from the fragmented nature of data sources, heterogeneous formats, and the steep technical expertise required to identify, acquire, and process datasets. These challenges limit participation, slow discovery, and reduce the reproducibility of scientific workflows. In this paper, we present a proof of concept for addressing these barriers through the integration of a curated knowledge graph (KG) with AI agents designed for cloud-native scientific workflows. The KG provides a unifying layer that organizes datasets, tools, and workflows, while AI agents -- powered by generative AI services -- enable natural language interaction, automated data access, and streamlined analysis. Together, these components drastically lower the technical threshold for engaging in climate data science, enabling non-specialist users to identify and analyze relevant datasets. By leveraging existing cloud-ready API data portals, we demonstrate that "a knowledge graph is all you need" to unlock scalable and agentic workflows for scientific inquiry. The open-source design of our system further supports community contributions, ensuring that the KG and associated tools can evolve as a shared commons. Our results illustrate a pathway toward democratizing access to climate data and establishing a reproducible, extensible framework for human--AI collaboration in scientific research.
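A toy illustration of the core lookup: a knowledge graph, even one reduced to in-memory triples, lets an agent resolve a plain-language variable to candidate datasets and their access endpoints before issuing any data request. The triples and URLs below are invented for the example, not drawn from the paper's curated KG.

```python
# A "knowledge graph" reduced to in-memory (subject, predicate, object)
# triples; dataset names and URLs are illustrative only.
TRIPLES = [
    ("ERA5", "measures", "2m_temperature"),
    ("ERA5", "served_by", "https://portal.example.org/era5"),
    ("GPM", "measures", "precipitation"),
    ("GPM", "served_by", "https://portal.example.org/gpm"),
]

def datasets_for(variable: str) -> dict:
    """Map a plain-language variable to candidate datasets and the
    endpoints an agent would call to fetch them."""
    hits = {s for s, p, o in TRIPLES if p == "measures" and o == variable}
    return {s: o for s, p, o in TRIPLES if p == "served_by" and s in hits}

print(datasets_for("precipitation"))
# {'GPM': 'https://portal.example.org/gpm'}
```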

Interests not found

We did not find any papers matching the interests below. Try other terms, and consider whether the content exists on arxiv.org.
  • Managing tech teams
  • Data Science Engineering Management
  • Data Science Management
  • AI for Data Science Engineering
You can edit or add more interests any time.
