Machine Learning Resilience

Robustness and resilience of dynamical networks in biology and epidemiology

University of Trento

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
Natural systems are remarkably robust and resilient, maintaining essential functions despite variability, uncertainty, and hostile conditions. Understanding these nonlinear, dynamic behaviours is challenging because such systems involve many interacting parameters, yet it is crucial for explaining processes from cellular regulation to disease onset and epidemic spreading. Robustness and resilience describe a system's ability to preserve and recover desired behaviours in the presence of intrinsic and extrinsic fluctuations. This survey reviews how different disciplines define these concepts, examines methods for assessing whether key properties of uncertain, networked dynamical systems are structural (parameter-free) or robust (preserved for parameter variations within an uncertainty bounding set), and discusses integrated structural and probabilistic techniques for biological and epidemiological models. The text introduces formal definitions of resilience for families of systems obtained by adding stochastic perturbations to a nominal deterministic model, enabling a probabilistic characterisation of the ability to remain within or return to a prescribed attractor. These definitions generalise probabilistic robustness and shed new light on classical biological examples. In addition, the survey summarises resilience indicators and data-driven tools for detecting resilience loss and regime shifts, drawing on bifurcation analysis to anticipate qualitative changes in system behaviour. Together, these methodologies support the study and control of complex natural systems, guiding the design of biomolecular feedback architectures, the identification of therapeutic targets, the forecasting and management of epidemics, and the detection of tipping points in ecological and biological networks.

AI Summary

Graphs are versatile tools to describe complex interplay of interactions giving rise to biological and epidemiological systems. [3]
Hyper-graphs and hyper-edges represent flow graphs for chemical reactions and can be used to capture the topology of biological networks. [3]
Multi-scale models bridge the microscopic in-host scale and the macroscopic between-host scale, studying how immunological mechanisms affect epidemiological mechanisms. [2]
Epidemic models can be interpreted as chemical reaction systems using mass action kinetics. [1]

Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness

University of South

Rate paper: 👍 👎 ♥ Save

Abstract
Adversarial training is an effective method to improve the machine learning (ML) model robustness. Most existing studies typically consider the Rectified linear unit (ReLU) activation function and centralized training environments. In this paper, we study the ML model robustness using ten different activation functions through adversarial training in centralized environments and explore the ML model robustness in federal learning environments. In the centralized environment, we first propose an advanced adversarial training approach to improving the ML model robustness by incorporating model architecture change, soft labeling, simplified data augmentation, and varying learning rates. Then, we conduct extensive experiments on ten well-known activation functions in addition to ReLU to better understand how they impact the ML model robustness. Furthermore, we extend the proposed adversarial training approach to the federal learning environment, where both independent and identically distributed (IID) and non-IID data settings are considered. Our proposed centralized adversarial training approach achieves a natural and robust accuracy of 77.08% and 67.96%, respectively on CIFAR-10 against the fast gradient sign attacks. Experiments on ten activation functions reveal ReLU usually performs best. In the federated learning environment, however, the robust accuracy decreases significantly, especially on non-IID data. To address the significant performance drop in the non-IID data case, we introduce data sharing and achieve the natural and robust accuracy of 70.09% and 54.79%, respectively, surpassing the CalFAT algorithm, when 40% data sharing is used. That is, a proper percentage of data sharing can significantly improve the ML model robustness, which is useful to some real-world applications.

AI Summary

The proposed federated adversarial training (AT) approach aims to improve the robustness of machine learning models against various attacks, including FGSM, C&W, and DeepFool. [3]
The approach involves augmenting local data with adversarial examples generated using PGD and Gaussian noise, and using soft labels instead of hard labels during the federated AT phase. [3]
The choice of activation function is critical in improving the ML model robustness through AT, and ReLU was found to be the best choice for maintaining both robust and natural accuracy. [3]
The proposed approach can support various activation functions without modification to the model architecture. [3]
The federated AT process begins with a federated AT phase followed by a test phase, where the global model is tested for its robustness against various adversarial attacks and noise. [3]
Gaussian Noise: Random noise generated using the Gaussian distribution. [3]
The effectiveness of the data sharing approach in increasing both natural and robust accuracy compared to the same non-IID data settings without data sharing is demonstrated. [2]

Machine Learning Testing

Evaluating A/B Testing Methodologies via Sample Splitting: Theory and Practice

Amazon

Rate paper: 👍 👎 ♥ Save

Abstract
We develop a theoretical framework for sample splitting in A/B testing environments, where data for each test are partitioned into two splits to measure methodological performance when the true impacts of tests are unobserved. We show that sample-split estimators are generally biased for full-sample performance but consistently estimate sample-split analogues of it. We derive their asymptotic distributions, construct valid confidence intervals, and characterize the bias-variance trade-offs underlying sample-split design choices. We validate our theoretical results through simulations and provide implementation guidance for A/B testing products seeking to evaluate new estimators and decision rules.

AI Summary

The authors demonstrate that the Bayes estimator relative to unbiased and the Bayes decision rule relative to frequentist estimator under squared error decision rule under launch-only decision value exhibit different behavior in terms of mean squared error and bias for ideal estimand. [3]
Average performance: The expected value of the sample-split estimator over multiple tests. [3]
Bias-variance tradeoff: The relationship between bias and variance in statistical estimation, where a decrease in one leads to an increase in the other. [3]
The results demonstrate the importance of considering both bias and variance in statistical estimation, as well as the tradeoffs involved in choosing the split fraction α. [3]
The paper presents a framework for estimating the average performance of a sample-split estimator in a multi-test setting. [2]

Robust hyperentanglement self testing

Nanjing University of A

Rate paper: 👍 👎 ♥ Save

Abstract
Hyperentanglement, which refers to entanglement encoded in two or more independent degrees of freedom (DOFs), is a valuable resource for the future high-capacity quantum network. Certifying hyperentanglement sources work as intended is critical for the hyperentanglement-based quantum information tasks. Self testing is the strongest certification method for quantum state and measurement under minimal assumptions, even without any knowledge of the devices' inner workings. However, the existing self testing protocols all focus on one-DOF entanglement, which cannot self test the multi-DOF entanglement. In the paper, we propose a hyperentanglement self testing framework. We take the self testing for the polarization-spatial-mode hyperentangled Bell states as an example. The self testing is based on the violation of two-dimension CHSH test in each DOF independently. The two-step swap isometry circuits are proposed for self testing the entanglement in spatial-mode and polarization DOFs, respectively. All the sixteen polarization-spatial-mode hyperentangled Bell states can be self tested. Our hyperentanglement self testing framework has three advantages. First, it is a general hyperentanglement self testing framework, and can be extended to self test multi-DOF hyperentanglement and multipartite hyperentanglement. Second, it can provide the robust hyperentanglement self testing and establish the relation between the lower bound of fidelity and the imperfect violation of Bell-like inequality in each DOF. Third, it is feasible with current experimental technology. Our hyperentanglement self testing framework provides a promising way to certify complex hyperentanglement sources, and has potential application in future high-capacity quantum network.

AI Summary

Quantum entanglement is a fundamental concept in quantum mechanics that describes the interconnectedness of particles. [3]
Quantum entanglement: The phenomenon where two or more particles become correlated in such a way that their properties cannot be described independently of each other. [3]
Device-independent certification: A process for certifying the properties of a quantum system based solely on its behavior in experiments, without requiring any information about the internal workings of the device. [3]
The complexity of self-testing methods can make them difficult to implement in practice. [2]
Self-testing is a method for verifying the properties of a quantum system without direct access to it, relying on correlations between measurement outcomes. [1]

Fault tolerance

Provably Safe Model Updates

Imperial College London

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
Safety-critical environments are inherently dynamic. Distribution shifts, emerging vulnerabilities, and evolving requirements demand continuous updates to machine learning models. Yet even benign parameter updates can have unintended consequences, such as catastrophic forgetting in classical models or alignment drift in foundation models. Existing heuristic approaches (e.g., regularization, parameter isolation) can mitigate these effects but cannot certify that updated models continue to satisfy required performance specifications. We address this problem by introducing a framework for provably safe model updates. Our approach first formalizes the problem as computing the largest locally invariant domain (LID): a connected region in parameter space where all points are certified to satisfy a given specification. While exact maximal LID computation is intractable, we show that relaxing the problem to parameterized abstract domains (orthotopes, zonotopes) yields a tractable primal-dual formulation. This enables efficient certification of updates - independent of the data or algorithm used - by projecting them onto the safe domain. Our formulation further allows computation of multiple approximately optimal LIDs, incorporation of regularization-inspired biases, and use of lookahead data buffers. Across continual learning and foundation model fine-tuning benchmarks, our method matches or exceeds heuristic baselines for avoiding forgetting while providing formal safety guarantees.

AI Summary

Fine-tuning: The process of adapting a pre-trained model to a specific task or dataset. [3]
The method may slightly degrade downstream accuracy. [3]
The paper presents a method for achieving formal guarantees of forgetting and alignment preservation in neural networks, particularly in the context of fine-tuning pre-trained language models (LLMs). [2]

Minimizing the Number of Code Switching Operations in Fault-Tolerant Quantum Circuits

Technical University of M

Rate paper: 👍 👎 ♥ Save

Abstract
Fault-tolerant quantum computers rely on Quantum Error-Correcting Codes (QECCs) to protect information from noise. However, no single error-correcting code supports a fully transversal and therefore fault-tolerant implementation of all gates required for universal quantum computation. Code switching addresses this limitation by moving quantum information between different codes that, together, support a universal gate set. Unfortunately, each switch is costly-adding time and space overhead and increasing the logical error rate. Minimizing the number of switching operations is, therefore, essential for quantum computations using code switching. In this work, we study the problem of minimizing the number of code switches required to run a given quantum circuit. We show that this problem can be solved efficiently in polynomial time by reducing it to a minimum-cut instance on a graph derived from the circuit. Our formulation is flexible and can incorporate additional considerations, such as reducing depth overhead by preferring switches during idle periods or biasing the compilation to favor one code over another. To the best of our knowledge, this is the first automated approach for compiling and optimizing code-switching-based quantum computations at the logical level.

Data Science Development Environment and Productivity

Integrative Analysis of Risk Management Methodologies in Data Science Projects

UnB

Rate paper: 👍 👎 ♥ Save

Abstract
Data science initiatives frequently exhibit high failure rates, driven by technical constraints, organizational limitations and insufficient risk management practices. Challenges such as low data maturity, lack of governance, misalignment between technical and business teams, and the absence of structured mechanisms to address ethical and sociotechnical risks have been widely identified in the literature. In this context, the purpose of this study is to conduct a comparative analysis of the main risk management methodologies applied to data science projects, aiming to identify, classify, and synthesize their similarities, differences and existing gaps. An integrative literature review was performed using indexed databases and a structured protocol for selection and content analysis. The study examines widely adopted risk management standards ISO 31000, PMBOK Risk Management and NIST RMF, as well as frameworks specific to data science workflows, such as CRISP DM and the recently proposed DS EthiCo RMF, which incorporates ethical and sociotechnical dimensions into the project life cycle. The findings reveal that traditional approaches provide limited coverage of emerging risks, whereas contemporary models propose multidimensional structures capable of integrating ethical oversight, governance and continuous monitoring. As a contribution, this work offers theoretical support for the development of hybrid frameworks that balance technical efficiency, organizational alignment and responsible data practices, while highlighting research gaps that can guide future investigations.

AI Summary

Data Science: The study of extracting insights from large datasets using various techniques, including machine learning, statistics, and visualization. [3]
A systematic literature review was conducted to analyze the application of risk management frameworks in data science projects. [2]

Enterprise Data Science Platform: A Unified Architecture for Federated Data Access

Waseda University

Rate paper: 👍 👎 ♥ Save

Abstract
Organizations struggle to share data across departments that have adopted different data analytics platforms. If n datasets must serve m environments, up to n*m replicas can emerge, increasing inconsistency and cost. Traditional warehouses copy data into vendor-specific stores; cross-platform access is hard. This study proposes the Enterprise Data Science Platform (EDSP), which builds on data lakehouse architecture and follows a Write-Once, Read-Anywhere principle. EDSP enables federated data access for multi-query engine environments, targeting data science workloads with periodic data updates and query response times ranging from seconds to minutes. By providing centralized data management with federated access from multiple query engines to the same data sources, EDSP eliminates data duplication and vendor lock-in inherent in traditional data warehouses. The platform employs a four-layer architecture: Data Preparation, Data Store, Access Interface, and Query Engines. This design enforces separation of concerns and reduces the need for data migration when integrating additional analytical environments. Experimental results demonstrate that major cloud data warehouses and programming environments can directly query EDSP-managed datasets. We implemented and deployed EDSP in production, confirming interoperability across multiple query engines. For data sharing across different analytical environments, EDSP achieves a 33-44% reduction in operational steps compared with conventional approaches requiring data migration. Although query latency may increase by up to a factor of 2.6 compared with native tables, end-to-end completion times remain on the order of seconds, maintaining practical performance for analytical use cases. Based on our production experience, EDSP provides practical design guidelines for addressing the data-silo problem in multi-query engine environments.

AI Summary

{ "title": "Enterprise Data Science Platform (EDSP)", "description": "A unified data management architecture that addresses data management challenges in multi-query engine environments." } { "term": "Write-Once, Read-Anywhere", "definition": "A principle that enables data to be written once and read from multiple query engines without replication or duplication." } { "title": "EDSP Demonstrates Practical Solution to Data Silos in Multi-Query Engine Enterprises" , "description": "The Enterprise Data Science Platform (EDSP) demonstrates that the Write-Once, Read-Anywhere principle can be realized in production environments, offering a practical solution to the long-standing problem of data silos in multi-query engine enterprises." } { "title": "Limited Performance Validation" , "description": "Future work includes performance validation on TB-scale datasets." } { "title": "Data Lake Architectures and Metadata Management" , "description": "The paper references a study on data lake architectures and metadata management, highlighting the importance of metadata in data sharing across heterogeneous query engines." } The paper proposes the Enterprise Data Science Platform (EDSP), a unified data management architecture grounded in the Write-Once, Read-Anywhere principle, to address data management challenges in multi-query engine environments. [2]

Machine Learning Infrastructure

milearn: A Python Package for Multi-Instance Machine Learning

Aix Marseille University

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
We introduce milearn, a Python package for multi-instance learning (MIL) that follows the familiar scikit-learn fit/predict interface while providing a unified framework for both classical and neural-network-based MIL algorithms for regression and classification. The package also includes built-in hyperparameter optimization designed specifically for small MIL datasets, enabling robust model selection in data-scarce scenarios. We demonstrate the versatility of milearn across a broad range of synthetic MIL benchmark datasets, including digit classification and regression, molecular property prediction, and protein-protein interaction (PPI) prediction. Special emphasis is placed on the key instance detection (KID) problem, for which the package provides dedicated support.

AI Summary

The paper discusses a Python package called milearn that provides a unified and flexible framework for multi-instance learning (MIL). [3]
Milearn integrates traditional and modern MIL algorithms, key instance detection, and stepwise hyperparameter optimization within an interface consistent with the scikit-learn ecosystem. [3]
Experiments on diverse benchmarks demonstrate milearn's ability to support both accurate bag-level prediction and interpretable instance-level insight across various application domains. [3]
Multi-instance learning (MIL): A type of machine learning where each instance is a bag of features, and the goal is to learn from these bags. [3]
Key instance detection: The process of identifying the most relevant instances within a bag that contribute to the overall classification or regression task. [3]
Stepwise hyperparameter optimization: An approach to optimizing model parameters by iteratively adjusting them based on performance metrics. [3]
Experiments demonstrate milearn's effectiveness in supporting accurate bag-level prediction and interpretable instance-level insight across diverse domains. [3]
The paper also discusses other related works in the field of MIL, including deep learning methods, attention-based networks, and multi-resolution models. [2]

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

CAS

Rate paper: 👍 👎 ♥ Save

Abstract
Real-world enterprise data intelligence workflows encompass data engineering that turns raw sources into analytical-ready tables and data analysis that convert those tables into decision-oriented insights. We introduce DAComp, a benchmark of 210 tasks that mirrors these complex workflows. Data engineering (DE) tasks require repository-level engineering on industrial schemas, including designing and building multi-stage SQL pipelines from scratch and evolving existing systems under evolving requirements. Data analysis (DA) tasks pose open-ended business problems that demand strategic planning, exploratory analysis through iterative coding, interpretation of intermediate results, and the synthesis of actionable recommendations. Engineering tasks are scored through execution-based, multi-metric evaluation. Open-ended tasks are assessed by a reliable, experimentally validated LLM-judge, which is guided by hierarchical, meticulously crafted rubrics. Our experiments reveal that even state-of-the-art agents falter on DAComp. Performance on DE tasks is particularly low, with success rates under 20%, exposing a critical bottleneck in holistic pipeline orchestration, not merely code generation. Scores on DA tasks also average below 40%, highlighting profound deficiencies in open-ended reasoning and demonstrating that engineering and analysis are distinct capabilities. By clearly diagnosing these limitations, DAComp provides a rigorous and realistic testbed to drive the development of truly capable autonomous data agents for enterprise settings. Our data and code are available at https://da-comp.github.io

AI Summary

DAComp is a comprehensive benchmark designed to evaluate data agents across the full data intelligence lifecycle. [3]
The benchmark aims to steer the community beyond mere technical accuracy, driving the evolution of truly autonomous and capable data agents for the enterprise. [3]
Data Agent (DA): An LLM-driven autonomous system that plans and executes end-to-end workflows, acquiring, transforming, and analyzing data via tool use and code execution to achieve user-defined objectives. [3]
LLM: Large Language Model DAComp: Data Agent Comprehensive Benchmark DAComp is a rigorous standard for evaluating data agents, bridging the gap between isolated code generation and real-world enterprise demands. [3]
The benchmark includes two testbeds: DAComp-DE for repository-level pipeline orchestration and DAComp-DA for open-ended analytical reasoning. [2]

Online inference

Beyond Bayesian Inference: The Correlation Integral Likelihood Framework and Gradient Flow Methods for Deterministic Sampling

University of Warsaw

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
Calibrating mathematical models of biological processes is essential for achieving predictive accuracy and gaining mechanistic insight. However, this task remains challenging due to limited and noisy data, significant biological variability, and the computational complexity of the models themselves. In this method's article, we explore a range of approaches for parameter inference in partial differential equation (PDE) models of biological systems. We introduce a unified mathematical framework, the Correlation Integral Likelihood (CIL) method, for parameter estimation in systems exhibiting heterogeneous or chaotic dynamics, encompassing both pattern formation models and individual-based models. Departing from classical Bayesian inverse problem methodologies, we motivate the development of the CIL method, demonstrate its versatility, and highlight illustrative applications within mathematical biology. Furthermore, we compare stochastic sampling strategies, such as Markov Chain Monte Carlo (MCMC), with deterministic gradient flow approaches, highlighting how these methods can be integrated within the proposed framework to enhance inference performance. Our work provides a practical and theoretically grounded toolbox for researchers seeking to calibrate complex biological models using incomplete, noisy, or heterogeneous data, thereby advancing both the predictive capability and mechanistic understanding of such systems.

AI Summary

The authors highlight the limitations of traditional stochastic samplers, such as slow mixing and poor scalability in high dimensions, and introduce deterministic approaches that leverage the geometry of the underlying distribution to guide samples along trajectories that minimize a chosen divergence measure. [2]
The paper discusses various methods for sampling and parameter estimation in Bayesian inference, including Monte Carlo methods, gradient flow approaches, and deterministic particle-based approximations. [1]

AutoNeural: Co-Designing Vision-Language Models for NPU Inference

Nexa AI

Rate paper: 👍 👎 ♥ Save

Abstract
While Neural Processing Units (NPUs) offer high theoretical efficiency for edge AI, state-of-the-art Vision--Language Models (VLMs) tailored for GPUs often falter on these substrates. We attribute this hardware-model mismatch to two primary factors: the quantization brittleness of Vision Transformers (ViTs) and the I/O-bound nature of autoregressive attention mechanisms, which fail to utilize the high arithmetic throughput of NPUs. To bridge this gap, we propose AutoNeural, an NPU-native VLM architecture co-designed for integer-only inference. We replace the standard ViT encoder with a MobileNetV5-style backbone utilizing depthwise separable convolutions, which ensures bounded activation distributions for stable INT4/8/16 quantization. Complementing this, our language backbone integrates State-Space Model (SSM) principles with Transformer layers, employing efficient gated convolutions to achieve linear-time complexity. This hybrid design eliminates the heavy memory I/O overhead of Key-Value caching during generation. Our approach delivers substantial efficiency gains, reducing quantization error of vision encoder by up to 7x and end-to-end latency by 14x compared to conventional baselines. The AutoNeural also delivers 3x decoding speed and 4x longer context window than the baseline. We validate these improvements via a real-world automotive case study on the Qualcomm SA8295P SoC, demonstrating real-time performance for cockpit applications. Our results highlight that rethinking model topology specifically for NPU constraints is a prerequisite for robust multi-modal edge intelligence.

Data Science Development Tools

A Datalake for Data-driven Social Science Research

Ashoka University

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
Social science research increasingly demands data-driven insights, yet researchers often face barriers such as lack of technical expertise, inconsistent data formats, and limited access to reliable datasets.Social science research increasingly demands data-driven insights, yet researchers often face barriers such as lack of technical expertise, inconsistent data formats, and limited access to reliable datasets. In this paper, we present a Datalake infrastructure tailored to the needs of interdisciplinary social science research. Our system supports ingestion and integration of diverse data types, automatic provenance and version tracking, role-based access control, and built-in tools for visualization and analysis. We demonstrate the utility of our Datalake using real-world use cases spanning governance, health, and education. A detailed walkthrough of one such use case -- analyzing the relationship between income, education, and infant mortality -- shows how our platform streamlines the research process while maintaining transparency and reproducibility. We argue that such infrastructure can democratize access to advanced data science practices, especially for NGOs, students, and grassroots organizations. The Datalake continues to evolve with plans to support ML pipelines, mobile access, and citizen data feedback mechanisms.

AI Summary

The Datalake is a cloud-based platform that enables users to perform simple and complex analytical tasks on multiple datasets. [3]
It provides an end-to-end solution for data management, including provenance and version tracking. [3]
The ease of use of the Datalake is key in democratizing access to data and good data science practices. [3]
Datalake: A cloud-based platform that enables users to perform simple and complex analytical tasks on multiple datasets. [3]
Provenance: The origin or history of a dataset or its components. [3]
Version tracking: The process of keeping track of changes made to a dataset over time. [3]
Funding support by Mphasis AI Lab at Ashoka University. [3]
Access to data and analysis tools are the most important factors in lowering barriers for NGOs, grassroots organizations, and students who may not be well-versed in using computer science tools for data processing. [2]
Limited user engagement with social scientists. [1]

Machine Learning Operations

AI-Driven Optimization under Uncertainty for Mineral Processing Operations

MIT

Rate paper: 👍 👎 ♥ Save

Abstract
The global capacity for mineral processing must expand rapidly to meet the demand for critical minerals, which are essential for building the clean energy technologies necessary to mitigate climate change. However, the efficiency of mineral processing is severely limited by uncertainty, which arises from both the variability of feedstock and the complexity of process dynamics. To optimize mineral processing circuits under uncertainty, we introduce an AI-driven approach that formulates mineral processing as a Partially Observable Markov Decision Process (POMDP). We demonstrate the capabilities of this approach in handling both feedstock uncertainty and process model uncertainty to optimize the operation of a simulated, simplified flotation cell as an example. We show that by integrating the process of information gathering (i.e., uncertainty reduction) and process optimization, this approach has the potential to consistently perform better than traditional approaches at maximizing an overall objective, such as net present value (NPV). Our methodological demonstration of this optimization-under-uncertainty approach for a synthetic case provides a mathematical and computational framework for later real-world application, with the potential to improve both the laboratory-scale design of experiments and industrial-scale operation of mineral processing circuits without any additional hardware.

AI Summary

The authors propose an online solver that uses Monte Carlo tree search to explore the action space and make decisions in real-time. [3]
The online solver proposed by the authors is able to make decisions in real-time using Monte Carlo tree search, which allows for efficient exploration of the action space. [3]
The paper presents a novel approach to optimize mineral processing circuits under uncertainty using Partially Observable Markov Decision Processes (POMDPs). [2]

Refining Machine Learning Potentials through Thermodynamic Theory of Phase Transitions

Technical University of

Rate paper: 👍 👎 ♥ Save

Abstract
Foundational Machine Learning Potentials can resolve the accuracy and transferability limitations of classical force fields. They enable microscopic insights into material behavior through Molecular Dynamics simulations, which can crucially expedite material design and discovery. However, insufficiently broad and systematically biased reference data affect the predictive quality of the learned models. Often, these models exhibit significant deviations from experimentally observed phase transition temperatures, in the order of several hundred kelvins. Thus, fine-tuning is necessary to achieve adequate accuracy in many practical problems. This work proposes a fine-tuning strategy via top-down learning, directly correcting the wrongly predicted transition temperatures to match the experimental reference data. Our approach leverages the Differentiable Trajectory Reweighting algorithm to minimize the free energy differences between phases at the experimental target pressures and temperatures. We demonstrate that our approach can accurately correct the phase diagram of pure Titanium in a pressure range of up to 5 GPa, matching the experimental reference within tenths of kelvins and improving the liquid-state diffusion constant. Our approach is model-agnostic, applicable to multi-component systems with solid-solid and solid-liquid transitions, and compliant with top-down training on other experimental properties. Therefore, our approach can serve as an essential step towards highly accurate application-specific and foundational machine learning potentials.

AI Summary

Machine learning potential: A mathematical model that uses machine learning algorithms to predict the behavior of materials at the atomic level. [3]
Molecular dynamics simulation: A computational method used to study the behavior of molecules over time, often used in conjunction with machine learning potentials. [3]
Chemical potential: The energy associated with adding or removing an atom from a system. [3]
Computational cost: Molecular dynamics simulations can be computationally intensive, making it challenging to apply this method to larger systems or longer timescales. [3]
The authors also discuss the limitations of these models and propose a new approach to improve their accuracy. [3]
The article discusses the development of machine learning potentials for accurate properties and applications to the mechanical response of titanium. [2]
Phase transition: A change in the structure or properties of a material, such as from solid to liquid. [1]

Machine Learning Deployment

Using Machine Learning to Take Stay-or-Go Decisions in Data-driven Drone Missions

University of Thessaly

Rate paper: 👍 👎 ♥ Save

Abstract
Drones are becoming indispensable in many application domains. In data-driven missions, besides sensing, the drone must process the collected data at runtime to decide whether additional action must be taken on the spot, before moving to the next point of interest. If processing does not reveal an event or situation that requires such an action, the drone has waited in vain instead of moving to the next point. If, however, the drone starts moving to the next point and it turns out that a follow-up action is needed at the previous point, it must spend time to fly-back. To take this decision, we propose different machine-learning methods based on branch prediction and reinforcement learning. We evaluate these methods for a wide range of scenarios where the probability of event occurrence changes with time. Our results show that the proposed methods consistently outperform the regression-based method proposed in the literature and can significantly improve the worst-case mission time by up to 4.1x. Also, the achieved median mission time is very close, merely up to 2.7% higher, to that of a method with perfect knowledge of the current underlying event probability at each point of interest.

AI Summary

The paper proposes different machine-learning methods for deciding whether to wait for data processing to finish before moving to the next point of interest or to start this movement while the computation is still being performed. [2]

Machine Learning Validation

Efficient and Intuitive Two-Phase Validation Across Multiple Models via Principal Components

Wake Forest University

Rate paper: 👍 👎 ♥ Save

Abstract
Two-phase sampling offers a cost-effective way to validate error-prone measurements in observational databases or randomized trials. Inexpensive or easy-to-obtain information is collected for the entire study in Phase I. Then, a subset of patients undergoes cost-intensive validation to collect more accurate data in Phase II. Critically, any Phase I variables can be used to strategically select the Phase II subset, often enriched for a particular model of interest. However, when balancing primary and secondary analyses in the same study, competing models and priorities can result in poorly defined objectives for the most informative Phase II sampling criterion. We propose an intuitive, easy-to-use solution that balances and prioritizes explaining the largest amount of variability across all models of interest. Using principal components to succinctly summarize the inherent variability of the error-prone covariates for all models. Then, we sample patients with the most "extreme" principal components (i.e., the smallest or largest values) for validation. Through simulations and an application to data from the National Health and Nutrition Examination Survey (NHANES), we show that extreme tail sampling on the first principal component offers simultaneous efficiency gains across multiple models of interest relative to sampling for one specific model. Our proposed sampling strategy is implemented in the open-source R package, auditDesignR.

AI Summary

The improvement is evident under varying covariance structures, measurement error severities, and validation proportions. [3]
Principal component analysis (PCA): a dimensionality reduction technique that transforms a set of correlated variables into a new set of uncorrelated variables called principal components. [3]
Further research is needed to investigate the efficiency gains with other common outcome types and to modify the approach for error settings beyond exposure measurement error. [3]
The method assumes that the exposures are measured with error and that the outcome is continuous. [3]
Further research is needed to investigate the efficiency gains with other common outcome types. [3]
The proposed sampling strategy improves the overall efficiency of two-phase estimates across multiple models under mild assumptions. [2]
Two-phase analysis: a statistical method used to analyze data from observational studies where the exposure or outcome is measured with error. [1]

Reusing Model Validation Methods for the Continuous Validation of Digital Twins of Cyber-Physical Systems

University of Antwerp

Rate paper: 👍 👎 ♥ Save

Abstract
One of the challenges in twinned systems is ensuring the digital twin remains a valid representation of the system it twins. Depending on the type of twinning occurring, it is either trivial, such as in dashboarding/visualizations that mirror the system with real-time data, or challenging, in case the digital twin is a simulation model that reflects the behavior of a physical twinned system. The challenge in this latter case comes from the fact that in contrast to software systems, physical systems are not immutable once deployed, but instead they evolve through processes like maintenance, wear and tear or user error. It is therefore important to detect when changes occur in the physical system to evolve the twin alongside it. We employ and reuse validation techniques from model-based design for this goal. Model validation is one of the steps used to gain trust in the representativeness of a simulation model. In this work, we provide two contributions: (i) we provide a generic approach that, through the use of validation metrics, is able to detect anomalies in twinned systems, and (ii) we demonstrate these techniques with the help of an academic yet industrially relevant case study of a gantry crane such as found in ports. Treating anomalies also means correcting the error in the digital twin, which we do with a parameter estimation based on the historical data.

Model Monitoring

Model-Based Diagnosis with Multiple Observations: A Unified Approach for C Software and Boolean Circuits

University of Oxford

Rate paper: 👍 👎 ♥ Save

Abstract
Debugging is one of the most time-consuming and expensive tasks in software development and circuit design. Several formula-based fault localisation (FBFL) methods have been proposed, but they fail to guarantee a set of diagnoses across all failing tests or may produce redundant diagnoses that are not subset-minimal, particularly for programs/circuits with multiple faults. This paper introduces CFaults, a novel fault localisation tool for C software and Boolean circuits with multiple faults. CFaults leverages Model-Based Diagnosis (MBD) with multiple observations and aggregates all failing test cases into a unified Maximum Satisfiability (MaxSAT) formula. Consequently, our method guarantees consistency across observations and simplifies the fault localisation procedure. Experimental results on three benchmark sets, two of C programs, TCAS and C-Pack-IPAs, and one of Boolean circuits, ISCAS85, show that CFaults is faster at localising faults in C software than other FBFL approaches such as BugAssist, SNIPER, and HSD. On the ISCAS85 benchmark, CFaults is generally slower than HSD; however, it localises faults in only 6% fewer circuits, demonstrating that it remains competitive in this domain. Furthermore, CFaults produces only subset-minimal diagnoses of faulty statements, whereas the other approaches tend to enumerate redundant diagnoses (e.g., BugAssist and SNIPER).

AI Summary

Unrolling involves expanding the faulty circuit or program using failing input-output observations from the test suite. [3]
Instrumentalization uses relaxation variables to control the activation of each component in the unrolled circuit or program. [3]
The MaxSAT formula is generated by consolidating all failing tests into a unified, unrolled, and instrumentalized circuit or program. [3]
Model-Based Diagnosis (MBD): A theory that aims to identify the minimum set of faulty components required to explain a system's observed behavior. [3]
C Program: A program written in the C programming language. [3]
CFaults is a novel fault localization technique that combines Model-Based Diagnosis (MBD) theory with MaxSAT solving. [2]
The approach consists of four main steps: unrolling, instrumentalization, encoding into MaxSAT, and oracle invocation. [1]

Machine Learning Lifecycle

Research on Milling Machine Predictive Maintenance Based on Machine Learning and SHAP Analysis in Intelligent Manufacturing Environment

Chongqing University of

Rate paper: 👍 👎 ♥ Save

Abstract
In the context of intelligent manufacturing, this paper conducts a series of experimental studies on the predictive maintenance of industrial milling machine equipment based on the AI4I 2020 dataset. This paper proposes a complete predictive maintenance experimental process combining artificial intelligence technology, including six main links: data preprocessing, model training, model evaluation, model selection, SHAP analysis, and result visualization. By comparing and analyzing the performance of eight machine learning models, it is found that integrated learning methods such as XGBoost and random forest perform well in milling machine fault prediction tasks. In addition, with the help of SHAP analysis technology, the influence mechanism of different features on equipment failure is deeply revealed, among which processing temperature, torque and speed are the key factors affecting failure. This study combines artificial intelligence and manufacturing technology, provides a methodological reference for predictive maintenance practice in an intelligent manufacturing environment, and has practical significance for promoting the digital transformation of the manufacturing industry, improving production efficiency and reducing maintenance costs.

Interests not found

Help us improve your experience!