Fakultät für Mathematik
Abstract
Black-box optimization (BBO) addresses problems where objectives are
accessible only through costly queries without gradients or explicit structure.
Classical derivative-free methods -- line search, direct search, and
model-based solvers such as Bayesian optimization -- form the backbone of BBO,
yet often struggle in high-dimensional, noisy, or mixed-integer settings.
Recent advances use machine learning (ML) and reinforcement learning (RL) to
enhance BBO: ML provides expressive surrogates, adaptive updates, meta-learning
portfolios, and generative models, while RL enables dynamic operator
configuration, robustness, and meta-optimization across tasks.
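
To make the query-only setting concrete, the following minimal Python sketch estimates a gradient from black-box function evaluations alone, using a two-point Gaussian-smoothing scheme, and feeds it into plain descent. The test function and hyperparameters are illustrative assumptions, not taken from any surveyed method.

import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=20, rng=None):
    # Two-point zeroth-order gradient estimate: average
    # (f(x + mu*u) - f(x)) / mu * u over random Gaussian directions u.
    # Only black-box queries of f are needed, no derivatives.
    rng = np.random.default_rng(0) if rng is None else rng
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - fx) / mu * u
    return g / n_samples

# Gradient descent on a quadratic we may only query point-wise.
f = lambda x: np.sum((x - 1.0) ** 2)
x = np.zeros(5)
for _ in range(200):
    x -= 0.05 * zo_gradient(f, x)
print(x)  # converges toward the all-ones optimum
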
This paper surveys these developments, covering representative algorithms
such as neural networks within the modular model-based optimization framework (mlrMBO),
zeroth-order adaptive momentum methods (ZO-AdaMM; sketched below), automated BBO (ABBO),
distributed block-wise optimization (DiBB), partition-based Bayesian
optimization (SPBOpt), the transformer-based optimizer (B2Opt),
diffusion-model-based BBO, surrogate-assisted RL for differential evolution
(Surr-RLDE), robust BBO (RBO), coordinate-ascent model-based optimization with
relative entropy (CAS-MORE), log-barrier stochastic gradient descent (LB-SGD),
policy improvement with black-box optimization (PIBB), and offline Q-learning with Mamba
backbones (Q-Mamba).
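
Among these, ZO-AdaMM illustrates how ML-style adaptive updates carry over to the gradient-free setting. The Python sketch below pairs a two-point zeroth-order gradient estimate with Adam-style moment updates; it is a simplified illustration under our own hyperparameter choices, not the published algorithm, which adds further safeguards (e.g., AMSGrad-style second-moment handling).

import numpy as np

def zo_adam(f, x0, steps=300, lr=0.05, beta1=0.9, beta2=0.999,
            eps=1e-8, mu=1e-3, n_samples=20, seed=0):
    # Adam-style update driven by two-point zeroth-order gradient
    # estimates (simplified sketch in the spirit of ZO-AdaMM).
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    m = np.zeros_like(x)  # first-moment estimate
    v = np.zeros_like(x)  # second-moment estimate
    for t in range(1, steps + 1):
        fx = f(x)
        g = np.zeros_like(x)
        for _ in range(n_samples):
            u = rng.standard_normal(x.size)
            g += (f(x + mu * u) - fx) / mu * u  # two-point estimate
        g /= n_samples
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

x_opt = zo_adam(lambda x: np.sum((x - 1.0) ** 2), np.zeros(5))
print(x_opt)  # should approach the all-ones optimum
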
We also review benchmark efforts such as the NeurIPS 2020 BBO Challenge and
the MetaBox framework. Overall, we highlight how ML and RL transform classical
inexact solvers into more scalable, robust, and adaptive frameworks for
real-world optimization.
Sorbonne Université, CNRS
Abstract
This work evaluates state-of-the-art convolution algorithms for CPU-based
deep learning inference. While most prior studies focus on GPUs or NPUs, CPU
implementations remain relatively underoptimized. We benchmark direct,
GEMM-based, and Winograd convolutions across modern CPUs from ARM, Intel,
AMD, Apple, and Nvidia, considering both latency and energy
efficiency. Our results highlight the key architectural factors that govern CPU
efficiency for convolution operations, providing practical guidance for
energy-aware embedded deployment. As a main result of this work, the Nvidia
AGX Orin combined with the GEMM algorithm achieves the best trade-off between
inference latency and energy consumption.
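
As an illustration of the GEMM-based approach evaluated here, the Python sketch below lowers input patches with im2col and computes the convolution as a single matrix multiply. Shapes, stride, and padding are illustrative assumptions; this is a minimal reference implementation, not one of the tuned kernels we benchmark.

import numpy as np

def conv2d_gemm(x, w):
    # GEMM-based 2D convolution (im2col): gather each receptive-field
    # patch into a column, then compute all outputs with one matmul.
    # x: (C_in, H, W) input; w: (C_out, C_in, KH, KW) filters.
    # Valid padding, stride 1.
    c_in, h, wd = x.shape
    c_out, _, kh, kw = w.shape
    oh, ow = h - kh + 1, wd - kw + 1
    cols = np.empty((c_in * kh * kw, oh * ow))
    idx = 0
    for i in range(oh):
        for j in range(ow):
            cols[:, idx] = x[:, i:i + kh, j:j + kw].ravel()
            idx += 1
    # Single GEMM: (C_out, C_in*KH*KW) @ (C_in*KH*KW, OH*OW).
    out = w.reshape(c_out, -1) @ cols
    return out.reshape(c_out, oh, ow)

x = np.random.randn(3, 8, 8)
w = np.random.randn(4, 3, 3, 3)
y = conv2d_gemm(x, w)
# Sanity check one output element against direct convolution.
assert np.isclose(y[0, 0, 0], (w[0] * x[:, 0:3, 0:3]).sum())
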