Abstract
Many have observed that the development and deployment of generative machine
learning (ML) and artificial intelligence (AI) models follow a distinctive
pattern in which pre-trained models are adapted and fine-tuned for specific
downstream tasks. However, there is limited empirical work that examines the
structure of these adaptation relationships. This paper analyzes 1.86 million
models on
Hugging Face, a leading peer production platform for model development. Our
study of model family trees -- networks that connect fine-tuned models to their
base or parent models -- reveals sprawling fine-tuning lineages that vary widely in
size and structure. Adopting an evolutionary biology lens to study ML models,
we use model metadata and model cards to measure the genetic similarity and
mutation of traits across model families. We find that models tend to exhibit a
family resemblance, meaning their genetic markers and traits exhibit more
overlap when they belong to the same model family. However, these similarities
depart in certain ways from standard models of asexual reproduction, because
mutations are fast and directed, such that two `sibling' models tend to exhibit
more similarity than parent/child pairs. Further analysis of the directional
drifts of these mutations reveals qualitative insights about the open machine
learning ecosystem: licenses counterintuitively drift from restrictive,
commercial licenses towards permissive or copyleft licenses, often in violation
of the upstream licenses' terms; models evolve from multilingual compatibility
towards English-only compatibility; and model cards become shorter and more
standardized, turning more often to templates and automatically generated
text. Overall, this work takes a step toward an empirically grounded
understanding of model fine-tuning and suggests that ecological models and
methods can yield novel scientific insights.
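
As a rough illustration of the kind of lineage measurement described above (a
minimal sketch, not the paper's actual pipeline), the snippet below follows a
declared fine-tuning edge on the Hugging Face Hub and compares repository tags
as a crude proxy for shared genetic markers. It assumes the huggingface_hub
Python package and the common convention that fine-tuned repositories declare
their parent in the base_model field of the model card; the repository id is a
placeholder.

```python
# Illustrative sketch only: walk one declared fine-tuning edge on the Hugging
# Face Hub and compare "traits" (here, repo tags) with a Jaccard similarity.
from huggingface_hub import HfApi, ModelCard

api = HfApi()

def parent_of(repo_id: str):
    """Return the base model declared in `repo_id`'s model card, if any."""
    card = ModelCard.load(repo_id)
    base = card.data.to_dict().get("base_model")
    # `base_model` may be a string or a list; take the first entry if a list.
    if isinstance(base, list):
        base = base[0] if base else None
    return base

def tag_similarity(repo_a: str, repo_b: str) -> float:
    """Jaccard overlap of Hub tags, a crude stand-in for trait similarity."""
    tags_a = set(api.model_info(repo_a).tags or [])
    tags_b = set(api.model_info(repo_b).tags or [])
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0

# Hypothetical child repository; replace with a real repo id to try it out.
child = "some-user/some-finetuned-model"
parent = parent_of(child)
if parent:
    print(f"{child} -> {parent}: tag similarity {tag_similarity(child, parent):.2f}")
```
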
Abstract
A large body of research has employed Machine Learning (ML) models to develop
learned operating systems (OSes) and kernels. These systems dynamically adapt
to the job load and adjust the allocation of resources (CPU, I/O, memory,
network bandwidth) to respond to actual user demand. What these efforts have in
common is that they use ML to improve kernel decisions. To this day, and
to the best of our knowledge, no work has taken the opposite direction, i.e.,
using the OS to improve ML. While some works propose applying system-level
optimizations to ML algorithms, they do not tailor the OS to the ML
context. To address this limitation, we take an orthogonal approach in this
paper by leveraging the OS to enhance the performance of ML models and
algorithms. We explore the path towards an ML-specialized OS, MaLV-OS. MaLV-OS
rethinks the OS architecture to tailor it specifically to ML workloads,
especially in virtualized clouds, which are now widely used to run ML
applications. The envisioned MaLV-OS architecture includes (1) a micro-kernel,
Micro-LAKE, which allows kernel-space applications to use the GPU, and (2) an
MLaaS (ML as a Service) subsystem that gathers ML models to help Micro-LAKE
with memory management and CPU scheduling. The MaLV-OS architecture also
offloads system-sensitive parts of models to the OS, to reduce model complexity
and programming effort and to speed up execution. Finally, MaLV-OS integrates
open-source GPU virtualization software, merged directly into the hypervisor.
For more flexibility, the MaLV-OS vision is to enable the virtual machine to
dynamically select the MLaaS policies that can improve the performance of the
model the user is running. Because the MLaaS subsystem is designed as a set of
loadable kernel modules, the MaLV-OS architecture supports the dynamic addition
of new capabilities to that subsystem.
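
The abstract does not specify the MLaaS programming interface, so the following
is only a conceptual, user-space sketch of the pluggable-policy idea (all names
are hypothetical, and the real subsystem would be implemented as kernel modules,
not Python): policies live in a registry analogous to loadable modules, and a
guest virtual machine selects the one that best fits the model it is running.

```python
# Conceptual illustration of dynamically selectable MLaaS policies.
# Hypothetical names throughout; this is not MaLV-OS code.
from typing import Callable, Dict

# Registry of policies, analogous to loadable kernel modules.
POLICIES: Dict[str, Callable[[dict], dict]] = {}

def register_policy(name: str):
    """Decorator that 'loads' a policy into the MLaaS registry."""
    def wrap(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        POLICIES[name] = fn
        return fn
    return wrap

@register_policy("gpu-first-memory")
def gpu_first_memory(workload: dict) -> dict:
    # Toy memory-management decision: keep activations on the GPU if they fit.
    fits = workload["activation_mb"] <= workload["gpu_free_mb"]
    return {"placement": "gpu" if fits else "host", "pin_host_memory": not fits}

@register_policy("batch-aware-scheduling")
def batch_aware_scheduling(workload: dict) -> dict:
    # Toy CPU-scheduling decision: give large batches more CPU shares.
    return {"cpu_shares": 4 if workload["batch_size"] >= 64 else 1}

def apply_policy(name: str, workload: dict) -> dict:
    """What a guest VM's run-time selection of an MLaaS policy boils down to."""
    return POLICIES[name](workload)

print(apply_policy("gpu-first-memory",
                   {"activation_mb": 900, "gpu_free_mb": 2048, "batch_size": 32}))
```
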