Abstract
We propose and test the LLM Brain Rot Hypothesis: continual exposure to junk
web text induces lasting cognitive decline in large language models (LLMs). To
causally isolate data quality, we run controlled experiments on real Twitter/X
corpora, constructing junk datasets and reverse-selected controls via two orthogonal
operationalizations: M1 (engagement degree) and M2 (semantic quality), with
matched token scale and training operations across conditions. Relative to the
control condition, continual pre-training of four LLMs on the junk dataset causes
non-trivial declines (Hedges' $g>0.3$) in reasoning, long-context
understanding, and safety, and inflates "dark traits" (e.g., psychopathy,
narcissism). Gradually mixing junk and control data also yields a
dose-response decay in cognition: for example, under M1, ARC-Challenge with Chain
of Thought drops $74.9 \rightarrow 57.2$ and RULER-CWE $84.4 \rightarrow 52.3$
as the junk ratio rises from $0\%$ to $100\%$.
Error forensics reveals several key insights. First, we identify
thought-skipping as the primary lesion: models increasingly truncate or skip
reasoning chains, which explains most of the error growth. Second, healing is
only partial: scaling up instruction tuning and clean-data pre-training
improves the degraded cognition yet cannot restore baseline capability,
suggesting persistent representational drift rather than format mismatch.
Finally, we find that a tweet's popularity, a non-semantic metric, is a better
indicator of the Brain Rot effect than its length in M1.
Together, the results provide significant, multi-perspective evidence that data
quality is a causal driver of LLM capability decay, reframing curation for
continual pre-training as a \textit{training-time safety} problem and motivating
routine "cognitive health checks" for deployed LLMs.
Abstract
Recent LLM agents have made heavy use of chain-of-thought reasoning and
function calling. As their capabilities grow, an important question arises: can
such software act not only as a capable problem-solving tool, but as an entity
in its own right, one that can plan, set its own immediate tasks, and reason
toward broader, more ambiguous goals? To study this question, we adopt an open-ended
experimental setting where we augment a pretrained LLM agent with the ability
to generate its own tasks, accumulate knowledge, and interact extensively with
its environment. We study the resulting open-ended agent qualitatively. It can
reliably follow complex multi-step instructions, store and reuse information
across runs, and propose and solve its own tasks, though it remains sensitive
to prompt design, prone to repetitive task generation, and unable to form
self-representations. These findings illustrate both the promise and current
limits of adapting pretrained LLMs toward open-endedness, and point to future
directions for training agents to manage memory, explore productively, and
pursue abstract long-term goals.