Gluing Local Contexts into Global Meaning: A Sheaf-Theoretic Decomposition of Transformer Representations

Bryce Grant, Peng Wang
Case Western Reserve University
ICLR 2026 Workshop on Unifying Concept Representation Learning (UCRL)

Abstract

We decompose transformer activations into content-stable (H0) and context-dependent (H1) subspaces using sheaf cohomology. A cellular sheaf built over paraphrase graphs yields a Laplacian whose spectral structure separates phrasing-invariant directions from maximally varying ones; the construction requires no concept labels or supervised training. Across five models (124M–13B parameters), H1 dimensions exert 3.5–26.5× greater causal influence on model output than variance-matched controls (Cohen's d = 2.3–14.3), H0 retrieves facts at 60–68% accuracy using only 20 dimensions, and the two subspaces produce opposite effects under ablation.

The decomposition also reveals architecture-dependent fragility: Llama-2-7B collapses under random perturbation (4.2% fact preservation) while all directed methods preserve facts at 12–14% (p < 10⁻¹⁰, n = 1000); with architecture-specific restriction maps this gap widens to 31.0% vs. 4.2% (p < 10⁻⁵⁰). Robust models tolerate both perturbation types. Sheaf H0 outperforms LEACE concept erasure by nearly 2× on fact retrieval, and persistent homology shows that topological complexity in transformer activations emerges layer-dependently, with deeper layers of larger models exhibiting more persistent H1 structure than random baselines.
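
The following is a minimal sketch of the spectral decomposition described above, assuming a toy paraphrase graph with identity restriction maps (a constant sheaf); the function name sheaf_laplacian, the example graph, and the subspace sizes are illustrative assumptions, not the authors' implementation.

import numpy as np

def sheaf_laplacian(edges, restrictions, n_nodes, d):
    """Assemble the cellular sheaf Laplacian for d-dimensional vertex stalks.

    edges: list of (u, v) vertex pairs in the paraphrase graph
    restrictions: dict mapping (edge_index, vertex) -> (d, d) restriction map
    """
    L = np.zeros((n_nodes * d, n_nodes * d))
    for e, (u, v) in enumerate(edges):
        Fu = restrictions[(e, u)]  # restriction map from vertex u onto edge e
        Fv = restrictions[(e, v)]  # restriction map from vertex v onto edge e
        # Each edge contributes the standard sheaf-Laplacian block structure.
        L[u*d:(u+1)*d, u*d:(u+1)*d] += Fu.T @ Fu
        L[v*d:(v+1)*d, v*d:(v+1)*d] += Fv.T @ Fv
        L[u*d:(u+1)*d, v*d:(v+1)*d] -= Fu.T @ Fv
        L[v*d:(v+1)*d, u*d:(u+1)*d] -= Fv.T @ Fu
    return L

# Toy setup: three prompts that paraphrase each other, 8-dimensional activations.
d, n_nodes = 8, 3
edges = [(0, 1), (1, 2), (0, 2)]
# Identity restriction maps give a constant sheaf; the paper's maps may differ.
restrictions = {(e, v): np.eye(d) for e, pair in enumerate(edges) for v in pair}

L = sheaf_laplacian(edges, restrictions, n_nodes, d)
eigvals, eigvecs = np.linalg.eigh(L)

# Near-zero eigenvalues span the harmonic space (global sections): the
# phrasing-invariant, content-stable directions ("H0" in the abstract).
h0_basis = eigvecs[:, eigvals < 1e-8]
# The largest eigenvalues mark maximally varying, context-dependent directions,
# used here as an illustrative "H1-like" subspace (k = 4 is arbitrary).
h1_basis = eigvecs[:, -4:]

With identity restriction maps the sheaf Laplacian reduces to the graph Laplacian tensored with the identity, so the harmonic space recovers exactly the directions on which all paraphrases agree; non-identity (e.g., architecture-specific) restriction maps change which directions can be glued into global sections.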

BibTeX

@inproceedings{grant2026gluing,
  title     = {Gluing Local Contexts into Global Meaning: A Sheaf-Theoretic Decomposition of Transformer Representations},
  author    = {Bryce Grant and Peng Wang},
  booktitle = {ICLR 2026 Workshop on Unifying Concept Representation Learning},
  year      = {2026},
  url       = {https://openreview.net/forum?id=eub5YrhExo}
}