An Advent of Thought

A starting point for making sense of task structure (in machine learning)

Toward A Mathematical Framework for Computation in Superposition

Decomposing Activations into Features: How Many and How do we Find Them? — A Survey

Searching for a model’s concepts by their shape – a theoretical framework

See my LessWrong profile and Google Scholar page for more.