Thinking Like Transformers

Paper-reading notes: RASP
December 7, 2025 · 273 words

It’s All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

Paper-reading notes: MIRAS
December 6, 2025 · 923 words

FNet: Mixing Tokens with Fourier Transforms

Paper-reading notes: FNet
December 5, 2025 · 470 words

Linformer: Self-Attention with Linear Complexity

Paper-reading notes: Linformer
December 4, 2025 · 236 words

Rethinking Attention with Performers

Paper-reading notes: Performers
December 3, 2025 · 499 words

On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning

Paper-reading notes: On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
December 1, 2025 · 462 words

What Formal Languages Can Transformers Express? A Survey

Paper-reading notes: What Formal Languages Can Transformers Express? A Survey
November 30, 2025 · 327 words

ATLAS: Learning to Optimally Memorize the Context at Test Time

Paper-reading notes: ATLAS
November 29, 2025 · 628 words

Solving olympiad geometry without human demonstrations

Paper-reading notes: AlphaGeometry
November 28, 2025 · 522 words

Formal Mathematical Reasoning: A New Frontier in AI

Paper-reading notes: Formal Mathematical Reasoning: A New Frontier in AI
November 27, 2025 · 347 words