RT Series

Paper-reading notes: RT-1 and RT-2
January 9, 2026 · 1731 words

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Paper-reading notes: Diffusion Policy
December 28, 2025 · 672 words

Synthesizer: Rethinking Self-Attention for Transformer Models

Paper-reading notes: Synthesizer
December 16, 2025 · 244 words

Learning Transformer Programs

Paper-reading notes: Learning Transformer Programs
December 15, 2025 · 339 words

Reformer: The Efficient Transformer

Paper-reading notes: Reformer
December 14, 2025 · 287 words

OpenVLA: An Open-Source Vision-Language-Action Model

Paper-reading notes: OpenVLA
December 12, 2025 · 312 words

Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning

Paper-reading notes: Bayesian Optimization
December 10, 2025 · 864 words

Random Search for Hyper-Parameter Optimization

Paper-reading notes: Random Search for Hyper-Parameter Optimization
December 10, 2025 · 774 words

ALTA: Compiler-Based Analysis of Transformers

Paper-reading notes: ALTA
December 9, 2025 · 720 words

Tracr: Compiled Transformers as a Laboratory for Interpretability

Paper-reading notes: Tracr
December 8, 2025 · 59 words