Efficiently Modeling Long Sequences with Structured State Spaces

Paper-reading notes: S4
November 11, 2025 · 930 words

Retentive Network: A Successor to Transformer for Large Language Models

Paper-reading notes: RetNet
November 11, 2025 · 472 words

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Paper-reading notes: Mixture-of-Recursions
November 9, 2025 · 1312 words

xLSTM: Extended Long Short-Term Memory

Paper-reading notes: xLSTM
October 28, 2025 · 1394 words

RWKV: Reinventing RNNs for the Transformer Era

Paper-reading notes: RWKV
October 27, 2025 · 1499 words