1991–, Computer scientist
Albert Gu is an American computer scientist whose 2021 paper Efficiently Modeling Long Sequences with Structured State Spaces (S4) and 2023 paper Mamba: Linear-Time Sequence Modeling with Selective State Spaces (with Tri Dao) introduced the modern state-space model (SSM) approach to sequence modelling. Mamba combines structured state-space models with an input-dependent selection mechanism. It reports language-modelling quality comparable to Transformers while its compute scales linearly with sequence length, whereas Transformer self-attention scales quadratically.
Gu completed his PhD at Stanford under Christopher Ré in 2023 and joined the faculty of Carnegie Mellon University. State-space models are the most serious architectural alternative to Transformers to emerge in the past decade; Mamba and its successors are now widely used in production systems and remain an active area of research.
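The linear-time claim is easiest to see in a toy recurrence. The following is a minimal sketch, not the Mamba implementation: the function name `selective_scan`, the projection matrices `W_B`, `W_C` and `W_dt`, and the simple exponential discretisation are assumptions chosen for illustration, and the real model uses a hardware-aware parallel scan rather than a Python loop. What it does show is the key idea: the step size and the B and C matrices are computed from the current input (the "selection"), and the sequence is processed in a single pass with a fixed-size hidden state.

```python
import numpy as np

def selective_scan(x, A, W_B, W_C, W_dt):
    """Toy selective state-space recurrence (illustrative only).

    x:    (L, D) input sequence, L steps of D channels
    A:    (D, N) negative reals; a diagonal state matrix per channel
    W_B:  (D, N) hypothetical projection producing the input-dependent B_t
    W_C:  (D, N) hypothetical projection producing the input-dependent C_t
    W_dt: (D, D) hypothetical projection producing the per-channel step size
    """
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))           # hidden state: fixed size, independent of L
    y = np.empty((L, D))
    for t in range(L):             # one pass over the sequence: O(L) steps
        # Selection: these parameters depend on the current input x[t].
        dt = np.log1p(np.exp(x[t] @ W_dt))    # softplus keeps step sizes positive, (D,)
        B_t = x[t] @ W_B                      # (N,)
        C_t = x[t] @ W_C                      # (N,)
        # Discretise the continuous-time system with the input-dependent step.
        A_bar = np.exp(dt[:, None] * A)       # (D, N), entries in (0, 1)
        B_bar = dt[:, None] * B_t[None, :]    # (D, N)
        # State update and readout.
        h = A_bar * h + B_bar * x[t][:, None]
        y[t] = h @ C_t                        # (D,)
    return y

rng = np.random.default_rng(0)
L, D, N = 16, 4, 8
x = rng.standard_normal((L, D))
A = -np.exp(rng.standard_normal((D, N)))      # negative, so the state decays
W_B = rng.standard_normal((D, N))
W_C = rng.standard_normal((D, N))
W_dt = rng.standard_normal((D, D))
y = selective_scan(x, A, W_B, W_C, W_dt)
```

Each step costs O(D·N) work regardless of how many tokens came before, so the total cost grows linearly in L. Self-attention, by contrast, compares every token with every earlier token.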
Related people: Tri Dao
Works cited in this book:
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces (2024) (with Tri Dao)
Discussed in:
- Chapter 13: Attention & Transformers