Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, & Weizhu Chen (2021). LoRA: Low-Rank Adaptation of Large Language Models.
arXiv.
DOI: https://doi.org/10.48550/arXiv.2106.09685
Abstract. Introduces LoRA, which freezes the pretrained weights and injects small trainable low-rank decomposition matrices alongside each weight matrix of interest, reducing the number of trainable parameters by up to four orders of magnitude (10,000x on GPT-3 175B) and GPU memory use by 3x while matching or exceeding full fine-tuning quality.
Tags: efficiency fine-tuning lora
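A minimal PyTorch sketch of the mechanism, not the paper's reference implementation: the pretrained weight W is frozen, and only the two rank-r factors A and B are trained, so the effective update is ΔW = BA. The class name `LoRALinear` and the hyperparameters `r=8`, `alpha=16` here are illustrative choices, not values from the paper.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen dense layer h = Wx plus a trainable low-rank update (BA)x."""

    def __init__(self, in_features: int, out_features: int,
                 r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight

        # A gets a small Gaussian init, B starts at zero, so ΔW = BA is zero
        # at the start of training and the model initially behaves exactly
        # like the pretrained one.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r  # constant scaling of the low-rank update

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


# Only the r*(in + out) LoRA parameters receive gradients.
layer = LoRALinear(1024, 1024, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 16384, vs. 1048576 for the full 1024x1024 weight matrix
```

Because ΔW = BA can be merged into W after training (W + BA), inference incurs no extra latency compared to the fully fine-tuned model.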