Tag: Optimization
All the articles with the tag "Optimization".
- Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
This paper introduces a fine-tuning strategy for LLMs that exploits the unequal importance of the attention matrices and uses customized learning rates to improve efficiency. Theoretical analysis and experiments on the GLUE benchmark show that fine-tuning only Wq and Wv, with a higher learning rate for Wv, can match or exceed full fine-tuning performance while updating fewer parameters.