Tag: Fine-tuning
All the articles with the tag "Fine-tuning".
-   Not All Correct Answers Are Equal: Why Your Distillation Source Matters. By distilling 1.89 million reasoning examples from three top-tier large language models, this paper systematically studies how the distillation source affects student-model performance, finding that data distilled from AM-Thinking-v1 significantly improves student models across multiple reasoning benchmarks and exhibits adaptive generation-length behavior.
-   Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization. This paper introduces a fine-tuning strategy for LLMs that leverages the unequal importance of attention matrices and customized learning rates to enhance efficiency, demonstrating through theoretical analysis and experiments on GLUE benchmarks that fine-tuning only Wq and Wv, with higher learning rates for Wv, can match or exceed full fine-tuning performance with fewer parameters.
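The strategy in the entry above (freeze Wk and Wo, update only Wq and Wv, with a larger step size for Wv) can be sketched as follows. All names, shapes, and learning-rate values here are illustrative assumptions, not the paper's actual settings; gradients are mocked.

```python
# Hypothetical sketch: freeze W_k and W_o, update only W_q and W_v,
# with a higher learning rate for W_v. Toy 2x2 matrices, mocked grads.

def sgd_step(W, grad, lr):
    """One plain SGD update: W <- W - lr * grad, elementwise."""
    return [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W, grad)]

# Toy attention weights (assumed names, not from any real checkpoint).
weights = {
    "W_q": [[0.5, -0.2], [0.1, 0.3]],
    "W_k": [[0.4, 0.0], [0.2, -0.1]],   # frozen
    "W_v": [[0.3, 0.6], [-0.4, 0.2]],
    "W_o": [[0.1, 0.1], [0.0, 0.5]],    # frozen
}
grads = {name: [[0.1, 0.1], [0.1, 0.1]] for name in weights}

# Per-matrix learning rates: only W_q and W_v are trainable,
# with a larger step for W_v, in the spirit of the paper.
lrs = {"W_q": 1e-2, "W_v": 5e-2}

for name, lr in lrs.items():
    weights[name] = sgd_step(weights[name], grads[name], lr)
```

In a PyTorch setting the same idea would map onto optimizer parameter groups, one group per weight matrix with its own `lr`.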
-   R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-Task Learning. R-LoRA enhances LoRA's multi-task learning performance through multi-head randomization (including multi-head dropout and random initialization), improving the capture of task-specific knowledge while reducing GPU memory usage and training time.
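For context, here is the vanilla LoRA update that R-LoRA builds on: the frozen weight W is adapted as W + (alpha / r) * B @ A, with A randomly initialized and B zero-initialized so the adapted model starts identical to the base model. R-LoRA's multi-head randomization is layered on top of this; the toy sizes and names below are illustrative only.

```python
# Minimal sketch of the vanilla LoRA update (which R-LoRA extends
# with multiple randomized heads and multi-head dropout).
import random

def matmul(X, Y):
    """Naive matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(xr, yc)) for yc in zip(*Y)]
            for xr in X]

d, r = 4, 2                      # model dim and LoRA rank (toy sizes)
random.seed(0)
A = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(r)]  # r x d, random init
B = [[0.0] * r for _ in range(d)]                                  # d x r, zero init
alpha = 4.0                                                        # LoRA scaling

delta_W = [[(alpha / r) * v for v in row] for row in matmul(B, A)]  # d x d
# Because B starts at zero, delta_W is zero at initialization and the
# adapted model matches the frozen base model exactly.
```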
-   ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models. This paper proposes the ALPS algorithm, which uses a weight-distribution-based parameter alignment distribution score (sPAD) to locate task-sensitive attention heads and prune the rest; by updating only 10% of the attention parameters, it improves performance on general, math, and code tasks while demonstrating head transferability and mitigating knowledge forgetting.
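The head-localization step described above can be sketched as a simple top-k selection. The per-head scores here are arbitrary mock numbers standing in for the paper's sPAD, which is computed from weight distributions; head names and layer counts are hypothetical.

```python
# Illustrative sketch of ALPS-style head selection: given a salience
# score per attention head (mocking the paper's sPAD), keep only the
# top 10% of heads trainable and freeze the rest.
head_scores = {f"layer{l}.head{h}": (l * 7 + h * 3) % 13
               for l in range(4) for h in range(5)}   # 20 mock heads

k = max(1, int(0.10 * len(head_scores)))              # top 10% of 20 -> 2 heads
trainable = set(sorted(head_scores, key=head_scores.get, reverse=True)[:k])
frozen = set(head_scores) - trainable                 # pruned from updating
```

Downstream, only parameters belonging to heads in `trainable` would receive gradient updates.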
-   Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation. This paper proposes Recall with Reasoning (RwR), a method that enhances Mamba's long-context memory and extrapolation by distilling chain-of-thought summarization from a teacher model, achieving significant performance improvements on the LONGMEMEVAL and HELMET benchmarks while preserving short-context capabilities.