Tag: Fine-tuning

All the articles with the tag "Fine-tuning".

Large Vocabulary Size Improves Large Language Models

Published: 5 Jun, 2025 at 11:24 AM

85.40 🤔

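This paper shows experimentally that a larger vocabulary significantly improves the performance of monolingual large language models on English and Japanese tasks, and proposes a simple method for swapping the vocabulary during continual training to adapt to a target language, which further improves model performance.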
Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching

Published: 24 May, 2025 at 11:15 AM

85.32 🤔

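This paper proposes ALM, a cross-tokenizer distillation method that transfers knowledge between different tokenizers via approximate likelihood matching. It is the first to achieve significant results in settings such as subword-to-byte-level transfer, and it outperforms existing methods across several use cases.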
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning

Published: 2 Jun, 2025 at 11:31 AM

85.18 🤔

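This paper proposes the PURE framework, which uses process rewards to improve the reasoning ability of large language models through min-form credit assignment. Experiments show that it performs on par with verifiable-reward methods on mathematical reasoning tasks, and that adding a small amount of ground-truth signal further raises accuracy to 53.3%.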
Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods

Published: 4 Jun, 2025 at 11:59 AM

85.17 🤔

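Through theoretical and empirical analysis, this paper proposes that model ensembling effectively mitigates overadaptation in supervised fine-tuning by balancing the bias-variance trade-off, improving downstream task performance and reducing forgetting of pre-trained knowledge.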
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization

Published: 25 May, 2025 at 11:24 AM

85.17 🤔

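This paper proposes Quantized Zeroth-order Optimization (QZO), which enables zeroth-order fine-tuning of quantized neural networks by perturbing the quantization scale parameters and applying directional derivative clipping. It reduces memory usage by more than 18x and demonstrates significant memory efficiency and some performance gains on LLMs and Stable Diffusion.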
