Tag: Fine-tuning
All the articles with the tag "Fine-tuning".
-
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging
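This paper uses model merging to combine fast-thinking and slow-reasoning capabilities, achieving long-to-short reasoning: on 7B models it compresses response length by up to 55% while preserving performance, offering an efficient solution to the overthinking problem in large language models.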
-
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning
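Through theoretical analysis and a Re-distillation technique, this paper reveals the efficiency bottleneck of small-scale SFT in R1-style RL. With very few samples (<1K), it approaches RL performance on the K&K and MATH datasets, significantly improving data efficiency.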
-
The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs
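Using a modular approach that exploits the separability of large language model parameters for mathematical reasoning and multilingual capabilities, this paper proposes strategies such as Layer-Swapping that significantly outperform non-modular baselines in cross-lingual transfer to low-resource languages, performing best in data-constrained settings.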
-
Gameplay Highlights Generation
This paper presents a method to generate gameplay highlight reels by finetuning the X-CLIP multimodal model on an in-house FPS game dataset, achieving over 90% event detection accuracy and demonstrating transfer learning, while optimizing deployment through quantization.
-
TL;DR: Too Long, Do Re-weighting for Effcient LLM Reasoning Compression
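This paper proposes the TLDR method, which dynamically re-weights System 1 and System 2 reasoning data to significantly reduce the number of tokens in large language model reasoning outputs (by about 40%) while largely maintaining accuracy on math tasks of varying difficulty.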