Tag: Efficiency
All the articles with the tag "Efficiency".
-
Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting
本文提出难度感知提示(DAP)方法,通过动态调整推理轨迹长度构建精简的LiteCoT数据集(100K样本,平均720token),训练的Liter模型在多个推理基准上显著优于传统长CoT方法,同时大幅降低训练和推理成本。
-
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
本文提出R1-Reward,通过StableReinforce算法将强化学习应用于多模态奖励模型训练,显著提升了性能并在多个基准测试中超越现有最优模型,同时展示了优异的数据效率和测试时扩展性。
-
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
本文提出DPE,一种无需训练的长文本外推方法,通过检测RoPE不同维度组的有效相对距离并识别关键维度,有选择地调整这些关键维度的位置索引,显著扩展了LLM的上下文窗口并提升了长文本任务性能。
-
Discriminative Finetuning of Generative Large Language Models without Reward Models and Human Preference Data
本文提出判别式微调(DFT)框架,通过判别式概率模型优化大型语言模型的输出概率,无需人类偏好数据或奖励模型,在数学推理和通用语言任务上显著优于SFT并与SFT→PO方法相当。
-
Don't be lazy: CompleteP enables compute-efficient deep transformers
This paper introduces CompleteP, a parameterization for transformers with α = 1, which ensures depth-wise hyperparameter transfer and complete feature learning, achieving 12-34% compute efficiency improvements and enabling a wider range of compute-optimal width-to-depth ratios.