Tag: Large Language Model
All the articles with the tag "Large Language Model".
-
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning
This paper proposes the R1-Code-Interpreter framework, which uses supervised fine-tuning and reinforcement learning to train large language models to dynamically generate and execute code, yielding significant accuracy gains across 144 reasoning and planning tasks; R1-CI-14B reaches 64.1%, approaching the performance of GPT-4o + Code Interpreter.
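To make the generate-execute-feedback loop concrete, here is a minimal sketch of a code-interpreter turn loop; the `generate` callable and the fenced-code convention are illustrative assumptions, not the paper's exact interface.

```python
import re
import subprocess
import sys

def run_snippet(code: str) -> str:
    """Run a generated Python snippet in a subprocess and capture its output."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout + result.stderr

def code_interpreter_loop(generate, question: str, max_turns: int = 4) -> str:
    """Alternate model text and code execution until no code block is emitted.

    `generate` is any callable mapping a prompt string to a completion string.
    """
    transcript = question
    for _ in range(max_turns):
        completion = generate(transcript)
        transcript += "\n" + completion
        match = re.search(r"```python\n(.*?)```", completion, re.DOTALL)
        if match is None:  # no code requested: treat the completion as final
            return completion
        # Feed execution results back so the model can reason over them
        transcript += "\nExecution output:\n" + run_snippet(match.group(1))
    return transcript
```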
-
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
This paper proposes the Rodimus and Rodimus+ models, which use a data-dependent tempered selection (DDTS) mechanism and sliding-window shared-key attention (SW-SKA) to substantially reduce the compute and memory complexity of large language models while preserving performance, challenging the usual accuracy-efficiency trade-off.
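As a rough illustration of the attention side only (DDTS is omitted), the sketch below combines a single shared key/value head with a causal sliding-window mask; the shapes and sharing scheme are assumptions for exposition, not Rodimus+'s exact formulation.

```python
import torch
import torch.nn.functional as F

def sliding_window_shared_key_attention(x, w_q, w_k, w_v, num_heads, window):
    """Toy attention: per-head queries, one shared key/value head,
    and a causal sliding-window mask of width `window`."""
    B, T, D = x.shape
    head_dim = D // num_heads
    q = (x @ w_q).view(B, T, num_heads, head_dim).transpose(1, 2)  # (B, H, T, d)
    k = (x @ w_k).view(B, T, 1, head_dim).transpose(1, 2)          # shared key head
    v = (x @ w_v).view(B, T, 1, head_dim).transpose(1, 2)          # shared value head
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5             # (B, H, T, T)
    idx = torch.arange(T)
    # Each position attends only to itself and the previous `window - 1` tokens
    mask = (idx[None, :] <= idx[:, None]) & (idx[None, :] > idx[:, None] - window)
    scores = scores.masked_fill(~mask, float("-inf"))
    out = F.softmax(scores, dim=-1) @ v                            # broadcasts over heads
    return out.transpose(1, 2).reshape(B, T, D)

B, T, D, H = 2, 16, 64, 8
x = torch.randn(B, T, D)
w_q, w_k, w_v = torch.randn(D, D), torch.randn(D, D // H), torch.randn(D, D // H)
y = sliding_window_shared_key_attention(x, w_q, w_k, w_v, num_heads=H, window=4)
```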
-
MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability
This paper proposes the MASKSEARCH framework, which combines a Retrieval-Augmented Mask Prediction (RAMP) pre-training task with supervised fine-tuning and reinforcement learning, substantially improving the agentic search capability of large language models on open-domain multi-hop question answering.
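One plausible shape for a RAMP training example is sketched below; `retrieve` and `salient_spans` are placeholders, since the summary does not pin down how spans are chosen or which retriever is used.

```python
import random

def build_ramp_example(text, salient_spans, retrieve, mask_token="[MASK]"):
    """Mask one salient span and pair the masked text with retrieved evidence,
    so filling the blank requires searching over external documents."""
    span = random.choice(salient_spans)  # e.g. an entity or date found in `text`
    masked = text.replace(span, mask_token, 1)
    docs = retrieve(masked)              # any retriever: BM25, dense, or a web API
    prompt = "Evidence:\n" + "\n".join(docs) + "\n\nFill in " + mask_token + ":\n" + masked
    return {"prompt": prompt, "target": span}
```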
-
Communicating Activations Between Language Model Agents
This paper introduces Activation Communication (AC), a method for inter-LLM communication that exchanges intermediate activations instead of natural language, achieving up to a 27% performance improvement over natural-language baselines with substantially reduced compute across coordination games and reasoning benchmarks.
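A toy version of the idea, assuming activations are communicated by adding one model's hidden state into another's residual stream at a chosen layer; the tiny models and the injection point are illustrative, not the paper's setup.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """A toy transformer stack standing in for an LLM (hypothetical sizes)."""
    def __init__(self, d_model=64, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )

    def forward(self, h, message=None, inject_at=None):
        for i, layer in enumerate(self.layers):
            if message is not None and i == inject_at:
                h = h + message  # receive activations instead of tokens
            h = layer(h)
        return h

sender, receiver = TinyLM(), TinyLM()
x = torch.randn(1, 8, 64)  # (batch, seq, d_model); embeddings assumed given

# Run the sender partway and hand its layer-2 activations to the receiver.
h = x
for layer in sender.layers[:2]:
    h = layer(h)

out = receiver(x, message=h, inject_at=2)
```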
-
Merge to Mix: Mixing Datasets via Model Merging
This paper proposes *Merge to Mix*, which uses model merging as an efficient proxy for selecting dataset mixtures for large-model fine-tuning; on image classification and language tasks it significantly outperforms conventional selection methods and approaches, and in some cases exceeds, Oracle performance.
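The core surrogate step can be sketched as simple parameter averaging over per-dataset fine-tuned checkpoints; the file names and the `evaluate` harness below are hypothetical.

```python
import torch

def merge_models(state_dicts, weights=None):
    """Average the parameters of models fine-tuned on individual datasets;
    the merged model serves as a cheap stand-in for fine-tuning on the mix."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    return {
        key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
        for key in state_dicts[0]
    }

# Hypothetical usage: score a candidate mixture without retraining.
# per_dataset = [torch.load(p) for p in ("ft_math.pt", "ft_code.pt")]
# surrogate = merge_models(per_dataset)
# score = evaluate(surrogate)  # `evaluate` is a placeholder eval harness
```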