Tag: Large Language Model

All the articles with the tag "Large Language Model".

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Published: 8 May, 2025 at 10:22 AM

97.91 😐

Insight-V introduces a scalable data generation pipeline and a multi-agent system with iterative DPO training to significantly enhance long-chain visual reasoning in MLLMs, achieving up to 7.0% performance gains on challenging benchmarks while maintaining perception capabilities.

A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?

Published: 6 May, 2025 at 11:19 PM

90.65 😐

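This paper proposes a four-dimensional taxonomy (what to scale, how to scale, where to scale, and how well to scale) and uses it to systematically survey research on test-time scaling (TTS) in large language models, offering a structured perspective and practical guidance for understanding and applying compute scaling at inference time.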

Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs

Published: 6 May, 2025 at 01:18 AM

89.54 😐

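Through an empirical study, this paper finds that large language models tend to "overthink" easy problems and "underthink" hard ones on reasoning tasks, that reasoning length relates to correctness non-monotonically, and that simply preferring shorter responses can substantially reduce generation length while preserving accuracy.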

Weight Ensembling Improves Reasoning in Language Models

Published: 6 May, 2025 at 01:27 AM

88.15 😐

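This paper finds that supervised fine-tuning causes a diversity collapse in reasoning models that harms Pass@K, and proposes interpolating early and late SFT checkpoints (WiSE-FT), which effectively restores diversity, improves both Pass@1 and Pass@K, and in turn improves test-time scaling and reinforcement learning results.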
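
To make the checkpoint interpolation idea concrete, here is a minimal sketch of linear weight interpolation between two checkpoints. It is not the paper's code: the function name `interpolate_checkpoints`, the mixing coefficient `alpha`, and the plain-dict checkpoint format are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def interpolate_checkpoints(early: StateDict, late: StateDict,
                            alpha: float = 0.5) -> StateDict:
    """Return (1 - alpha) * early + alpha * late for every parameter.

    alpha = 0 reproduces the early checkpoint, alpha = 1 the late one.
    """
    if early.keys() != late.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged: StateDict = {}
    for name in early:
        if len(early[name]) != len(late[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * e + alpha * l
            for e, l in zip(early[name], late[name])
        ]
    return merged


if __name__ == "__main__":
    early_ckpt = {"w": [0.0, 2.0]}
    late_ckpt = {"w": [4.0, 6.0]}
    print(interpolate_checkpoints(early_ckpt, late_ckpt, alpha=0.5))
```

With real models the same arithmetic would be applied tensor by tensor; the value of `alpha` is a tuning choice that this sketch does not try to settle.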

ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Published: 4 May, 2025 at 04:30 PM

86.83 😐

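This paper proposes the ParamΔ method, which transfers post-training knowledge to a new base model at zero cost by directly adding the parameter difference, achieving performance comparable to conventional post-training.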
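
As a rough illustration of adding a parameter difference, the sketch below assumes the difference is taken between a post-trained model and the base it was trained from, and is then added to a new base with the same architecture. The function name `apply_param_delta` and the plain-dict checkpoint format are hypothetical, not taken from the paper.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def apply_param_delta(post_trained: StateDict, old_base: StateDict,
                      new_base: StateDict) -> StateDict:
    """Return new_base + (post_trained - old_base) for every parameter.

    Assumes all three checkpoints share parameter names and shapes.
    """
    if not (post_trained.keys() == old_base.keys() == new_base.keys()):
        raise ValueError("checkpoints must share the same parameter names")
    merged: StateDict = {}
    for name in new_base:
        p, b, nb = post_trained[name], old_base[name], new_base[name]
        if not (len(p) == len(b) == len(nb)):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # The delta (p - b) captures what post-training changed.
        merged[name] = [n + (pi - bi) for pi, bi, n in zip(p, b, nb)]
    return merged


if __name__ == "__main__":
    old_base = {"w": [1.0, 2.0]}
    post_trained = {"w": [1.5, 1.0]}
    new_base = {"w": [10.0, 20.0]}
    print(apply_param_delta(post_trained, old_base, new_base))
```

No gradient steps are involved, which is what makes the transfer cheap; whether it works well in practice depends on how compatible the two base models are.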
