Tag: Reasoning
All the articles with the tag "Reasoning".
-
Activation-Guided Consensus Merging for Large Language Models
本文提出Activation-Guided Consensus Merging (ACM),通过基于激活值互信息(MI)的层级权重系数调整,实现大型语言模型在Long-to-Short推理任务中的高效合并,显著减少输出冗余并提升推理精度,尤其在小规模模型上效果明显。
-
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs
本文提出Meta-RFFT框架,通过多任务规则跟随预训练和少量下游适应,显著提升了大型语言模型在未见任务上的长度泛化能力,32B模型在长度30的加法任务上达到98%准确率,超越现有长链推理模型。
-
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
本文提出REARANK,一种基于强化学习的列表式重排序代理,通过显式推理和数据增强,仅用179个标注查询即在多个信息检索基准上显著超越基线并媲美甚至超越GPT-4,尤其在推理密集型任务中表现突出。
-
LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications
LiteWebAgent is an open-source suite for VLM-based web agents that bridges the gap in production-ready solutions by offering an extensible framework with decoupled action generation and grounding, advanced planning, memory, tree search, and practical deployments via Vercel and Chrome extension.
-
Thinking Out Loud: Do Reasoning Models Know When They're Right?
本文通过对比指令微调、监督微调和强化学习训练的大型推理模型,发现推理导向训练显著提升了推理任务中的准确性和校准能力,但在事实性任务中可能削弱小规模模型对知识边界的感知。