Tag: Reasoning
All the articles with the tag "Reasoning".
-
Multilingual Performance of a Multimodal Artificial Intelligence System on Multisubject Physics Concept Inventories
This exploratory study evaluates GPT-4o's multilingual and multimodal performance on physics concept inventories, revealing strong results in English and text-based tasks but significant weaknesses in visual interpretation and non-Western languages, highlighting implications for equitable AI integration in education.
-
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning
本文通过理论分析和Re-distillation技术,揭示了小规模SFT在R1风格RL中的效率瓶颈,并以极少样本(<1K)在K&K和MATH数据集上接近RL性能,显著提升了数据效率。
-
1bit-Merging: Dynamic Quantized Merging for Large Language Models
1bit-Merging提出了一种动态模型合并框架,通过1位量化任务向量和任务特定路由,在保持94.53%性能的同时将存储需求降至55.02%,在通用知识、数学推理和代码生成任务上优于传统和动态合并方法。
-
Reward Reasoning Model
本文提出奖励推理模型(RRMs),通过链式推理过程在生成奖励前自适应利用测试时计算资源,在多个奖励建模基准和实际应用中显著提升性能,尤其在复杂推理任务上表现优异。
-
TL;DR: Too Long, Do Re-weighting for Effcient LLM Reasoning Compression
本文提出TLDR方法,通过动态再加权系统1和系统2推理数据,显著压缩大型语言模型的推理输出token数量(约40%),同时在多难度数学任务上基本保持准确性。