Tag: Scaling Laws

All the articles with the tag "Scaling Laws".

Vectors from Larger Language Models Predict Human Reading Time and fMRI Data More Poorly when Dimensionality Expansion is Controlled

Published: 23 May, 2025 at 11:10 AM

85.71 🤔

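By controlling for dimensionality expansion, this paper finds that the contribution of training to predicting human reading times and brain imaging data shrinks as large language models (LLMs) grow, revealing a potential misalignment between these models and human sentence processing mechanisms.

When More is Less: Understanding Chain-of-Thought Length in LLMs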

Published: 30 May, 2025 at 11:22 AM

85.45 🤔

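Through theoretical analysis, controlled experiments, and real-world observations, this paper shows that Chain-of-Thought (CoT) length has an inverted-U relationship with reasoning performance, proposes a scaling law in which the optimal length varies with increasing task difficulty and model capability, and shows that training and inference strategies based on the optimal length yield significant performance gains.

Explaining Context Length Scaling and Bounds for Language Models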

Published: 26 May, 2025 at 11:40 AM

85.38 🤔

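From an intrinsic space perspective, this paper proposes a theoretical framework that explains how context length affects language model loss, derives an optimal context length that depends on dataset size, and validates its hypotheses with experiments on natural language and synthetic data.

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon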

Published: 10 May, 2025 at 10:59 AM

85.36 🤔

This paper introduces a taxonomy of language model memorization into recitation, reconstruction, and recollection, demonstrating through experiments with Pythia models that different factors influence each category, with a taxonomy-based predictive model outperforming baselines in predicting memorization likelihood.

Scalable Complexity Control Facilitates Reasoning Ability of LLMs

Published: 3 Jun, 2025 at 11:29 AM

85.16 🤔

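This paper controls the complexity of large language models by adjusting the initialization rate and weight decay coefficient, significantly improving reasoning ability, especially on mathematical tasks, and showing better scaling-law behavior.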
