Tag: Representation Learning
All the articles with the tag "Representation Learning".
-
When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars
This paper uses synthetic data generated from context-free grammars to study the effect of metadata conditioning in language model pre-training, finding that it helps tasks with long prompts but hurts tasks with short prompts, revealing a trade-off in inferring latent semantics.
-
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
This paper introduces Gaussian Concept Subspace (GCS), a framework that models concept representations in LLMs as Gaussian distributions rather than single vectors, demonstrating improved robustness, faithfulness, and plausibility over single-vector methods, with effective application to emotion-steering tasks.
-
Training Plug-n-Play Knowledge Modules with Deep Context Distillation
This paper proposes training plug-and-play knowledge modules with deep context distillation, enabling efficient integration of document knowledge in low-data settings; experiments show it outperforms conventional methods on question-answering tasks and is complementary to RAG.
-
Contextures: Representations from Contexts
This paper introduces contexture theory, which unifies representation learning across paradigms by targeting the top singular functions of a context-induced expectation operator. It shows that learned neural representations align closely with these functions and proposes a task-agnostic metric for context evaluation that correlates strongly with downstream performance across a range of datasets.
-
HINT: Hypernetwork Approach to Training Weight Interval Regions in Continual Learning
HINT proposes a continual learning framework that applies interval arithmetic in the embedding space and uses a hypernetwork to generate target-network weights. It achieves better scalability and non-forgetting guarantees than InterContiNet and outperforms several benchmarks, though it struggles with more complex datasets.