Tag: Representation Learning
All the articles with the tag "Representation Learning".
-
MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models
This paper proposes the MMRL and MMRL++ frameworks, which strengthen few-shot adaptation of vision-language models through a shared representation space and a decoupling strategy, and use the parameter-efficient SRRA and PRC mechanisms to improve generalization and training stability, achieving state-of-the-art results on multiple datasets.
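The shared-space idea can be illustrated generically: a small set of learnable tokens lives in a modality-agnostic space and is projected into both the vision and text branches, so only the tokens and projections are trained. A minimal sketch; all names (`SharedRepresentationSpace`, dimensions, the prepend strategy) are illustrative assumptions, not MMRL's actual API.

```python
import torch
import torch.nn as nn

class SharedRepresentationSpace(nn.Module):
    """Hypothetical sketch of the shared-representation-space idea:
    learnable tokens projected into each modality's encoder. Details
    (SRRA, PRC, decoupling) differ in the actual MMRL/MMRL++ papers."""

    def __init__(self, num_tokens=4, dim=512, vision_dim=768, text_dim=512):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(num_tokens, dim) * 0.02)
        self.to_vision = nn.Linear(dim, vision_dim)  # per-modality projection
        self.to_text = nn.Linear(dim, text_dim)

    def forward(self, vision_seq, text_seq):
        b = vision_seq.size(0)
        v_tok = self.to_vision(self.tokens).unsqueeze(0).expand(b, -1, -1)
        t_tok = self.to_text(self.tokens).unsqueeze(0).expand(b, -1, -1)
        # Prepend the shared tokens to each modality's token sequence;
        # the frozen encoders then process the augmented sequences.
        return (torch.cat([v_tok, vision_seq], dim=1),
                torch.cat([t_tok, text_seq], dim=1))

srs = SharedRepresentationSpace()
v, t = srs(torch.randn(2, 50, 768), torch.randn(2, 77, 512))
```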
-
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference
RetroInfer reimagines the KV cache as a vector storage system, using an attention-aware wave index and wave buffer to achieve up to 4.5x speedup over full attention and 10.5x over sparse baselines for long-context LLM inference, while preserving near-full-attention accuracy.
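The wave index and wave buffer are paper-specific structures, but the underlying "KV cache as vector storage" idea can be shown generically: treat cached keys as a vector index, retrieve only the top-k entries most similar to the current query, and attend over that subset. A minimal sketch of that general pattern, not RetroInfer's actual algorithm:

```python
import torch
import torch.nn.functional as F

def topk_retrieval_attention(q, K, V, k=64):
    """Generic stand-in for attention-aware KV retrieval:
    q: (d,) current query; K, V: (n, d) cached keys/values."""
    scores = K @ q / q.size(-1) ** 0.5        # (n,) similarity to the query
    k = min(k, K.size(0))
    idx = scores.topk(k).indices              # retrieve top-k "neighbors"
    weights = F.softmax(scores[idx], dim=-1)  # attend over the subset only
    return weights @ V[idx]                   # (d,) attention output

# Example: 100k cached tokens, but attention touches only 64 of them.
n, d = 100_000, 128
out = topk_retrieval_attention(torch.randn(d), torch.randn(n, d), torch.randn(n, d))
```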
-
Patterns and Mechanisms of Contrastive Activation Engineering
This paper systematically investigates Contrastive Activation Engineering (CAE) for steering LLM behavior at inference time, revealing reliable in-distribution performance with optimal sample sizes around 80-100, but significant challenges in out-of-distribution generalization, model perplexity degradation, and vulnerability to adversarial inputs.
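The core CAE recipe is simple: build a steering vector from the mean difference of hidden activations on contrastive prompt pairs, then add it to the residual stream at inference. A minimal sketch, assuming a helper `get_activations` (e.g. a forward hook on a transformer) that is not part of the paper:

```python
import torch

def build_steering_vector(get_activations, pos_prompts, neg_prompts, layer):
    """Contrastive steering vector: mean activation on positive-behavior
    prompts minus mean on negative ones. `get_activations(prompt, layer)
    -> (d,)` is an assumed helper returning the residual-stream state."""
    pos = torch.stack([get_activations(p, layer) for p in pos_prompts]).mean(0)
    neg = torch.stack([get_activations(p, layer) for p in neg_prompts]).mean(0)
    return pos - neg  # (d,) direction separating the two behaviors

def steer(hidden, steering_vector, alpha=4.0):
    """Applied inside a forward hook at the chosen layer: shift every
    token's residual-stream state along the steering direction."""
    return hidden + alpha * steering_vector
```

The paper's ~80-100 optimal sample size corresponds to the number of contrastive pairs averaged in `build_steering_vector`.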
-
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
This paper proposes the UniME framework, which learns universal multimodal embeddings with multimodal large language models through textual discriminative knowledge distillation and hard-negative-enhanced instruction tuning, improving discriminability and compositionality on downstream tasks.
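Hard-negative-enhanced training reduces to a contrastive loss where each query is scored against its positive plus mined hard negatives. A minimal sketch of that loss as generic InfoNCE with hard negatives, not UniME's exact formulation:

```python
import torch
import torch.nn.functional as F

def hard_negative_infonce(q, pos, hard_negs, temperature=0.05):
    """q: (b, d) query embeddings; pos: (b, d) matching targets;
    hard_negs: (b, m, d) mined negatives that are semantically close
    but wrong. All assumed L2-normalized."""
    pos_sim = (q * pos).sum(-1, keepdim=True)           # (b, 1)
    neg_sim = torch.einsum('bd,bmd->bm', q, hard_negs)  # (b, m)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    # The positive sits at index 0; hard negatives sharpen the decision
    # boundary, which is what improves downstream discriminability.
    labels = torch.zeros(q.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

b, m, d = 8, 4, 256
loss = hard_negative_infonce(F.normalize(torch.randn(b, d), dim=-1),
                             F.normalize(torch.randn(b, d), dim=-1),
                             F.normalize(torch.randn(b, m, d), dim=-1))
```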
-
Intra-Layer Recurrence in Transformers for Language Modeling
This paper proposes Intra-Layer Recurrence (ILR), which selectively loops specific layers (especially early ones) within a single Transformer forward pass, improving language-modeling perplexity without adding parameters; the increased compute cost and limited validation on large models, however, constrain its practicality.
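The mechanism is easy to state in code: assign each block a reuse count and apply it that many times per forward pass, so parameters are shared across repeats while compute grows with the total count. A minimal sketch under a generic layer interface; the class and argument names are assumptions, not the paper's code:

```python
import torch.nn as nn

class ILRTransformer(nn.Module):
    """Sketch of Intra-Layer Recurrence: block i runs reuse_counts[i]
    times per forward pass. Parameter count is unchanged; compute
    scales with sum(reuse_counts)."""

    def __init__(self, blocks, reuse_counts):
        super().__init__()
        assert len(blocks) == len(reuse_counts)
        self.blocks = blocks
        self.reuse_counts = reuse_counts

    def forward(self, x):
        for block, n in zip(self.blocks, self.reuse_counts):
            for _ in range(n):  # recur the same layer n times
                x = block(x)
        return x

# Example: 4 blocks, with early layers recurring more, as the paper favors.
blocks = nn.ModuleList([nn.TransformerEncoderLayer(64, 4, batch_first=True)
                        for _ in range(4)])
model = ILRTransformer(blocks, [3, 2, 1, 1])
```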