Tag: Multi-Agent

All the articles with the tag "Multi-Agent".

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

Published: 7 May, 2025 at 12:11 AM

88.02 🤔

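This paper proposes the StarPO framework and the RAGEN system, which train LLM agents with multi-turn, trajectory-level reinforcement learning. It identifies training instability (such as the Echo Trap) and insufficient reasoning ability as key challenges. StarPO-S improves stability and generalization, but reasoning ability still depends on fine-grained reward design.

EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning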

Published: 7 May, 2025 at 09:32 AM

87.79 🤔

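This paper proposes the EMORL framework, which uses ensemble learning to train single-objective models separately and aggregates them at the hidden-state level, with a hierarchical grid search to optimize the weights. On a counselor reflection generation task, it achieves performance comparable to conventional methods while significantly improving training efficiency, scalability, and interpretability.

MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability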

Published: 30 May, 2025 at 11:19 AM

87.72 🤔

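This paper proposes the MASKSEARCH framework, which combines a Retrieval-Augmented Mask Prediction (RAMP) pre-training task with supervised fine-tuning and reinforcement learning to significantly improve the agentic search capability of large language models on open-domain multi-hop question answering.

Communicating Activations Between Language Model Agents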

Published: 10 May, 2025 at 10:59 AM

87.71 🤔

This paper introduces Activation Communication (AC), a novel method for inter-LLM communication using intermediate activations instead of natural language, achieving up to 27% performance improvement over traditional methods with significantly reduced compute across coordination games and reasoning benchmarks.

AI agents may be worth the hype but not the resources (yet): An initial exploration of machine translation quality and costs in three language pairs in the legal and news domains

Published: 8 May, 2025 at 12:22 AM

86.86 🤔

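This paper empirically evaluates five machine translation paradigms and finds that reasoning-enhanced large language models (such as o1-preview) perform strongly in human evaluation, surpassing conventional NMT, while multi-agent systems, though promising, are limited by high computational cost and inconsistent performance across language pairs.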
