Tag: Multimodal Systems

All the articles with the tag "Multimodal Systems".

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Published: 7 May, 2025 at 08:43 AM

82.56 🤔

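This paper proposes R1-Reward, which applies reinforcement learning to multimodal reward model training through the StableReinforce algorithm, significantly improving performance and surpassing existing state-of-the-art models on multiple benchmarks, while demonstrating strong data efficiency and test-time scalability.

SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning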

Published: 11 May, 2025 at 11:16 AM

79.36 🤔

This paper introduces SEFE, a method combining Answer Style Diversification (ASD) to mitigate superficial forgetting and RegLoRA to address essential forgetting in Multimodal Continual Instruction Tuning, achieving state-of-the-art performance on the CoIN benchmark.

LSAQ: Layer-Specific Adaptive Quantization for Large Language Model Deployment

Published: 13 May, 2025 at 11:21 AM

78.95 🤔

LSAQ introduces a novel Layer-Specific Adaptive Quantization system for LLMs, using Jaccard similarity to assess layer importance and dynamically adjusting quantization precision based on edge device resources, achieving superior accuracy on zero-shot tasks and lower perplexity compared to baseline methods while enabling efficient deployment.

Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models

Published: 7 May, 2025 at 08:42 AM

78.41 🤔

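This paper systematically surveys progress in RL-based reasoning methods for multimodal large language models (MLLMs), analyzing algorithm design, reward mechanisms, and applications, highlighting challenges such as cross-modal reasoning and reward sparsity, and proposing future directions including hierarchical rewards and interactive RL.

Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning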

Published: 13 May, 2025 at 11:12 AM

76.49 🤔

ARTIST, a novel framework unifying agentic reasoning, reinforcement learning, and tool integration, enables LLMs to autonomously orchestrate external tools within multi-turn reasoning, achieving up to 22% accuracy gains on complex math tasks and significant improvements in multi-turn function calling over baselines.
