Tag: Multimodal Systems
All the articles with the tag "Multimodal Systems".
-
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
本文提出R1-Reward,通过StableReinforce算法将强化学习应用于多模态奖励模型训练,显著提升了性能并在多个基准测试中超越现有最优模型,同时展示了优异的数据效率和测试时扩展性。
-
SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning
This paper introduces SEFE, a method combining Answer Style Diversification (ASD) to mitigate superficial forgetting and RegLoRA to address essential forgetting in Multimodal Continual Instruction Tuning, achieving state-of-the-art performance on the CoIN benchmark.
-
LSAQ: Layer-Specific Adaptive Quantization for Large Language Model Deployment
LSAQ introduces a novel Layer-Specific Adaptive Quantization system for LLMs, using Jaccard similarity to assess layer importance and dynamically adjusting quantization precision based on edge device resources, achieving superior accuracy on zero-shot tasks and lower perplexity compared to baseline methods while enabling efficient deployment.
-
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
本文系统综述了基于强化学习的推理方法在多模态大语言模型(MLLMs)中的进展,分析了算法设计、奖励机制及应用,揭示了跨模态推理和奖励稀疏性等挑战,并提出了分层奖励和交互式RL等未来方向。
-
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning
ARTIST, a novel framework unifying agentic reasoning, reinforcement learning, and tool integration, enables LLMs to autonomously orchestrate external tools within multi-turn reasoning, achieving up to 22% accuracy gains on complex math tasks and significant improvements in multi-turn function calling over baselines.