Tag: Efficiency
All the articles with the tag "Efficiency".
-
LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
This paper proposes the LoRE-Merging framework, which uses low-rank estimation to construct an approximate base model and task vectors, enabling model merging without access to the original base model and outperforming conventional methods on multiple benchmark datasets.
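A minimal sketch of the idea, assuming only the fine-tuned weights are available: the unknown base is approximated from the fine-tuned models themselves, and each implicit task vector is truncated to low rank before merging. The function `lore_merge`, the averaging step, and the rank choice are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of low-rank merging in the spirit of LoRE-Merging.
# Assumption: the base is approximated as the mean of fine-tuned weights,
# and each task vector is estimated at rank r via truncated SVD.
import numpy as np

def lore_merge(finetuned_weights, rank=8):
    """Merge per-task weight matrices without the original base model."""
    # Approximate the unobserved base model as the average of the weights.
    base = np.mean(finetuned_weights, axis=0)
    merged = base.copy()
    for W in finetuned_weights:
        delta = W - base                                    # implicit task vector
        U, s, Vt = np.linalg.svd(delta, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]  # rank-r estimate
        merged += low_rank / len(finetuned_weights)
    return merged

# Toy usage: three "fine-tuned" matrices perturbed around a hidden base.
rng = np.random.default_rng(0)
hidden_base = rng.normal(size=(64, 64))
models = [hidden_base + rng.normal(scale=0.01, size=(64, 64)) for _ in range(3)]
print(np.linalg.norm(lore_merge(models) - hidden_base))  # small reconstruction error
```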
-
Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models
This paper introduces Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning (LS-Mixture SFT), which combines long and short CoT datasets to fine-tune non-reasoning LLMs, achieving a 2.3% average accuracy improvement and a 47.61% reduction in response length on reasoning benchmarks.
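A minimal sketch of building such a mixture, assuming the core idea of pairing full reasoning traces with shortened counterparts; the mixing ratio, the `shorten_fn` hook, and the toy shortener below are illustrative assumptions rather than the paper's recipe.

```python
# Hedged sketch of a long/short CoT data mixture for SFT.
import random

def build_ls_mixture(long_cot_examples, shorten_fn, long_fraction=0.5, seed=0):
    """Return an SFT dataset mixing long traces with shortened versions.

    long_cot_examples: list of dicts with "prompt" and "reasoning" fields.
    shorten_fn: maps a long reasoning trace to a compressed one.
    """
    rng = random.Random(seed)
    mixture = []
    for ex in long_cot_examples:
        keep_long = rng.random() < long_fraction
        reasoning = ex["reasoning"] if keep_long else shorten_fn(ex["reasoning"])
        mixture.append({"prompt": ex["prompt"], "target": reasoning})
    rng.shuffle(mixture)
    return mixture

# Toy shortener (assumption): keep only the first and last step of a trace.
def keep_ends(trace):
    steps = trace.split("\n")
    return "\n".join(steps[:1] + steps[-1:]) if len(steps) > 2 else trace

data = [{"prompt": "2+2*3?",
         "reasoning": "Multiply first: 2*3=6.\nThen add: 2+6=8.\nAnswer: 8"}]
print(build_ls_mixture(data, keep_ends))
```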
-
GCN-Based Throughput-Oriented Handover Management in Dense 5G Vehicular Networks
This paper introduces TH-GCN, a Graph Convolutional Network-based approach for handover management in dense 5G vehicular networks, which models dynamic network conditions to reduce handovers by up to 78% and improve signal quality and throughput through real-time, topology-aware decisions.
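A minimal sketch of the GCN building block behind such an approach, assuming a toy graph where a vehicle node is connected to candidate base stations; the graph, node features, weights, and the similarity-based decision head are illustrative assumptions, not TH-GCN's actual architecture.

```python
# Hedged sketch: one symmetric-normalized GCN layer scoring handover targets.
import numpy as np

def gcn_layer(A, H, W):
    """GCN propagation: relu(D^-1/2 (A + I) D^-1/2 @ H @ W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Toy graph: node 0 is the vehicle, nodes 1-3 are candidate gNBs.
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]], dtype=float)
H = np.array([[0.9, 0.3],                     # toy features, e.g. SINR, load
              [0.7, 0.1],
              [0.8, 0.25],
              [0.5, 0.05]])
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))                   # untrained weights for the sketch

embeddings = gcn_layer(A, H, W)
# Stand-in decision head: hand over to the gNB whose embedding best matches
# the vehicle's embedding.
scores = embeddings[1:] @ embeddings[0]
print("handover target: gNB", int(np.argmax(scores)) + 1)
```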
-
Beyond Output Matching: Bidirectional Alignment for Enhanced In-Context Learning
This paper proposes Bidirectional Alignment (BiAlign), which aligns the token-level output distributions and input preferences of the student model with those of the teacher model, substantially improving the student's in-context learning capability and outperforming baselines across a variety of tasks.
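A minimal sketch of the token-level distribution alignment half of this idea, assuming logit access to both models; the KL-based loss is the standard distillation-style formulation, and the function names and toy data are ours.

```python
# Hedged sketch: token-level KL alignment between student and teacher.
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def token_alignment_loss(student_logits, teacher_logits):
    """Mean KL(teacher || student) over all token positions."""
    p_t = softmax(teacher_logits)                  # (seq_len, vocab)
    log_p_s = np.log(softmax(student_logits))
    log_p_t = np.log(p_t)
    return float((p_t * (log_p_t - log_p_s)).sum(axis=-1).mean())

# Toy check: the loss shrinks as student logits approach the teacher's.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(5, 100))                # 5 positions, vocab of 100
student_far = rng.normal(size=(5, 100))
student_near = teacher + rng.normal(scale=0.1, size=(5, 100))
print(token_alignment_loss(student_far, teacher))   # larger
print(token_alignment_loss(student_near, teacher))  # smaller
```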
-
ZeroTuning: Unlocking the Initial Token's Power to Enhance Large Language Models Without Training
ZeroTuning proposes a training-free method that adjusts the attention distribution over a large language model's initial token, delivering significant performance gains on text classification, question answering, and multi-turn dialogue tasks while remaining robust to resource constraints and long contexts.
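A minimal sketch of training-free initial-token attention adjustment, assuming a single scalar applied to the pre-softmax scores toward position 0; the scaling rule and the parameter `gamma` are illustrative assumptions, not the paper's exact mechanism.

```python
# Hedged sketch: scaled dot-product attention with the initial token's
# attention logits re-weighted, requiring no training.
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_initial_token_scaling(Q, K, V, gamma=1.5):
    """Standard attention, but attention logits toward token 0 are
    multiplied by gamma before the softmax (illustrative rule)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)      # (seq, seq) attention logits
    scores[:, 0] *= gamma              # re-weight attention to the first token
    return softmax(scores) @ V

# Toy usage on a 4-token sequence with 8-dim heads.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = attention_with_initial_token_scaling(Q, K, V, gamma=2.0)
print(out.shape)  # (4, 8)
```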