Posts
All the articles I've posted.
-
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations
Video Prediction Policy (VPP) is a generalist robot policy that uses predictive visual representations from fine-tuned video diffusion models to learn implicit inverse dynamics, improving on state-of-the-art baselines by 41.5% on the Calvin ABC→D benchmark and by 31.6% on real-world dexterous manipulation tasks.
-
Always Skip Attention
This paper shows theoretically that Self-Attention Blocks in Vision Transformers are ill-conditioned without skip connections, highlights the role of skip connections as regularizers, and proposes Token Graying (via SVD and DCT) to improve input token conditioning, achieving modest performance gains in supervised and self-supervised tasks.
-
Graph Attention is Not Always Beneficial: A Theoretical Analysis of Graph Attention Mechanisms via Contextual Stochastic Block Models
This paper provides a theoretical analysis using Contextual Stochastic Block Models to demonstrate that graph attention mechanisms are beneficial for node classification only when structure noise exceeds feature noise, proposes a multi-layer GAT to achieve perfect classification at lower SNR thresholds, and validates these findings through synthetic and real-world experiments.
-
Facets of Disparate Impact: Evaluating Legally Consistent Bias in Machine Learning
This paper introduces the Objective Fairness Index (OFI), a legally grounded metric that evaluates bias in machine learning by comparing marginal benefits across groups, and demonstrates that it can detect algorithmic bias in applications such as COMPAS and the Folktables Adult Employment dataset where traditional Disparate Impact fails.
-
Model Merging in Pre-training of Large Language Models
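This paper proposes a Pre-trained Model Averaging (PMA) strategy that merges checkpoints from the pre-training stage to significantly improve large language model performance, predict the effect of annealing, and enhance training stability, offering a new method and practical guidelines for efficient model development.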