Tag: Diffusion Model

All the articles with the tag "Diffusion Model".

PICD: Versatile Perceptual Image Compression with Diffusion Rendering

Published: 15 May, 2025 at 11:10 AM

95.81 🤔

PICD introduces a versatile perceptual image compression codec using diffusion rendering with three-tiered conditioning to achieve high text accuracy and visual quality for both screen and natural images, outperforming existing methods in key metrics like FID and text accuracy.

Contaminated Multivariate Time-Series Anomaly Detection with Spatio-Temporal Graph Conditional Diffusion Models

Published: 16 May, 2025 at 11:29 AM

94.27 🤔

TSAD-C introduces a pioneering unsupervised framework for multivariate time-series anomaly detection on contaminated data, using a Decontaminator with S4-based diffusion, long-range dependency modeling via a time-then-graph approach, and anomaly scoring, achieving state-of-the-art performance across diverse datasets.

Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning

Published: 16 May, 2025 at 11:36 AM

93.91 🤔

Selftok introduces a non-spatial autoregressive visual tokenizer using diffusion timesteps, unifying vision-language models and enabling effective reinforcement learning for superior text-to-image generation, as demonstrated on GenEval and DPG-Bench benchmarks.

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations

Published: 8 May, 2025 at 10:22 AM

89.20 🤔

The Video Prediction Policy (VPP) introduces a novel generalist robot policy that leverages predictive visual representations from fine-tuned video diffusion models to learn implicit inverse dynamics, achieving significant improvements of 41.5% on the Calvin ABC→D benchmark and 31.6% in real-world dexterous manipulation tasks over state-of-the-art baselines.

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Published: 26 May, 2025 at 11:23 AM

85.84 🤔

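This paper proposes DIFFEMBED, a text embedding method based on diffusion language models. It uses their bidirectional attention to significantly outperform autoregressive LLM embedding models on long-document retrieval and reasoning-intensive tasks, while performing comparably on traditional embedding tasks.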
