Posts
SQA-032
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification [paper]
SQA-031
Guiding a Diffusion Model with a Bad Version of Itself [paper]
SQA-030
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models [paper]
SQA-029
Group Normalization [paper]
SQA-028
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [paper]
SQA-027
Transformers without Normalization [paper]
SQA-026
Analyzing and Improving the Training Dynamics of Diffusion Models [paper]
SQA-025
GIVT: Generative Infinite-Vocabulary Transformers [paper]
SQA-024
Jet: A Modern Transformer-Based Normalizing Flow [paper]
SQA-023
JetFormer: An Autoregressive Generative Model Of Raw Images And Text [paper]
SQA-022
CLIP: Learning Transferable Visual Models From Natural Language Supervision [paper]
SQA-021
Deep Equilibrium Approaches to Diffusion Models [paper]
SQA-020
PixelFlow: Pixel-Space Generative Models with Flow [paper]
WXB-005
Mamba: Linear-Time Sequence Modeling with Selective State Spaces [paper]
JZC-012
Scaling Vision with Sparse Mixture of Experts [paper]
JZC-011
The Impact of Initialization on LoRA Finetuning Dynamics [paper]
JZC-010
LoRA: Low-Rank Adaptation of Large Language Models [paper]
JZC-009
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models [paper]
JZC-008
S^4-Tuning: A Simple Cross-lingual Sub-network Tuning Method [paper]
JZC-007
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning [paper]
SQA-019
Momentum Contrast for Unsupervised Visual Representation Learning [paper]
SQA-018
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis [paper]
SQA-017
Normalizing Flows are Capable Generative Models [paper]
WXB-004
Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation [paper]
WXB-003
Neural Ordinary Differential Equations [paper]
WXB-002
One Step Diffusion via Shortcut Models [paper]
WXB-001
The Road Less Scheduled [paper]
ZHH-017
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis [paper]
SQA-016
A Simple Framework for Contrastive Learning of Visual Representations [paper]
ZHH-016
Masked Autoencoders Are Scalable Vision Learners [paper]
ZHH-015
Simplifying, Stabilizing & Scaling Continuous-Time Consistency Models [paper]
SQA-015
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning [paper]
SQA-014
Emerging Properties in Self-Supervised Vision Transformers [paper]
SQA-013
Scalable Diffusion Models with Transformers [paper]
SQA-012
Consistency Models Made Easy [paper]
ZHH-014
Diffusion Models Beat GANs on Image Synthesis [paper]
SQA-011
Improved Techniques for Training Consistency Models [paper]
SQA-010
Consistency Models [paper]
SQA-009
Score-Based Generative Modeling through Stochastic Differential Equations [paper]
SQA-008
DDIM: Denoising Diffusion Implicit Models [paper]
SQA-007
A Connection Between Score Matching and Denoising Autoencoders [paper]
SQA-006
Elucidating the Design Space of Diffusion-Based Generative Models [paper]
SQA-005
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers [paper]
JZC-006
Denoising Diffusion Implicit Models [paper]
SQA-004
Building Normalizing Flows with Stochastic Interpolants [paper]
SQA-003
simple diffusion: End-to-end diffusion for high resolution images [paper]
SQA-002
Progressive Distillation for Fast Sampling of Diffusion Models [paper]
SQA-001
Classifier-Free Diffusion Guidance [paper]
ZHH-013
Flow Matching for Generative Modeling [paper]
ZHH-012
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation [paper]
ZHH-011
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model [paper]
ZHH-010
Autoregressive Image Generation without Vector Quantization [paper]
ZHH-009
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation [paper]
ZHH-008
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction [paper]
ZHH-007
An Image is Worth 32 Tokens for Reconstruction and Generation [paper]
ZHH-006
Improved Variational Inference with Inverse Autoregressive Flow [paper]
ZHH-005
Pixel Recurrent Neural Networks [paper]
ZHH-004
Invertible Residual Networks [paper]
ZHH-003
Glow: Generative Flow with Invertible 1x1 Convolutions [paper]
ZHH-002
Variational Inference with Normalizing Flows [paper]
ZHH-001
Deep Image Prior [paper]
JZC-005
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise [paper]
JZC-004
Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages [paper]
JZC-003
Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving [paper]
JZC-002
Learning to Reason with Third-Order Tensor Products [paper]
JZC-001
LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata [paper]