Posts

  • SQA-032

    FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification [paper]

  • SQA-031

    Guiding a Diffusion Model with a Bad Version of Itself [paper]

  • SQA-030

    Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models [paper]

  • SQA-029

    Group Normalization [paper]

  • SQA-028

    Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [paper]

  • SQA-027

    Transformers without Normalization [paper]

  • SQA-026

    Analyzing and Improving the Training Dynamics of Diffusion Models [paper]

  • SQA-025

    GIVT: Generative Infinite-Vocabulary Transformers [paper]

  • SQA-024

    Jet: A Modern Transformer-Based Normalizing Flow [paper]

  • SQA-023

    JetFormer: An Autoregressive Generative Model of Raw Images and Text [paper]

  • SQA-022

    CLIP: Learning Transferable Visual Models From Natural Language Supervision [paper]

  • SQA-021

    Deep Equilibrium Approaches to Diffusion Models [paper]

  • SQA-020

    PixelFlow: Pixel-Space Generative Models with Flow [paper]

  • WXB-005

    Mamba: Linear-Time Sequence Modeling with Selective State Spaces [paper]

  • JZC-012

    Scaling Vision with Sparse Mixture of Experts [paper]

  • JZC-011

    The Impact of Initialization on LoRA Finetuning Dynamics [paper]

  • JZC-010

    LoRA: Low-Rank Adaptation of Large Language Models [paper]

  • JZC-009

    DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models [paper]

  • JZC-008

    S^4-Tuning: A Simple Cross-lingual Sub-network Tuning Method [paper]

  • JZC-007

    Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning [paper]

  • SQA-019

    Momentum Contrast for Unsupervised Visual Representation Learning [paper]

  • SQA-018

    Scaling Rectified Flow Transformers for High-Resolution Image Synthesis [paper]

  • SQA-017

    Normalizing Flows are Capable Generative Models [paper]

  • WXB-004

    Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation [paper]

  • WXB-003

    Neural Ordinary Differential Equations [paper]

  • WXB-002

    One Step Diffusion via Shortcut Models [paper]

  • WXB-001

    The Road Less Scheduled [paper]

  • ZHH-017

    MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis [paper]

  • SQA-016

    A Simple Framework for Contrastive Learning of Visual Representations [paper]

  • ZHH-016

    Masked Autoencoders Are Scalable Vision Learners [paper]

  • ZHH-015

    Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models [paper]

  • SQA-015

    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning [paper]

  • SQA-014

    Emerging Properties in Self-Supervised Vision Transformers [paper]

  • SQA-013

    Scalable Diffusion Models with Transformers [paper]

  • SQA-012

    Consistency Models Made Easy [paper]

  • ZHH-014

    Diffusion Models Beat GANs on Image Synthesis [paper]

  • SQA-011

    Improved Techniques for Training Consistency Models [paper]

  • SQA-010

    Consistency Models [paper]

  • SQA-009

    Score-Based Generative Modeling through Stochastic Differential Equations [paper]

  • SQA-008

    DDIM: Denoising Diffusion Implicit Models [paper]

  • SQA-007

    A Connection Between Score Matching and Denoising Autoencoders [paper]

  • SQA-006

    Elucidating the Design Space of Diffusion-Based Generative Models [paper]

  • SQA-005

    SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers [paper]

  • JZC-006

    Denoising Diffusion Implicit Models [paper]

  • SQA-004

    Building Normalizing Flows with Stochastic Interpolants [paper]

  • SQA-003

    simple diffusion: End-to-end diffusion for high resolution images [paper]

  • SQA-002

    Progressive Distillation for Fast Sampling of Diffusion Models [paper]

  • SQA-001

    Classifier-Free Diffusion Guidance [paper]

  • ZHH-013

    Flow Matching for Generative Modeling [paper]

  • ZHH-012

    Show-o: One Single Transformer to Unify Multimodal Understanding and Generation [paper]

  • ZHH-011

    Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model [paper]

  • ZHH-010

    Autoregressive Image Generation without Vector Quantization [paper]

  • ZHH-009

    Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation [paper]

  • ZHH-008

    Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction [paper]

  • ZHH-007

    An Image is Worth 32 Tokens for Reconstruction and Generation [paper]

  • ZHH-006

    Improved Variational Inference with Inverse Autoregressive Flow [paper]

  • ZHH-005

    Pixel Recurrent Neural Networks [paper]

  • ZHH-004

    Invertible Residual Networks [paper]

  • ZHH-003

    Glow: Generative Flow with Invertible 1x1 Convolutions [paper]

  • ZHH-002

    Variational Inference with Normalizing Flows [paper]

  • ZHH-001

    Deep Image Prior [paper]

  • JZC-005

    Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise [paper]

  • JZC-004

    Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages [paper]

  • JZC-003

    Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving [paper]

  • JZC-002

    Learning to Reason with Third-Order Tensor Products [paper]

  • JZC-001

    LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata [paper]