dLLM Unifies Diffusion Language Model Training and Inference with Open-Source Library Supporting LoRA, DeepSpeed, and Fast Decoding

Mar 03, 2026
GitHub

Summary

dLLM is a new open-source library that unifies diffusion language model training, inference, and evaluation, with scalable pipelines supporting LoRA, DeepSpeed, and FSDP. Its latest Fast-dLLM update delivers accelerated inference and tiny 0.5B/0.6B diffusion models adapted from Qwen and LLaMA.

Key Points

  • dLLM is an open-source library that unifies the training, inference, and evaluation of diffusion language models, supporting scalable pipelines with LoRA, DeepSpeed, and FSDP.
  • The library supports multiple model types and training approaches, including LLaDA, Dream, BERT-Chat, and autoregressive-to-diffusion conversion, with fully open recipes and pretrained models available on Hugging Face.
  • Recent updates introduce Fast-dLLM for accelerated inference using cache and confidence-threshold decoding, alongside tiny 0.5B/0.6B diffusion models adapted from popular autoregressive models like Qwen and LLaMA.
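The confidence-threshold decoding mentioned above can be illustrated with a minimal sketch. This is not dLLM's actual API; it assumes a hypothetical `predict_fn` that returns a probability distribution for every position, and shows the core idea: at each step, all masked positions whose top prediction clears a confidence threshold are committed in parallel, so a full sequence can be decoded in far fewer steps than one token at a time.

```python
import numpy as np

MASK = -1  # sentinel for a still-masked position (illustrative choice)

def confidence_threshold_decode(predict_fn, seq_len, threshold=0.9):
    """Sketch of confidence-threshold parallel decoding.

    Each step, `predict_fn(tokens)` returns a (seq_len, vocab_size)
    array of per-position distributions. Every masked position whose
    top probability >= threshold is finalized at once; if none clears
    the bar, the single most confident masked position is committed,
    so decoding always terminates.
    """
    tokens = np.full(seq_len, MASK, dtype=int)
    steps = 0
    while (tokens == MASK).any():
        probs = predict_fn(tokens)            # (seq_len, vocab_size)
        masked = tokens == MASK
        conf = probs.max(axis=-1)             # top probability per position
        ids = probs.argmax(axis=-1)           # most likely token per position
        accept = masked & (conf >= threshold)
        if not accept.any():
            # Fallback: commit only the most confident masked position.
            best = int(np.where(masked, conf, -np.inf).argmax())
            accept = np.zeros(seq_len, dtype=bool)
            accept[best] = True
        tokens[accept] = ids[accept]
        steps += 1
    return tokens, steps
```

With a model that is confident everywhere, all positions decode in a single step instead of `seq_len` steps, which is the source of Fast-dLLM's speedup; the cache mentioned in the update would additionally avoid recomputing attention over already-committed tokens.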
