dLLM: Open-Source Library Unifying Diffusion Language Model Training and Inference, with LoRA, DeepSpeed, and Fast Decoding Support
Summary
dLLM is a new open-source library that unifies diffusion language model training, inference, and evaluation, offering scalable pipelines with LoRA, DeepSpeed, and FSDP support; its latest Fast-dLLM update adds accelerated inference and compact 0.5B/0.6B diffusion models adapted from Qwen and LLaMA.
Key Points
- dLLM is an open-source library that unifies the training, inference, and evaluation of diffusion language models, supporting scalable pipelines with LoRA, DeepSpeed, and FSDP.
- The library supports multiple model types and training approaches, including LLaDA, Dream, BERT-Chat, and autoregressive-to-diffusion conversion, with fully open recipes and pretrained models available on Hugging Face.
- Recent updates introduce Fast-dLLM for accelerated inference via caching and confidence-thresholded parallel decoding, alongside compact 0.5B/0.6B diffusion models adapted from popular autoregressive models like Qwen and LLaMA.
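To make the decoding idea concrete, here is a minimal sketch of confidence-thresholded parallel decoding for a masked diffusion language model. This is an illustration of the general technique, not dLLM's or Fast-dLLM's actual API: `MASK`, `fake_model`, `VOCAB`, and the `decode` signature are all invented for this example, and the stand-in model just emits random distributions where a real diffusion LM would condition on the partially unmasked sequence.

```python
import math
import random

MASK = None   # sentinel for a still-masked position (assumption for this sketch)
VOCAB = 32    # toy vocabulary size (assumption for this sketch)

def fake_model(tokens, rng):
    """Stand-in for a masked diffusion LM: returns one probability
    distribution over the vocabulary per sequence position."""
    dists = []
    for _ in tokens:
        logits = [rng.gauss(0.0, 2.0) for _ in range(VOCAB)]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        dists.append([e / z for e in exps])
    return dists

def decode(tokens, model, threshold=0.9, seed=0):
    """Confidence-thresholded parallel decoding (sketch): each step
    unmasks every masked position whose top-token probability exceeds
    `threshold`; if none qualifies, the single most confident masked
    position is committed, so the loop always makes progress."""
    rng = random.Random(seed)
    tokens = list(tokens)
    while any(t is MASK for t in tokens):
        dists = model(tokens, rng)
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        conf = {i: max(dists[i]) for i in masked}
        accept = ([i for i in masked if conf[i] >= threshold]
                  or [max(masked, key=conf.get)])
        for i in accept:
            tokens[i] = dists[i].index(max(dists[i]))
    return tokens
```

Compared with unmasking a fixed number of tokens per step, thresholding lets the decoder commit many tokens at once when the model is confident and fall back to cautious one-at-a-time refinement when it is not, which is where the inference speedup comes from.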