dLLM: Open-Source Library Unifying Diffusion Language Model Training and Inference, with LoRA, DeepSpeed, and Fast Decoding Support
Summary
dLLM is a new open-source library that unifies diffusion language model training, inference, and evaluation, offering scalable pipelines with LoRA, DeepSpeed, and FSDP support; its latest Fast-dLLM update adds accelerated inference and compact 0.5B/0.6B diffusion models adapted from Qwen and LLaMA.
Key Points
- dLLM is an open-source library that unifies the training, inference, and evaluation of diffusion language models, supporting scalable pipelines with LoRA, DeepSpeed, and FSDP.
- The library supports multiple model types and training approaches, including LLaDA, Dream, BERT-Chat, and autoregressive-to-diffusion conversion, with fully open recipes and pretrained models available on Hugging Face.
- Recent updates introduce Fast-dLLM for accelerated inference via caching and confidence-thresholded parallel decoding, alongside compact 0.5B/0.6B diffusion models adapted from popular autoregressive models like Qwen and LLaMA.
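To make the decoding idea concrete, here is a minimal sketch of confidence-thresholded parallel decoding for a masked diffusion language model. This is an illustration of the general technique, not dLLM's or Fast-dLLM's actual API: `MASK`, `fake_model`, `VOCAB`, and the `decode` signature are all invented for this example, and the stand-in model just emits random distributions where a real diffusion LM would condition on the partially unmasked sequence.

```python
import math
import random

MASK = None   # sentinel for a still-masked position (assumption for this sketch)
VOCAB = 32    # toy vocabulary size (assumption for this sketch)

def fake_model(tokens, rng):
    """Stand-in for a masked diffusion LM: returns one probability
    distribution over the vocabulary per sequence position."""
    dists = []
    for _ in tokens:
        logits = [rng.gauss(0.0, 2.0) for _ in range(VOCAB)]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        dists.append([e / z for e in exps])
    return dists

def decode(tokens, model, threshold=0.9, seed=0):
    """Confidence-thresholded parallel decoding (sketch): each step
    unmasks every masked position whose top-token probability exceeds
    `threshold`; if none qualifies, the single most confident masked
    position is committed, so the loop always makes progress."""
    rng = random.Random(seed)
    tokens = list(tokens)
    while any(t is MASK for t in tokens):
        dists = model(tokens, rng)
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        conf = {i: max(dists[i]) for i in masked}
        accept = ([i for i in masked if conf[i] >= threshold]
                  or [max(masked, key=conf.get)])
        for i in accept:
            tokens[i] = dists[i].index(max(dists[i]))
    return tokens
```

Compared with unmasking a fixed number of tokens per step, thresholding lets the decoder commit many tokens at once when the model is confident and fall back to cautious one-at-a-time refinement when it is not, which is where the inference speedup comes from.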