Tencent AI Lab Launches Penguin-VL: A Compact Vision-Language Model That Ditches Traditional Visual Encoders for LLM-Based Architecture
Tencent AI Lab has released Penguin-VL, a compact vision-language model that replaces the traditional visual encoder with an architecture initialized from a text-only LLM. The design delivers stronger fine-grained visual understanding and efficient long-video processing, and two model variants are now available on Hugging Face.