Published on

Don’t Miss Qwen 2.5 VL!

Authors

Despite all the Deepseek Hype, Qwen just dropped the best open Multimodal Model!

Video

Don’t Miss Qwen 2.5 VL! Despite all the Deepseek Hype, Qwen just dropped the best open Multimodal Model! Qwen 2.5 VL is a Vision Language Model that can control your computer, similar to the OpenAI operator, extract structured information from charts, and more!

TL;DR; 3️⃣ Available in 3 sizes: 3B, 7B, and 72B parameters 🧬 Uses Qwen 2.5 as text backbone 🎯 Agent capabilities for direct computer and phone use 🧠 Improved visual understanding of texts, charts, icons, graphics, and layouts 🎥 Extended video support of 1+ hour 📊 Structured output for financial/commercial documents 💡 Sota on multiple benchmarks, DocVQA, TextVQA, ScreenSpot, Android Control 📦 Apache 2.0 licensed (except 72B) and available on Hugging Face

Author

AiUTOMATING PEOPLE, ABN ASIA was founded by people with deep roots in academia, with work experience in the US, Holland, Hungary, Japan, South Korea, Singapore, and Vietnam. ABN Asia is where academia and technology meet opportunity. With our cutting-edge solutions and competent software development services, we're helping businesses level up and take on the global scene. Our commitment: Faster. Better. More reliable. In most cases: Cheaper as well.

Feel free to reach out to us whenever you require IT services, digital consulting, off-the-shelf software solutions, or if you'd like to send us requests for proposals (RFPs). You can contact us at contact@abnasia.org. We're ready to assist you with all your technology needs.

ABNAsia.org

© ABN ASIA

AbnAsia.org Software