qwen-vlo
by qwen
Pricing
Input
$0.00 / 1M tokens
Output
$0.02 / 1M tokens
Qwen VLo – Multimodal Text-to-Image & Editing Power
Overview
Qwen VLo is Alibaba Cloud’s open-source text-to-image and image-editing model, designed for interactive visual generation with intelligent refinement. As of the latest release (Qwen 2.5 Max, July 2025), it supports:
- Seamless progressive image generation – watch the image build step-by-step
- Powerful image editing via natural language, including background and object replacement
- Multilingual prompts and integration with Qwen Chat and API
Features
- Progressive generation – see image evolution from sketch to detail
- Interactive editing – modify objects, scenes, lighting, and composition via prompts
- Multilingual input – fluent in English and Chinese text instructions
- Instructable images – edit or expand generated images with follow-up commands
- API-ready – accessible through Alibaba Cloud for app integration
- Free to use – available via Qwen Chat with no login required
Release Timeline
Version | Release Date | Highlights |
---|---|---|
Qwen-VL | Sept 2023 | Multimodal understanding (VQA, captioning) |
Qwen2-VL | Jan 2024 | Dynamic image resolution support |
Qwen2.5-VL | Apr 2025 | Text-in-image, OCR, and document parsing |
Qwen VLo | July 2025 | Text-to-image & editing, progressive gen |
Use Cases
- Content Creation: posters, concept art, marketing visuals via prompt
- Design Iteration: adjust or refine visuals interactively with natural language
- Multilingual Visual Tools: generate or modify images in both English and Chinese
- Workflow Integration: deploy within creative pipelines using Alibaba Cloud API
Strengths & Limitations
-
Strengths
- Progressive generation with live visual updates
- Rich image editing from natural-language instructions
- Multilingual prompt understanding with open accessibility
-
Limitations
- Less style variety compared to Midjourney or Ideogram
- Advanced edits may require prompt tuning or retries