
| Developer | Zhipu AI |
| Origin | China |
| Released | CogVideoX 1.5 · Nov 2024 |
| Native audio | No · add a voiceover in Vivideo |
| Max resolution | 1360×768 |
| Max clip length | 6–10s |
| Availability | Open weights (2B: Apache 2.0) |
| Available via | Open weights · Z.ai API |
| Pricing from | Free (open) · API |
CogVideoX is best for quick, flexible text- and image-to-video where an accessible, open model is a good fit.
Background. Developed by Zhipu AI and Tsinghua University's THUDM group, CogVideoX (version 1.5, released November 2024) is one of the most-forked early open video models. It runs on consumer hardware, and its design was accepted at ICLR.
1. Open Vivideo and start a new video.
2. Select CogVideoX as your model, or let the Video Agent pick it for you.
3. Describe your shot (or upload an image) and set duration and aspect ratio.
4. Generate with CogVideoX, then refine, add a voice or avatar, and export for any platform.
Each model is one of 30+ in Vivideo — switch per shot to get exactly the look you want.





CogVideoX, from Zhipu AI, is a widely used open-source video model supporting text- and image-to-video. Its openness has made it a community favorite for experimentation.
CogVideoX suits creators who want a flexible, accessible engine for everyday clips and quick iterations across both text and image inputs.
On Vivideo, CogVideoX is one of 30+ models on a single subscription — use it as a flexible everyday option, then escalate to Veo or Sora when a shot needs extra polish, all in one project.
You can try CogVideoX free on Vivideo to start — no credit card. Heavier use and premium models are covered by a paid plan.
CogVideoX is an accessible open model that handles both text-to-video and image-to-video well, making it a flexible everyday option.
Yes — on Vivideo you can use CogVideoX for both text-to-video and image-to-video, and switch to other models per shot.
Yes — Vivideo lets you switch models per shot, so you can mix CogVideoX with other engines in one project.