
What you can do
Generate from text
Describe a scene and the AI creates a video clip with motion and audio.
Generate from an image
Turn a still image into an animated video clip with a text prompt.
Edit a video
Change the style, color grade, or content of a video with a prompt.
Remove background
Strip the background from a video, leaving the subject isolated.
Motion control
Transfer motion from a reference video onto a still image.
Combine into timeline
Arrange multiple clips and images into a single video output.
A→B transitions — Start and end frame control
Provide both a starting image and an ending image. The AI generates a smooth transition between the two states. Use this for product reveals, before-and-after sequences, or any scene that moves from one defined state to another.
Element injection — Add characters or objects to a video
Upload a frontal image of a character or object. Reference it in your prompt with @Element1 syntax. The AI injects it into the video while keeping the original motion and scene.
Style reference — Guide the visual look with an image
Attach a style reference image. The AI matches the color palette, lighting, and visual treatment to create a consistent look across your video.
Audio generation — Native lip-sync and sound
Enable audio during generation for dialogue, singing, or ambient sound. The AI creates synchronized audio that matches the visual content. Enabling audio costs approximately 2x the credits of a muted video.
Long video — Multi-shot sequences stitched together
For videos longer than 15 seconds, the agent plans individual shots, generates each clip, and stitches them together using the timeline. This requires your approval before generation since it uses more credits.
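The planning step can be pictured as splitting a target duration into clips that each fit within a single generation. The following is only a minimal Python sketch, assuming an even split. `plan_shots` and its constants are hypothetical names introduced for illustration and do not describe how the agent actually plans shots.

```python
import math

MIN_CLIP_SECONDS = 3   # shortest clip length listed for a single generation
MAX_CLIP_SECONDS = 15  # longest clip length from a single generation


def plan_shots(total_seconds: float) -> list[float]:
    """Split a target duration into equal shots of at most MAX_CLIP_SECONDS.

    Hypothetical helper: an even split is an assumption, not the agent's method.
    """
    if total_seconds < MIN_CLIP_SECONDS:
        raise ValueError("target is shorter than the minimum clip length")
    # Use the smallest number of shots that keeps every shot within the limit.
    shot_count = math.ceil(total_seconds / MAX_CLIP_SECONDS)
    return [total_seconds / shot_count] * shot_count
```

Under this assumption a 40-second video becomes three shots of about 13.3 seconds each, which would then be stitched together on the timeline.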
Options you can configure
Before generating, you can set:

| Option | Choices |
|---|---|
| Aspect ratio | 21:9, 16:9 (default), 4:3, 1:1, 3:4, 9:16, 9:21 |
| Duration | 3–15 seconds (varies by model) |
| Resolution | 480p, 720p, 1080p, 4K (varies by model) |
| Audio | On or off. Muted videos use half the credits. |

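As an illustration, the options in the table can be modeled as a small validated settings object. This is a sketch under stated assumptions: `VideoOptions` and its field names are hypothetical, only the 16:9 default comes from the table, and the per-model limits ("varies by model") are not modeled.

```python
from dataclasses import dataclass

ASPECT_RATIOS = {"21:9", "16:9", "4:3", "1:1", "3:4", "9:16", "9:21"}
RESOLUTIONS = {"480p", "720p", "1080p", "4K"}


@dataclass(frozen=True)
class VideoOptions:
    """Hypothetical container for the configurable options."""

    aspect_ratio: str = "16:9"  # documented default
    duration_seconds: int = 5   # assumed default; only the 3 to 15 range is documented
    resolution: str = "720p"    # assumed default; no default is documented
    audio: bool = False

    def __post_init__(self) -> None:
        # Reject values outside the documented choices.
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if not 3 <= self.duration_seconds <= 15:
            raise ValueError("duration must be between 3 and 15 seconds")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")

    def relative_credit_factor(self) -> float:
        """Approximate cost relative to a muted clip (audio is roughly 2x)."""
        return 2.0 if self.audio else 1.0
```

For example, `VideoOptions(aspect_ratio="9:16", audio=True).relative_credit_factor()` returns `2.0`, reflecting that a clip with audio costs about twice as much as the same clip muted.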
Video generation models
Multiple AI models are available across 7 families. The agent selects the best model for your task automatically.

| Family | Strengths | Text-to-video | Image-to-video | Editing | Audio |
|---|---|---|---|---|---|
| Kling 2.6/3.0 | Most versatile, A→B transitions, element injection | Yes | Yes | Yes (O1/O3) | Yes |
| Veo 3.1 | High quality, up to 4K resolution | Yes | Yes | No | Yes |
| Seedance 1.5 | Widest aspect ratio support, available on Free plan | Yes | Yes | No | Yes |
| MiniMax Hailuo | Fast generation | Yes | Yes | No | No |
| Wan v2.6 | Open-source, low cost | Yes | Yes | No | No |

Only Seedance 1.5 Pro is available on the Free plan. All other models require a paid plan.
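The capability columns can also be read as a simple lookup. The sketch below transcribes the table rows into Python and filters families by required capability. `FAMILIES` and `families_supporting` are hypothetical names, and this does not represent how the agent chooses a model.

```python
# Capabilities per family, transcribed from the table rows.
# Kling editing is listed as "Yes (O1/O3)", so it applies to those variants only.
FAMILIES = {
    "Kling 2.6/3.0": {"text_to_video", "image_to_video", "editing", "audio"},
    "Veo 3.1": {"text_to_video", "image_to_video", "audio"},
    "Seedance 1.5": {"text_to_video", "image_to_video", "audio"},
    "MiniMax Hailuo": {"text_to_video", "image_to_video"},
    "Wan v2.6": {"text_to_video", "image_to_video"},
}


def families_supporting(*needed: str) -> list[str]:
    """Return the families whose capabilities include every requested one."""
    required = set(needed)
    return [name for name, caps in FAMILIES.items() if required <= caps]
```

Calling `families_supporting("editing")` returns only the Kling family, while `families_supporting("image_to_video", "audio")` returns Kling, Veo, and Seedance.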
Non-destructive editing
Every video operation (edit, motion control, background removal) creates a new video on the canvas rather than overwriting the original. Your original video is always preserved.
What AI Videos does not support
- Real-time video streaming or live recording.
- Videos longer than 15 seconds from a single generation. Longer videos require multi-shot planning and stitching.
- Frame-by-frame editing or manual keyframing.
- Adding text overlays or subtitles directly. Use an external editor for post-production text.
