Midjourney just launched image-to-video generation and it's mind-blowing
Excellent and accessible AI video creation for all
I'm publishing this article ahead of schedule to share tips that will shorten your learning curve with the new Midjourney feature.
A new kind of image-to-video model
Midjourney dropped something HUGE today: their brand new image-to-video feature. After years of waiting (literally), we can finally turn our static creations into moving videos, and honestly, the results look REALLY impressive.
Unlike previous major launches (V7 in particular), there have been no hiccups or disasters so far.
I've been testing this feature since it went live, and while it's still version 1, the quality is already solid enough to get excited about. Good coherence, smooth fast action, and, most importantly, the Midjourney aesthetic.
Let me walk you through everything you need to know.
Celebrating the joy of seeing my imaginary characters come to life!
Watch on YouTube here
What this feature actually does
The new tool takes any image – whether it's from Midjourney or uploaded from elsewhere – and creates a 5-second video. You can then extend that video up to 21 seconds total, which gives you decent flexibility for most projects.
What makes this interesting is how smooth the movements are. I expected some jittery, obvious AI artifacts, but the motion feels surprisingly natural. The team clearly spent time making sure things move in believable ways.
This is the generated video. Here’s the prompt:
the character sprinting forward and pointing gun, ready to fight, leaving the dust flying behind as he moves --ar 2:3 --motion high --video 1
This is the original image. Just one image! Smooth action for both the main character and the characters behind him.
(Find 300 keywords to create your sci-fi character here.)
How to use
Hover your mouse over any image and click Animate. It’s that easy.
Four modes to choose from
Midjourney gives you four different approaches to video creation:
Auto modes:
Auto low motion: The system decides how things should move, keeping movements subtle and controlled
Auto high motion: Everything moves more dramatically, including camera movements
Manual modes:
Manual low motion: You write prompts describing how you want things to move, with gentle results
Manual high motion: Your prompts control more dynamic movements and camera work
The automatic version works great when you just want to see your image come alive without much thought. The manual version becomes useful when you have specific ideas about how things should move.
Motion behavior differences
Low motion settings:
Camera stays mostly stationary
Subjects move slowly and deliberately
Good for portraits or scenes where you want subtle life
High motion settings:
Everything moves, including camera angles
More dramatic and cinematic effects
Risk of unrealistic or glitchy movements if pushed too far
I've found low motion works better if you just want subtle movement. High motion can create stunning results, but it's also more likely to produce weird artifacts.
But I mostly use high motion because... it's so cool. Never mind the artifacts. You get four options per generation anyway, so you can pick the one with the fewest (or no) artifacts.
Cost and availability breakdown
This feature lives exclusively on midjourney.com – you can't access it through Discord. Here's what you need to know about pricing:
Video cost: About 8 times more GPU time than regular images
Per second: Roughly equivalent to 1 image generation per second of video
Comparison: About 25X cheaper than what other AI video generators are charging (according to Midjourney)
Extension: Up to 21 seconds (four 5-second extensions should supposedly total 20 seconds, but hey, we get 21 seconds in the end)
Relax mode: For Pro and Mega subscribers
If you're on a basic plan, those fast hours will disappear quickly. Consider upgrading if you plan to use this feature regularly.
Technical parameters you can use
Motion control parameters
--motion low: Subtle movements (this is the default)
--motion high: More dramatic action
--raw: Gives you precise motion control
Other parameter
--video 1: Specifies version 1 of the video feature (we can expect Midjourney to launch even better video versions in the future)
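To see how these parameters fit together, here is a sketch of a full animation prompt. The parameters (--motion, --raw, --video) are the ones documented above; the motion description itself is my own invented example, written in the style of the prompt shown earlier in the article:

```
the knight charges across the courtyard, cape billowing, dust rising behind him --motion high --raw --video 1
```

The motion description replaces the image prompt here, since any image parameters are stripped automatically during video generation.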
Important notes:
Starting aspect ratios might change in the final video
Output quality is 480p (standard definition)
Any image parameters get automatically removed during video generation
You can use third-party Upscalers to improve video quality
Navigation tips that actually help
The interface takes some getting used to. Here are the tricks I've learned:
Playing videos:
Hover your mouse over the video thumbnail to play (or click on it to enlarge)
Hold Ctrl/Command while moving your mouse to scrub (move quickly) through the video frames
Like this:
Finding your original image:
After generating a video, scrolling down to find the original image so you can change the prompt or the image (especially for older images) can be tedious
Two ways to edit the prompt:
(a) Click any element in the side panel to edit it
(b) Use the thumbnail to get to other similar images in the grid
Hover over the video to reveal the original image thumbnail
Click that thumbnail to return to the source image
Pro Tip: This last tip saves a lot of scrolling through your gallery.
Saving and exporting the video file
You can save the video as is for video editing purposes, or choose "Save for Social Media" to reduce over-compression when social media platforms re-encode the file.

Early impressions and practical advice
After spending time with this feature, I'm genuinely impressed. The movements feel natural, and the cost isn't unreasonable compared to alternatives.
Pro Tips:
Describe both the main subject's action and what happens in the background. In my early testing, the bot understands two or three actions.
Describe the camera movement, e.g., camera zooms in, camera zooms out, shaky camera (especially during an explosion), to drastically enhance the output
Use manual mode and --raw when you have specific movement ideas
Save your fast hours by using relax mode when available
Avoid using Upscaled or high-resolution source images. The bot compresses your video to 480p, so you may get more artifacts. Use the "normal" images (before Upscale) or lower-resolution images.
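Putting the prompt-writing tips together, a manual-mode motion prompt might look like this. The scene is my own invented illustration; the structure (subject action + background action + camera movement, plus the parameters covered above) follows the tips listed here:

```
the pilot sprints toward the cockpit, crowd cheering in the background, camera zooms in as the engines ignite --motion high --video 1
```

Note how each clause gives the bot one distinct action: the subject, the background, and the camera.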
The 21-second extension capability opens up real possibilities for short social media content or presentation elements. While 480p isn't spectacular, it's workable for many online uses.
This feels like a solid foundation. Version 1 limitations aside, Midjourney has created something genuinely useful here. I suspect we'll see rapid improvements in quality and features over the coming months.
The image-to-video space just got a lot more interesting, and this tool puts decent video generation within reach of anyone already using Midjourney.
Key takeaways
Midjourney has launched a new image-to-video feature that converts any static image into a 5-second video, extendable up to 21 seconds, with smooth and natural-looking motion that retains Midjourney’s distinctive aesthetic.
The feature offers four modes for animation: auto low/high motion and manual low/high motion, allowing users to choose between subtle or dramatic movements, with manual modes providing more precise control via prompts.
Available exclusively on midjourney.com, this video generation costs about eight times more GPU time than image generation but remains roughly 25 times cheaper than other AI video tools; output quality is 480p, suitable for social media and presentations, with ongoing improvements expected very soon.
Cover prompt: the discovery of a new world --ar 16:9 --sref 2996354975 --profile r3snb8i --sv 4 --v 7
I hope you like this article!
Thank you for reading and happy creating!