Stable Video Diffusion is real!

Only a few hours ago, Stability AI announced this breaking news:

“Today, we are releasing Stable Video Diffusion, our first foundation model for generative AI video based on the image model, @StableDiffusion.

As part of this research preview, the code, weights, and research paper are now available. Additionally, today you can sign up for our waitlist to access a new upcoming web experience featuring a Text-To-Video interface.”

And things are about to get wild! Let’s see what the fuss is all about.

The company has now entered the generative video space, which until now has been dominated by Runway and Pika. It does not look like they plan to just linger there; the message is more like “it’s our turn now!” Stable Video Diffusion is Stability AI’s first open generative AI video model. The company pitches it as a way to turn text and image inputs into vivid scenes and cinematic creations, although the models in this research preview are image-to-video, and the text-to-video web interface is still behind a waitlist.

It is released as two image-to-video models that generate 14 and 25 frames respectively, at customizable frame rates between 3 and 30 frames per second. Performance looks very promising: according to Stability AI’s own user study, it is preferred over the leading closed models.
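To put those numbers in perspective, here is a minimal Python sketch of how long a clip lasts at the two ends of the frame-rate range. It only does the arithmetic (duration = frames / fps) and does not call the model; the function and constant names are made up for this illustration.

```python
# Illustrative arithmetic only: clip length for the two released models
# at the lowest and highest frame rates mentioned in the announcement.
# All names here are hypothetical helpers, not part of any SVD API.

FRAME_COUNTS = (14, 25)   # frames generated by the two models
FPS_BOUNDS = (3, 30)      # lowest and highest customizable frame rate


def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback duration of a clip in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def duration_table() -> dict:
    """Map (frames, fps) to the duration in seconds, rounded to 2 decimals."""
    return {
        (frames, fps): round(clip_duration_seconds(frames, fps), 2)
        for frames in FRAME_COUNTS
        for fps in FPS_BOUNDS
    }


if __name__ == "__main__":
    for (frames, fps), seconds in duration_table().items():
        print(f"{frames} frames at {fps} fps = {seconds} s")
```

In other words, these are short clips: from under half a second at the highest frame rate to a little over eight seconds at the lowest.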

“The chart above evaluates user preference for SVD-Image-to-Video over GEN-2 and PikaLabs. SVD-Image-to-Video is preferred by human voters in terms of video quality. For details on the user study, we refer to the research paper.”

If that is the case, we are super excited to see what can be done. What is more, the model appears to understand 3D scenes and perspective too, as one X user pointed out. To access the model and sign up for the waitlist (:/), visit their website. Some folks have already shared their first hands-on results:

“First text-to-video render of Stable Video Diffusion (SVD) from a Midjourney input image. I’m impressed with – coherent movement – video quality – accuracy with original image…”

Reference image (MJ)

Stable Video Diffusion (SVD)

Stable Video Diffusion (SVD) Upscaled to 1080p and interpolated to 24fps.

It is incredible! What lies ahead for the gaming industry? Great times 🙂 Can’t wait.