If you can imagine it, you can create it. The text-to-video era has arrived…

Have you heard of the Google-backed, New York–based startup making its first attempt to disrupt the movie industry? It’s called Runway ML, and it is in fact an AI research lab and platform with some very impressive achievements in the world of AI, like Gen-1, an AI-based tool that lets users generate new videos out of existing ones using words and images. No studio, no lights, no actors, nothing at all, just AI (read more here). Now the company, valued at around $1.5 billion, has officially released Runway ML Gen-2, and it is available to everyone. Anyone can try creating a video from a text description. Yes, you read that right: from a text description. In other words, if you can describe what you want, you are going to get it! This time it’s even simpler than the generation before: no lights, no camera, all action. In their own words: ‘It’s like filming something new, without filming anything at all’.

source: Twitter

Of course, one could watch some footage created with Gen-2 here and there prior to the official launch, but this time it’s the voice of the people that we are listening to, so we had to wait for the dust to settle. It’s clear now. Let’s take a plunge down the rabbit hole and see what’s in there, how it works, and whether it really disrupts anything. Please take a seat and start reading. Shall we?

First, what’s all the noise about? Gen-2 lets users create videos simply from prompts: text, even a single line, in the workflow we know from Midjourney, or existing images. It offers more customization options and better output, and unlike Gen-1 it doesn’t require any source video to create new videos. So describe the scene in text and the scene will be created, or supply a picture and tell Gen-2 how to take it from there. Remember the Hogwarts walls covered with living portraits of wizards? That’s pretty much what you can get from Gen-2 now.

artist: Martin Haerlin / made with: Gen-1, Elevenlabs & Reface / source: Twitter

Even though the new upgrade of Runway ML’s multi-modal AI system is truly incredible, it comes with some limitations: currently only 100 seconds of free video generation, a low frame rate, pixelation, and visible graininess that can make the creations look a bit like a slideshow. The engine also seems to struggle with human shapes, a well-known issue we are all familiar with from the early days of Midjourney, namely legs and arms. On top of that, what really jumps out is the overall doll-like appearance of faces and the inability to reflect emotions. However, one should remember that this is a diffusion-based model, and as such it learns through training; it needs billions of examples to get better at what it is used for. So far, it has been trained on 240 million images and 6.4 million video clips*. It is therefore just a matter of time before we all witness great improvement. Until then, we should look at it as a novelty rather than something that can truly act as a professional tool for video producers.

The enthusiasm is enormous, and we love seeing AI enthusiasts getting their hands on the new tool. Twitter is red-hot with very creative use cases for Gen-1 and Gen-2. One early adopter who stands out is Uncanny Harry AI (@Uncanny_Harry). This incredibly talented person, a ‘videomancer, AI filmmaker experimenting with emerging AI filmmaking tech’, is sharing his experience of making generated video better so that we can all benefit from it. Give him a follow to stay on top of the game.

‘Top tip for #Gen2 by @runway. At present it doesn’t cope with a lot happening in one shot, it will take quite a few generations to get a useable 2-shot. So you need to use the Kuleshov effect, to cut between shots to create meaning. Watch the scene below and see how the tension is created by the cuts and supported by the music.’

artist: @Uncanny_Harry / made with: Gen-2/ source: Twitter

‘…Always ground your generations in a place and with a lighting style (studio, daytime, etc). This helps you keep a consistent “look” and colour grade across a scene. This is how I construct my prompts for a photorealistic generation: “In the style of a cinematic shot of {subject} {action} {location} {lighting} —i” . If you’re doing a real world drama in a bar then give it a location as well {New York Bar} if your doing something fantastic then ground it with a reference “futuristic cyberpunk city”. Include the location and lighting style in the same place in every prompt that you want to take place in that scene.’

artist: @Uncanny_Harry / made with: Gen-2 / source: Twitter
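Uncanny Harry’s template, ‘In the style of a cinematic shot of {subject} {action} {location} {lighting}’ (his tweet appends an ‘—i’ suffix), can be sketched as a tiny helper that keeps location and lighting identical across every shot in a scene. This is just an illustration for building the prompt string; the function name and example values are ours, not part of any Runway API.

```python
# Minimal sketch of the prompt template quoted above.
# build_prompt() and the example values are illustrative assumptions.

def build_prompt(subject, action, location, lighting):
    """Assemble a photorealistic Gen-2 prompt; reusing the same
    location and lighting keeps a consistent look across a scene."""
    return (f"In the style of a cinematic shot of "
            f"{subject} {action} {location} {lighting}")

# Two shots of the same scene share the location and lighting style:
shot_1 = build_prompt("a detective", "entering", "a New York bar",
                      "moody neon lighting")
shot_2 = build_prompt("a bartender", "pouring a drink", "a New York bar",
                      "moody neon lighting")
```

Because each shot is generated independently, pinning the last two slots is what gives the cuts between shots a matching colour grade.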

If you are interested in digging deeper into the possibilities of AI-generated video, you should also follow the director behind the amazing viral ‘snapping’ video, Martin Haerlin (@Martin_Haerlin), associated with the German-based Trinity Agency. He is an amazing source of valuable insights from the video frontlines.

‘#1 If you work with a reference image / style frame, desaturate it. This is an image I generated in Midjourney, and the desaturated version I used in @runwayml’

source: Twitter

If you want to know how the ‘snapping’ video was made, make sure to visit Martin’s Twitter or follow his YT channel, where he will be sharing his insights. There is also a tutorial on making your own version of it on Runway ML’s official Twitter profile here.

We are really looking forward to seeing what happens next; it is history in the making. To end on a positive note, let us quote a very remarkable and right-on-point comment on the birth of this new technology by Theoretically Media_ (@TheoMediaAI). For those interested in starting their video-making adventure with Runway ML’s Gen-2, he has prepared a great tutorial. Make sure to check it out below.

‘Gen2 and all the other AI video platforms that are on the horizon are NOT the “Death of Hollywood”. They are tools that will help VFX/filmmaking pros achieve amazing things, and open the door to a lot of creative indie work. No death, only more creation.’

Enough said. Bravo!