Syntax Sunday: Runway ML's Gen 3 Alpha Text-to-Video Model
In this edition of #SyntaxSunday, we will explore Runway's latest video model, Gen-3 Alpha, which offers cutting-edge text-to-video capabilities. It is currently one of the best (if not the best) text-to-video generation models available, with capabilities very similar to OpenAI's Sora, which has not yet been released.
Given that this is the Alpha release, keep in mind that this version is an initial, incomplete iteration: there are likely bugs, and things may not work correctly yet. It does cost money to use, which is explained in the next section.
I will show examples of the prompts I used to generate a few videos. I will also showcase some videos using the prompts from the Runway ML blog post.
Pricing is a little funky: you need to purchase a monthly or yearly plan, and once you do, you are given credits that can be used to generate videos.
Gen-3 Alpha costs 100 credits per 10-second generation ($1 USD per 10-second video), so it is quite expensive, which is to be expected.
You can also purchase more credits if you run out: $10 USD = 1,000 credits = ten 10-second videos.
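To make the math above concrete, here is a minimal sketch of the cost arithmetic, assuming the rates stated above (100 credits per 10-second generation, $10 per 1,000 credits); the helper name is my own, not part of any Runway API:

```python
# Rough cost math for Gen-3 Alpha, based on the pricing above:
# 100 credits per 10-second generation, and $10 USD buys 1,000 credits.
CREDITS_PER_10S_VIDEO = 100
USD_PER_CREDIT = 10 / 1000  # $0.01 per credit

def cost_usd(num_videos: int) -> float:
    """Approximate out-of-pocket cost for a batch of 10-second clips."""
    return num_videos * CREDITS_PER_10S_VIDEO * USD_PER_CREDIT

# Ten 10-second clips works out to about $10 USD.
print(f"${cost_usd(10):.2f}")
```

So a handful of iterations on a single prompt can easily eat a few dollars of credits, which is worth keeping in mind before experimenting.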
There is an unlimited plan, but with a caveat: after you run out of credits, videos are generated in Explore mode, which is slower. Learn more here...
Runway's Gen-3 Alpha is the cool new kid on the block among text-to-video tools, taking visual content creation to a whole new level. It is the next iteration in their ongoing research project, General World Models.
It's all about creating videos that look super real and can be tailored just the way you like them, all from simple text prompts. Gen-3 Alpha currently supports only text-to-video.
Gen-3 Alpha is trained on both videos and images, so it is supposed to be much better at getting the small details right in your scenes, using detailed captions for smooth transitions between elements.
They also mention that it excels at generating expressive human characters with a wide range of actions, gestures, and emotions, so we will give that a try!
Previously, Runway ML's best video model was the multimodal Gen-2 model. As it is multimodal, you can use any of:
Text to Video
Image to Video
Image + Text to Video
Plus other customization options
I have not used Gen-2 extensively, so I cannot comment on it too much, but it creates 4-second clips that you can extend up to 16 seconds total. The few videos I did create really lacked motion compared to Gen-3 Alpha.
Let's first try an example prompt from the Runway ML video tutorial.
Next, let's try a few of the example prompts from the official blog post.
Here are a few of the examples I tried... For the most part, I followed the Gen-3 prompting guide.
And finally, my favorite!
Gen-3 Alpha is quite impressive, especially in terms of quality, yet a few issues currently limit its utility. As you can see, some videos turn out great and others are garbage!
Here are a couple things I liked:
Video generations are pretty quick, usually taking only a couple of minutes for a 10-second clip.
The video quality is great (720p), and you can export/download all your videos.
It excels at creating lifelike human characters with diverse actions, gestures, and emotions for rich storytelling.
It also does a great job of translating text into compelling visuals.
There are definitely issues with more complex prompts, and from time to time it just does not work. With enough iterations on simpler prompts, you should be able to get something pretty close to what you want, but it is going to cost you. The biggest hurdle right now is the cost, though this should come down in the future!
Here are a couple issues I noticed:
It seems to struggle with complex prompts and does not follow instructions very well; its behavior tends to be unpredictable.
At times, the quality of detail varies greatly. It also seems to have trouble with non-human characters; for example, I could never get an "Alien" to turn out right... (see the previous examples)
Maybe 1 or 2 of my videos looked as good as the examples from the Runway blog post. They likely ran those example prompts multiple times and picked the best results.
Overall, it is fun to use, and you can create some interesting videos. As this is only the Alpha version, they are likely improving it as more people use it and they gather feedback.
You can definitely create some cool videos if you are willing to pay, have an imagination, and have some free time!
If you want to try out Gen 3 Alpha, create an account at: https://runwayml.com/.
If you have any questions about the video generations or using Gen 3 Alpha contact me at: bloodlinealpha@gmail.com.
Syntax Sunday
KH