How Will GPT Change Video Editing
Much has been said about ChatGPT and the new AI technologies that will change the way people work. But how will they change video editing? Today, we introduced GPT tech into Kapwing. In this article, CEO Julia Enthoven shares her predictions for the future of Video Editing + AI.
Silicon Valley hasn’t been this active in years! We find ourselves in the center of a new hype cycle around Generative AI. In the last few weeks, we have changed our whole process to sprint on new GPT-powered features, and it’s been a long time since I’ve felt so energized about our product roadmap.
In case you don't want to read this article, we used our AI Video Generator to make a "How Will GPT Change Video Editing" summary video for you.
Video edited on Kapwing
We’ve known since 2018 that AI would change the way we work. Last year, in 2022, Kapwing added 8 new ML-powered APIs to our video editing platform to automate tedious editing tasks. Creative teams love these features, and they are our most popular and highest-retention features on the website.
Today, we’re introducing a set of 5 Generative AI features powered by GPT-4. Hundreds of people are using them already. Make videos from just a topic description, generate images, get meme ideas, transform documents into summary videos, and create a social media video script. Try them for free on Kapwing.
This is just the beginning. There’s so much more to come with the help of OpenAI’s LLM. In this article, I’ll share my perspective on how GPT and Generative AI change the game for creators and how we envision this taking shape at Kapwing.
AI for Video Editing
GPT-4 unlocks a new horizon for AI applications. I envision that it will change video editing in these key ways:
1) Creative Brainstorming. AI helps you get past the blank canvas, giving you a starting place to iterate from and new perspectives to consider. It will:
- Suggest new ideas and combinations for your edit (fonts, sounds, face filters, effects, and more). Get inspired if you get stuck and experiment with new layouts, animations, and cuts.
- Help find stock media and generate original assets to incorporate into projects. You say "Barack Obama", and there's already a PNG overlay of his face available for you. You can browse songs with the right tempo and duration for your project. Sound effect suggestions immediately appear over jump cuts.
- Supply caption ideas for your social media posts, including memes, and thumbnails to cover them
- Curate trends and new formats you might enjoy
- Clip highlights in a long-form video
- Convert from one medium to another. AI can generate a visual to accompany your podcast, or a voiceover representation for a text article. File types become more versatile and transformable.
2) Personalized automations: Less repetitive work as AI remembers your actions, assists, and automates. AI will:
- Detect bounding boxes so you can edit each subject within a video and pull a video apart into its layers, retroactively
- Recognize objects to find timestamps from text prompts, searching both the timeline (for keywords in a webinar, for example) and the canvas (like tracking a person's face)
- Auto-level the volume across your video and clean it up by removing background noise
- Remember your preferences, like your default output size or your intro sequence, and suggest them in new projects
- Power recording tools like a smart teleprompter that moves forward as you speak and a computational camera that beautifies the speaker's face and enhances their voice as it captures
- Translate between languages with dubbed voiceovers and/or closed captions in other languages
3) Command interfaces that transform text prompts into precise timeline UI. AI will:
- Use quotations and scene changes to make cuts and clips, editing like a Google Doc ++.
- Converse with you to reverse mistakes, catch errors, and refine changes, just as a virtual assistant would
- Chain sequences of actions together (like "make this text appear for one second every three seconds" or "optimize for TikTok")
- Generate scripts, shot lists, and storyboards from text descriptions
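To make the chained-command idea concrete, here is a minimal sketch of how a prompt like "make this text appear for one second every three seconds" could be expanded into timeline intervals. This is an illustration only, not how Kapwing implements it: the `Interval` type and the `repeating_intervals` function are hypothetical names, and the sketch assumes the text is shown at the start of each three-second period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A span on the timeline, in seconds from the start of the video."""
    start: float
    end: float


def repeating_intervals(duration: float, visible: float = 1.0,
                        period: float = 3.0) -> list[Interval]:
    """Return the spans during which a layer is shown.

    Hypothetical helper: the layer is visible for `visible` seconds at the
    start of every `period` seconds, until the video ends at `duration`.
    """
    if duration < 0 or visible <= 0 or period < visible:
        raise ValueError("need duration >= 0 and 0 < visible <= period")

    intervals = []
    index = 0
    # Multiply rather than accumulate to avoid floating-point drift.
    while index * period < duration:
        start = index * period
        # Clamp the last span so it never runs past the end of the video.
        intervals.append(Interval(start, min(start + visible, duration)))
        index += 1
    return intervals


if __name__ == "__main__":
    for span in repeating_intervals(10.0):
        print(f"show text from {span.start:.1f}s to {span.end:.1f}s")
```

For a 10-second video, this yields four spans starting at 0, 3, 6, and 9 seconds. The hard part, which this sketch skips, is having a language model map the free-form prompt to the right function and parameters.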
What Doesn’t Change
Storytelling and authorship: Humans love videos partly because they give us a connection to another relatable person, making us feel less alone and closer to mankind. Outside of Westworld, we don’t have the same connection with machines; computers are too forgetful, infinite, objective, and perfectly reasonable for us to relate to and feel understood by. When I browse TikTok, authentic storytelling makes up most of the stories in my feed. It’s my belief that humans will tell stories better than machines for a long time, because it’s the human aspect that we relate to. So, Generative AI won’t replace TikTokers, musicians, poets, influencers, and performers, although it will certainly make them faster and help them come up with new content ideas. Coming up with relatable stories, filming experiences, and crafting a narrative will still be the bulk of a creator’s work.
Originality: Similarly, AI won’t change the fact that people will want to insert their own personality and voice into the videos they produce. Generative AI is good at coming up with average answers, but not good at coming up with a highly original take on a topic. It can give you suggestions for memes, but creators will still want to put their own spin on them. That’s why it’s essential for AI to be baked into the tools themselves, so that all AI suggestions can be modified and curated.
The Art of Filmmaking: There are many ways to cut and tell a story, and AI certainly does not get us 100% of the way there. Video editors, digital storytellers, and media entrepreneurs need suggestive AI tools that can supplement their workflow with recommendations rather than prescriptive software that tries to replace their jobs. Prescriptive AI is doomed to fail. Instead, video makers need the ability to customize, change, remix, and refine AI suggestions to bring a creative vision to life.
Expense of Video Processing: It's still heavyweight and expensive to process videos, especially if the source files are high resolution or long. Setting up a video editing environment requires either expensive hardware or a dynamic cloud setup, and both come with their own limitations and challenges. The difficulty of processing video files is one reason it is still impossible to edit videos with ChatGPT, as of August 2024.
What Does Change
Investment in ML performance: We know from experience that fast, accurate AI workflows are essential for saving creators time. Video makers rely on fast, responsive interfaces in the creative process and will give up on slow, clunky ones. For example, last year, we improved the performance of our automatic video background removal feature and saw a 6% uptick in weekly active users immediately after launch. We are increasing our investment in performance and will continue to rigorously compare available AI technologies on real videos and use cases.
Text as an input: The popularity of ChatGPT and Microsoft’s Copilot has changed the way consumers think about chat interfaces and productivity. Video editing is visual, but some commands demand a level of precision that’s hard to achieve with a mouse or trackpad. Video editing software of the future will use embedded text prompts to automate repeated workflows and guide users, and text input to spark new ideas. Clippy reborn!
AI Development Velocity: Silicon Valley is seeing unprecedented velocity around new AI products, and video-related AI is evolving quickly. But there’s still a lot that machines can’t do well. For example, machines have a lot to learn when it comes to drawing a bounding box around objects. Lacking a sense of object permanence, computer vision fails when a moving subject disappears momentarily. As a result, we haven’t yet found an object tracking API that provides reliable enough results for an automated “pinning” solution. Our engineering team stays plugged in to recent developments through a close relationship with the Chrome team, and we’re thrilled at the AI developments that have moved into hyper speed in the last few months.
Today, we launched 5 products that we started development on less than two weeks ago. There’s much more to come. Stay tuned at kapwing.com/ai and sign up for email updates.
Conclusion
Creators and technologists alike are dreaming about how AI will change video editing workflows for the better and help creators become more efficient at telling stories and making content. We’ve had hundreds of people fill out our interest survey, telling us they’re interested in every area of video creation.
We’re using this survey and feedback, ideas, and requests from creators to guide our roadmap, so please reach out over email, LinkedIn or Twitter to let us know your thoughts. Stay tuned as we add new AI products into the Kapwing editor and chip away at the repetitive workflows that have haunted creators for decades.
If you’re a marketing, communications, or media team interested in leveraging AI to speed up content operations, reach out for a demo to see how you can put AI to work for your creative team.