How to Clone a Voice with AI: The Complete Beginner's Guide

Learn how to clone a voice with AI. Create personalized, professional-sounding voice overs and easily translate them into other languages.

AI voices have gotten so good lately that you no longer need to record a voice over to add a personal, human touch to video creation. Even better, you can create voice overs that sound like you (or one of your teammates) with voice cloning.

Using a familiar voice for your voice overs is better for customer trust and brand consistency—all without ever getting out your recording equipment.

What is voice cloning?

Voice cloning creates an artificial copy of someone’s voice using AI. Generally, there are two steps involved:

First, the AI analyzes a voice sample with machine learning, picking out individual qualities of the voice, like intonation, tenor, tempo, and accent.
Then, using the data it was trained on, the AI recreates a synthetic version of the voice, matching those qualities as closely as possible.

When done with the leading tech and a long enough sample, the cloned voice should be nearly indistinguishable from the real thing.

It’s fairly simple to clone a voice with AI as long as you have the right tool. We recommend Kapwing’s Voice Cloning tool. It’s built right into our full-suite video editor, which makes it easy to add voice overs with your new AI voice.

How to clone a voice with AI

Here’s a quick, step-by-step guide for how to make your own voice clone with Kapwing.

Step 1: Open a new project in Kapwing

Head to Kapwing.com and select the “Create new” option at the top of your workspace.

Open the audio tab on the left-side menu.

Select “Text to Speech” from the audio tab options.

Step 2: Add a new voice

Open the dropdown “Voice” menu and select “Add new Voice.”

This will open the Voice Cloning window, powered by ElevenLabs, where you can start to create your new AI voice clone. Any voices cloned here will be saved to your workspace’s Brand Kit and will be available to use for you and your team in any future projects.

There are two ways to clone a voice inside Kapwing:

1. Clone a voice from a sample

This option creates a clone by training the AI on existing audio samples. If you have video or audio files already stored on your device, drag and drop them to the upload window or click to search your device.

You can add up to three samples here. Ideal voice samples should have clear audio with only a single speaker. We recommend adding all three samples, keeping each file around 1 minute in length.
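If you'd like to sanity-check your files before uploading, the sample guidelines above can be sketched as a small script. This is only a rough illustration, not part of Kapwing: it assumes your samples are WAV files (export them from your recorder or editor first), and the function names and the 15-second tolerance around the one-minute target are our own assumptions.

```python
import wave

MAX_SAMPLES = 3            # from the guide: up to three samples
TARGET_SECONDS = 60.0      # from the guide: around 1 minute per file
TOLERANCE_SECONDS = 15.0   # assumption: how far from 1 minute still counts as "around"


def wav_duration_seconds(path):
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_samples(paths):
    """Return a list of warnings; an empty list means the samples look fine."""
    warnings = []
    if len(paths) > MAX_SAMPLES:
        warnings.append(
            f"{len(paths)} samples given; only {MAX_SAMPLES} can be added"
        )
    for path in paths:
        seconds = wav_duration_seconds(path)
        if abs(seconds - TARGET_SECONDS) > TOLERANCE_SECONDS:
            warnings.append(
                f"{path}: {seconds:.0f}s long; aim for about {TARGET_SECONDS:.0f}s"
            )
    return warnings
```

For example, calling check_samples(["intro.wav", "demo.wav"]) (hypothetical file names) returns a list of warnings you can read through before you upload. The script only looks at the number and length of the files. It can't tell whether the audio is clear or has a single speaker, so give each sample a quick listen as well.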

2. Clone a voice from a recording

If you don’t have an audio sample handy, you can always record your voice right in Kapwing. Select the “Record Sample” option in the Voice Cloning window.

Make sure your microphone settings are correct, then start recording. You can read from the sample script, from your own script, or just talk off the cuff. The goal is to record yourself speaking with a range of inflection and tone for 1-3 minutes.

Try listing off what you plan on doing that day or reading a passage of text written by someone else if you don't know what to talk about.

Whether you record your voice sample or upload existing recordings, give your new Voice a name (yours, if you’re cloning your own voice, or the name of the person whose voice you’ve received permission to clone).

Remember, this Voice will now show up in your text-to-speech voice options, so you should know whose voice it is.

Finally, check the box confirming that you have permission to clone this voice. Please read the full disclosure and make sure that you’re complying with the legal requirements—Kapwing does not condone or endorse the cloning of voices to which you do not retain full rights and permission.

When you’re ready, hit the “Clone” button and Kapwing’s AI will start analyzing the samples and synthesizing your voice. This process may take a few minutes.

Step 3: Add the new voice clone to your project

To add your voice clone to a video project, open the “Text to Speech” menu within the audio tab again. Open the dropdown Voice menu. You should now see the Voice you cloned and named among the options. It will have the three-card icon next to it, indicating it’s a cloned voice.

You can preview what the voice sounds like by hitting the play button next to it. If you’re not happy with it, you can generate a new voice clone. The best way to make it sound more like you is to increase the length and variety of the voice samples you provide.

If you’re happy with how the voice sounds, simply type or paste your voice over script into the text box above and hit generate. Kapwing will add the voice over as an audio layer in the project timeline and generate automatic subtitles as well. Additionally, you can convert audio to text if you want an audio transcription to go with your video.

To manage your saved voice clones, open the Brand Kit in your workspace. Here you’ll be able to see all saved Voices as well as remove any that you no longer want.

And that’s it!

You can now use these cloned voices for any future projects.

How to use your AI voice clone

There are plenty of applications for voice cloning, including but not limited to:

Adding professional-sounding voice over to videos

Add a voice over to TikToks, YouTube videos, internal training videos, etc. While you can of course accomplish this with regular AI voice over, using a cloned voice will help you create consistent branding across all of your content with familiar voices.

Instead of a generic-sounding voice over, you can add a voice over that sounds just like you. Or if your company has one specific spokesperson or “Face of the Brand,” you could create a voice over that sounds like them instead.

Correcting/updating existing voice overs without re-recording anything

Whether you missed a flubbed take, updated your script after pressing record, or just need to refresh an outdated video with a new voice over, voice cloning lets you seamlessly update content without re-recording anything.

Reaching new audiences by translating your voice clone

Looking to localize video content for new audiences? Go beyond adding translated subtitles and actually translate your spoken audio with voice cloning. With Kapwing, you can translate your AI-cloned voice into more than 20 languages, making a second or third language sound like your first.

To do this in Kapwing, first translate your subtitles into the desired language.

When that’s finished, download the subtitles as a .txt file and copy the text to your clipboard.

Open the Text to Speech window in the audio tab again and paste the copied text into the text box. Select the desired language as your “Language of Text Input” then select your saved voice clone as the Voice.

Finally, delete the original voice over layer from the timeline, which will remove the duplicate subtitles. Now you sound like you speak fluent Italian (or whichever supported language you want to translate your video to).
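If the downloaded subtitle file has line breaks or blank lines between captions, you may want to tidy it up before pasting it into the text box. Here's a minimal sketch of that cleanup. It assumes the .txt file is plain text with one caption per line, which may not match your export exactly, and the function name is our own.

```python
def subtitles_to_script(raw_text):
    """Join the non-empty subtitle lines into one block of paste-ready text."""
    lines = [line.strip() for line in raw_text.splitlines()]
    return " ".join(line for line in lines if line)


# Example with two Italian captions separated by a blank line.
example = "Ciao a tutti.\n\nBenvenuti nel nostro video.\n"
print(subtitles_to_script(example))
# prints: Ciao a tutti. Benvenuti nel nostro video.
```

To use it on a real file, read the file's contents with Python's built-in open() and pass the text to subtitles_to_script, then copy the result into the Text to Speech box.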

Voice Cloning FAQs:

Is AI voice cloning legal?

Yes, under the correct circumstances.

You may clone your own voice, of course, and someone else’s voice as long as you have the appropriate rights, consents, permissions, and/or licenses. However, cloning someone’s voice without their explicit consent can be illegal, depending on where you are and how the clone is used.

Further legal protections are currently being considered by the U.S. House of Representatives to keep creators and consumers alike safe from deep fakes and copyright infringement. The best way to stay out of hot legal water is to only copy a voice you know you have permission to clone.

Is it possible to mimic someone’s voice?

Yes, thanks to AI. You can create an AI-powered voice double with tools like Kapwing’s Voice Cloner. You should only clone voices you have explicit permission or legal rights to use.

What can voice cloning be used for?

You can use voice cloning for many of the same purposes as regular text-to-speech voices: creating engaging social media videos or ads, adding consistent voice overs to training videos, or updating old videos with a new voice over instead of re-recording. The difference is that with voice cloning, the text-to-speech voice will sound like your own (or that of whoever gave you permission to clone their voice).

Regardless of your use case, it's critical to have the needed rights and permissions to avoid legal issues and other trouble. Make this a priority or use your own voice if you'd simply like to try out these tools.