Generate subtitles alongside an editable transcript
Kapwing’s online Audio to Subtitles Converter turns any audio file — including MP3, WAV, and FLAC — into accurate, time-aligned subtitles in seconds. Just upload your file, hit “Auto Subtitles,” and the tool generates both a subtitle track and an editable transcript.
Upload your own images and videos, pull from a built-in stock library, or use AI to quickly design a media-rich video without any external tools or downloads. There are even waveforms, transitions, text overlays, brand watermarks, and animated subtitles inside the user-friendly interface. Turn podcasts into visual assets, provide accessible alternatives to lecture or tutorial audio, and repurpose phone interviews or voice recordings into engaging social media content.
Accuracy is essential when repurposing audio content for other formats. The Audio to Subtitles Converter ensures 99% accuracy using automated proofreading and precise line-syncing. This makes it easier for teams and individuals to manage high-volume workflows or subtitle entire audio libraries without sacrificing quality. Built for fast-paced schedules, the converter helps content creators and brand managers save time and effort on subtitling without compromising on accuracy, even when recordings contain multiple speakers.
With built-in translation support for over 100 languages, including regional variants like Spanish (Mexico) and Portuguese (Brazil), you can localize audio content without expensive workflows or external tools. Our studio includes a built-in Brand Glossary, where you can create and store custom translation rules for key terms, product names, slogans, and phrases. Reach native speakers in any global target market, whether you're crafting multilingual podcast promos, translating conference recordings, or promoting a book launch internationally.
The browser-based editor gives you full control of subtitle timing, formatting, and visual styling. Once subtitles are generated from your audio file, fine-tune the transcript, adjust characters per line, and choose from thousands of subtitle styles, including font, size, color, and background settings. Use Speaker Labels to draw attention to key speakers during interviews or any multi-speaker content, and highlight key sound bites with animations.
Kapwing users make audio conversion a key part of their content strategy
Podcasters use Kapwing to add subtitles to audio recordings for free, creating synced visualizers with branded text and visuals for sharing on video platforms and websites.
Small business owners and podcasters use the online Audio to Subtitles Converter to create audiograms that share interview highlights, product promos, and memorable moments on platforms like TikTok and X (Twitter).
Online educators create subtitles from audio and generate transcripts from lectures and tutorials, making content more accessible and expanding their reach to diverse learners for free.
Journalists use the Audio to Subtitles tool to transcribe interviews and export them as plain text files for articles. Transcripts support written content and can be paired with video content to improve SEO and discoverability.
Social media managers and influencers turn short voice recordings and group chat conversations or confessions into viral videos by adding visuals, subtitles, and creative styling.
Authors promote their books by reading key passages aloud and converting the audio into subtitles. Pairing voice, captions, and a book cover creates a simple yet powerful piece of marketing content for social media and websites.
To begin, upload an audio file to the Kapwing editor. Kapwing supports popular file types like MP3, WAV, FLAC, OGG, and more.
Click the "Auto subtitles" button in the left-hand toolbar under the "Subtitles" tab. Your subtitles will automatically generate, and the Subtitles Editor will open.
Edit your subtitle text and timing, and alter the font, color, and animation. At this stage, you can also add visuals and make other edits. When you're finished, export your new file and/or download an SRT, VTT, or TXT transcript.
Yes, the Audio to Subtitles Converter is free for all users to try. With a Free Account, you get 10 minutes of auto subtitle generation per month. Once you upgrade to a Pro Account, your auto-subtitling limit increases to 300 minutes per month, and you get access to subtitle translation into over 100 languages.
If you are using Kapwing on a Free Account, all exports, including those from the Audio to Subtitles Converter, will contain a watermark. Once you upgrade to a Pro Account, the watermark will be completely removed from your creations.
Kapwing supports a wide variety of popular audio file formats, including MP3, WAV, WMA, M4A, OGG, FLAC, and AVI. Note that audio exports are always in MP3 format, as we feel this file type represents the best tradeoff between file size and quality.
To add subtitles to audio in Kapwing, follow these four simple steps:
Audio accessibility is the process of creating audio content that is accessible to everyone, including individuals with disabilities. The purpose of making audio accessible is twofold: it promotes a more inclusive experience for listeners and helps creators reach a larger audience. By bridging the gap for listeners who need accommodations due to hearing loss, you ensure that everyone has equal access to an audio recording's information (or entertainment).
One example of an accommodation for hearing loss is subtitles or captions, since they reproduce spoken content and environmental noises as visible text that viewers can read. Likewise, choosing high-contrast color pairs for subtitles or captions, such as blue and orange or black and white, can help those with partial vision loss read text more easily.
SDH refers to subtitles specifically designed for individuals who are Deaf or hard of hearing (hence "SDH" = "Subtitles for the Deaf and hard of hearing"). These subtitles not only capture spoken dialogue but also provide additional details like sound effects, music, and identification of speakers, assuming the viewer cannot hear the audio.
Content localization is the process of adapting content to fit the language and cultural preferences of a new audience. This often involves translating subtitles, dubbing the audio, and editing SRT subtitle files for timing, minor inaccuracies, and line lengths.
The goal of content localization is to showcase your content to new regions, increasing your market reach. Localization gives you a competitive advantage by helping connect your brand with customers in new places before competitors can reach them.
Full localization includes subtitle translation, but also involves making cultural adjustments, such as using country-specific references, different units of measurement, and culturally relevant visuals.
There are a couple of ways to sync subtitles with audio. The first is a manual process, which requires using a plain text editor like Notepad on PC or TextEdit on Mac. Every line of subtitles in an SRT or VTT file should have start and end timecodes, formatted as "00:00:00,000" (representing 0 hours, 0 minutes, 0 seconds, and 000 milliseconds) for SRT and "00:00:00.000" for VTT. You'll need to play back the recording and modify the timecodes to match it properly.
However, you can skip over this tedious process by using Kapwing's Subtitles Editor, which provides you with an easy-to-adjust playhead, a chars-per-subtitle slider, and a one-click button beneath timecodes to set a subtitle line to the current playhead time. This greatly simplifies and accelerates the process of modifying the timing on an SRT or VTT file.
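If you do edit timecodes by hand, a small script can handle the common case where every subtitle is early or late by the same amount. The following is a minimal Python sketch, not part of Kapwing: the function name `shift_timecodes` and its parameters are hypothetical, and it assumes each timing line contains "-->" with timecodes in the SRT or VTT formats described above.

```python
import re

# Matches SRT ("00:00:01,500") and VTT ("00:00:01.500") timecodes.
TIMECODE = re.compile(r"(\d{2,}):(\d{2}):(\d{2})([,.])(\d{3})")


def shift_timecodes(text: str, offset_ms: int) -> str:
    """Shift every cue in an SRT or VTT string by offset_ms milliseconds.

    A positive offset delays the subtitles; a negative one makes them
    appear earlier. Times are clamped so they never go below zero.
    """

    def _shift(match: re.Match) -> str:
        hours, minutes, seconds, sep, millis = match.groups()
        total_ms = (
            (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
        ) * 1000 + int(millis)
        total_ms = max(0, total_ms + offset_ms)
        h, rem = divmod(total_ms, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        # Keep the original separator so SRT stays SRT and VTT stays VTT.
        return f"{h:02d}:{m:02d}:{s:02d}{sep}{ms:03d}"

    shifted = []
    for line in text.split("\n"):
        # Only touch timing lines, so subtitle text is left untouched.
        if "-->" in line:
            line = TIMECODE.sub(_shift, line)
        shifted.append(line)
    return "\n".join(shifted)


cue = "1\n00:00:01,500 --> 00:00:04,000\nHello\n"
print(shift_timecodes(cue, 750))
```

Here the cue that originally ran from 1.5 to 4 seconds is delayed by 750 milliseconds. Fixing individual lines that drift by different amounts still requires adjusting each cue separately.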
VTT is similar to SRT but supports additional features, such as styling and metadata (e.g., title, author), which makes it more versatile than the simpler SRT format. However, it's not always compatible with every social media platform. Here's a quick comparison:
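At the file level, two differences are easy to see: a VTT file begins with a "WEBVTT" header line, and its timecodes use a period instead of a comma before the milliseconds. The Python sketch below illustrates this by converting a simple SRT string to VTT. The function name `srt_to_vtt` is hypothetical and not part of Kapwing, and the sketch assumes a well-formed SRT input without styling tags.

```python
import re

# Captures "HH:MM:SS" and the milliseconds on either side of the SRT comma.
SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert a simple SRT string into a WebVTT string."""
    lines = []
    for line in srt_text.strip().splitlines():
        # Only rewrite timing lines, so commas in the subtitle text survive.
        if "-->" in line:
            line = SRT_TIME.sub(r"\1.\2", line)
        lines.append(line)
    # WebVTT requires the "WEBVTT" header followed by a blank line.
    # SRT cue numbers are kept; WebVTT treats them as optional cue identifiers.
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


srt = "1\n00:00:01,500 --> 00:00:04,000\nHello, world\n"
print(srt_to_vtt(srt))
```

Going the other way (VTT to SRT) generally loses information, since SRT has no place for VTT's styling or metadata.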
Kapwing is free to use for teams of any size. We also offer paid plans with additional features, storage, and support.