About this tool
Combining multiple audio files into a single track is a common step when compiling podcasts (intro + segment + outro), creating playlists as single files, or stitching field recordings together for editing.
This tool merges files in your browser using the Web Audio API. Add files in the order you want them and optionally apply a crossfade at each junction (default: 200 ms). Output preserves the source codec where possible; mixed-format inputs are converted to a common output format.
How it works
Add audio files
Drop multiple files at once or pick them in sequence. They appear in the order added.
Reorder if needed
Drag rows to change order.
Optional crossfade
Set 0-2 seconds of crossfade for smooth transitions between segments.
Pick output format
MP3, WAV, or M4A. WAV is lossless; MP3 and M4A are re-encoded at the chosen bitrate.
Merge and download
The combined audio saves to your device.
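The ordering and crossfade steps above can be sketched on raw samples. This is a minimal illustration, not the tool's actual code: it assumes mono PCM already decoded into `Float32Array` segments at one shared sample rate, and the name `mergeWithCrossfade` is hypothetical.

```typescript
// Sketch only: joins decoded mono segments in order, overlapping each
// junction with a linear crossfade. Assumes all segments share sampleRate.
function mergeWithCrossfade(
  segments: Float32Array[],
  sampleRate: number,
  crossfadeSeconds: number,
): Float32Array {
  if (segments.length === 0) return new Float32Array(0);
  let merged = segments[0];
  for (let i = 1; i < segments.length; i++) {
    const next = segments[i];
    // The overlap cannot be longer than either side of the junction.
    const overlap = Math.min(
      Math.round(crossfadeSeconds * sampleRate),
      merged.length,
      next.length,
    );
    const start = merged.length - overlap;
    const out = new Float32Array(merged.length + next.length - overlap);
    // Copy the part of the outgoing audio that is not crossfaded.
    out.set(merged.subarray(0, start), 0);
    // Linear crossfade: outgoing gain falls from 1 while incoming gain rises from 0.
    for (let n = 0; n < overlap; n++) {
      const t = n / overlap;
      out[start + n] = merged[start + n] * (1 - t) + next[n] * t;
    }
    // Copy the rest of the incoming segment after the overlap.
    out.set(next.subarray(overlap), merged.length);
    merged = out;
  }
  return merged;
}
```

With a crossfade of 0 the function reduces to plain concatenation; with a non-zero crossfade the result is shorter than the sum of the inputs by one overlap per junction.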
Use cases
Podcast assembly
Intro music + interview + outro music as one MP3 ready to upload.
Music playlist as one file
Combine several songs into one continuous MP3 for offline listening.
Audiobook chapters
Stitch chapter MP3s into a single file for a car stereo or other device that does not auto-advance to the next file.
Voice notes compilation
Combine a week of voice memos into one weekly review file.
Format and spec details
| Setting | Supported values |
|---|---|
| Source formats | MP3, WAV, AAC, OGG, FLAC, M4A |
| Crossfade | 0-2 seconds, linear |
| Output formats | MP3 (192-320 kbps), WAV (lossless), M4A (AAC) |
Tips and best practices
- For podcasts, fade the music intro into the voice with a 1-1.5 s crossfade.
- Match input loudness with Audio Volume Boost before merging - one quiet track ruins the whole compilation.
- WAV output keeps the most fidelity but creates larger files (CD-quality WAV is roughly 4x the size of a 320 kbps MP3 and about 11x the size of a 128 kbps MP3).
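The size gap between WAV and MP3 follows from simple bitrate arithmetic. The sketch below assumes uncompressed 16-bit stereo PCM at 44.1 kHz (CD quality) and ignores container overhead; the function names are illustrative only.

```typescript
// Bitrate of uncompressed PCM in kbps: sample rate x bit depth x channels.
function pcmBitrateKbps(sampleRate: number, bitDepth: number, channels: number): number {
  return (sampleRate * bitDepth * channels) / 1000;
}

// Approximate file size in megabytes (10^6 bytes) for a constant bitrate.
function fileSizeMB(bitrateKbps: number, seconds: number): number {
  return (bitrateKbps * 1000 * seconds) / 8 / 1000000;
}

const wavKbps = pcmBitrateKbps(44100, 16, 2); // 1411.2 kbps
const oneHour = 3600;
console.log("WAV, 1 hour:", fileSizeMB(wavKbps, oneHour).toFixed(1), "MB");
console.log("MP3 320 kbps, 1 hour:", fileSizeMB(320, oneHour).toFixed(1), "MB");
console.log("MP3 128 kbps, 1 hour:", fileSizeMB(128, oneHour).toFixed(1), "MB");
console.log("WAV / MP3-320 ratio:", (wavKbps / 320).toFixed(2));
console.log("WAV / MP3-128 ratio:", (wavKbps / 128).toFixed(2));
```

The ratio depends only on the two bitrates, so a higher MP3 bitrate narrows the gap and a lower one widens it.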
How browser-based audio/video tools work
Modern browsers ship with the WebCodecs API, MediaRecorder, and the Web Audio API, which together are enough to decode, manipulate, and re-encode most media formats client-side. This tool uses those APIs (with FFmpeg.wasm as a fallback for less common codecs).
The processing flow
- File is loaded as a binary Uint8Array.
- The container is identified (MP4 = MPEG-4, MKV = Matroska, WebM = WebM) and the codec is read from the container's header metadata (for example, the codec atoms in an MP4 file).
- Frames are decoded into raw audio samples (PCM) or video frames (YCbCr / RGB).
- The requested transformation (trim, convert, resize) is applied frame-by-frame.
- Frames are re-encoded into the output codec and packaged into the output container.
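The container-detection step in the flow above can be illustrated by checking the well-known signature bytes at the start of a file. This is a hedged sketch, not the tool's detector: the `Container` type and `sniffContainer` name are hypothetical, and a real detector would go on to parse the header to find the codec.

```typescript
type Container =
  | "wav" | "flac" | "ogg" | "mp4" | "matroska-or-webm" | "mpeg-audio" | "unknown";

// Read `length` bytes at `offset` as an ASCII string.
function ascii(bytes: Uint8Array, offset: number, length: number): string {
  let s = "";
  for (let i = offset; i < offset + length && i < bytes.length; i++) {
    s += String.fromCharCode(bytes[i]);
  }
  return s;
}

// Guess the container from its leading signature bytes.
function sniffContainer(bytes: Uint8Array): Container {
  if (bytes.length < 12) return "unknown";
  // RIFF header with a WAVE form type.
  if (ascii(bytes, 0, 4) === "RIFF" && ascii(bytes, 8, 4) === "WAVE") return "wav";
  if (ascii(bytes, 0, 4) === "fLaC") return "flac";
  if (ascii(bytes, 0, 4) === "OggS") return "ogg";
  // MP4-family files (MP4, M4A, MOV) begin with a box whose type is "ftyp".
  if (ascii(bytes, 4, 4) === "ftyp") return "mp4";
  // EBML magic number shared by Matroska and WebM.
  if (bytes[0] === 0x1a && bytes[1] === 0x45 && bytes[2] === 0xdf && bytes[3] === 0xa3) {
    return "matroska-or-webm";
  }
  // MP3 with an ID3v2 tag, or a bare MPEG audio frame sync
  // (raw ADTS AAC streams start with a similar sync pattern).
  if (ascii(bytes, 0, 3) === "ID3") return "mpeg-audio";
  if (bytes[0] === 0xff && (bytes[1] & 0xe0) === 0xe0) return "mpeg-audio";
  return "unknown";
}
```

Signature checks like this only identify the wrapper; telling Matroska from WebM, or AAC from ALAC inside an M4A file, requires reading further into the header.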
Common audio/video formats
| Container | Common codecs | Best for |
|---|---|---|
| MP4 | H.264 / H.265 video, AAC audio | Universal compatibility; default for web video |
| WebM | VP9 / AV1 video, Opus audio | Open-source web standard; smaller than MP4 |
| MKV | Any codec (container only) | High-quality archival; not browser-native |
| MOV | ProRes / H.264, PCM / AAC | Apple ecosystem; ProRes for professional editing |
| MP3 | MP3 audio only | Universal audio; lossy |
| WAV | PCM audio (lossless) | Editing source; CD-quality archival |
| FLAC | Lossless compressed audio | Music archival; ~50% of WAV size, perfect quality |
| AAC / M4A | Advanced Audio Coding | iOS default; better quality than MP3 at same bitrate |
Lossy vs lossless
- Lossy (MP3, AAC, Opus, H.264): discards data the human ear/eye can't notice. 80-90% size reduction. Each re-encode loses more quality (generation loss).
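- Lossless (FLAC, WAV, ALAC, FFV1): bit-perfect reproduction. Compressed lossless codecs such as FLAC and ALAC come out at roughly 50% of the raw size; WAV stores uncompressed PCM, so it is the raw size. Each re-encode is identical to the source.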
Bitrate quick reference
| Use case | Audio bitrate | Video bitrate (1080p) |
|---|---|---|
| Voice (phone, podcast) | 32-64 kbps | n/a |
| Music (mid-quality) | 128 kbps MP3 | n/a |
| Music (transparent) | 256-320 kbps MP3 or 128 kbps Opus | n/a |
| Streaming HD | n/a | 5,000-8,000 kbps |
| Streaming 4K | n/a | 15,000-25,000 kbps |
| Archival | FLAC lossless | ProRes 422 or H.265 CRF 18 |
Privacy and offline operation
Every operation in this tool runs client-side using your browser's built-in APIs (Canvas, Web Audio, WebAssembly). No data is uploaded. After the initial page load you can disconnect from the internet and the tool keeps working.
We use Google Analytics and AdSense for the page itself, but neither sees the content of the files you process.
Frequently asked questions
Will merging change the audio quality?
WAV output is lossless. MP3 / M4A output is re-encoded, which adds a small amount of generation loss; choose the highest bitrate (320 kbps) to keep that loss minimal.
What if the inputs have different sample rates?
They are automatically resampled to a common rate, which is the highest sample rate among the source files.
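As a rough illustration of that answer, the sketch below picks the highest source rate and resamples a mono signal with linear interpolation. Linear interpolation is an assumption chosen for brevity; the tool's actual resampler is not documented here, and browsers typically use higher-quality filters. Both function names are hypothetical.

```typescript
// The common rate is the highest rate among the inputs.
function targetSampleRate(rates: number[]): number {
  return Math.max(...rates);
}

// Resample mono PCM by linear interpolation between neighbouring samples.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate || input.length === 0) return input.slice();
  const outLength = Math.round((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.min(Math.floor(pos), input.length - 1);
    const i1 = Math.min(i0 + 1, input.length - 1); // clamp at the final sample
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}
```

For example, with sources at 44,100 Hz and 48,000 Hz, `targetSampleRate` returns 48,000 and only the 44,100 Hz file needs resampling.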
Can I add silence between tracks?
Not directly. The crossfade option only overlaps adjacent tracks; inserting a silent gap is not currently supported. As a workaround, add a short silent audio file between the tracks, or pad the end of a track with silence before merging.
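One way to prepare such a silent segment is to generate a small WAV file yourself. The sketch below builds a 16-bit PCM WAV whose samples are all zero; `makeSilentWav` is a hypothetical helper, not part of this tool.

```typescript
// Build a silent 16-bit PCM WAV file as raw bytes.
function makeSilentWav(seconds: number, sampleRate = 44100, channels = 2): Uint8Array {
  const bytesPerSample = 2; // 16-bit PCM
  const blockAlign = channels * bytesPerSample;
  const dataSize = Math.round(seconds * sampleRate) * blockAlign;
  // ArrayBuffer is zero-filled, so the sample data is already silence.
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };
  writeTag(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeTag(8, "WAVE");
  writeTag(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // bytes per second
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true); // bits per sample
  writeTag(36, "data");
  view.setUint32(40, dataSize, true);
  return new Uint8Array(buffer);
}
```

In a browser, the returned bytes can be wrapped in a `Blob` with type `audio/wav`, saved, and then added to the merge list between two tracks.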
Are my files uploaded anywhere?
No. All processing happens in your browser using WebCodecs / FFmpeg.wasm. Files stay in your tab's memory. Disconnect from the internet after page load and the tool still works.
Why is conversion slow?
Video re-encoding is CPU-intensive. A 1-minute 1080p clip can take 30-90 seconds to encode in the browser - desktop apps with hardware acceleration are 5-10x faster. Use this tool for short clips; for hour-long footage use HandBrake or FFmpeg on your machine.
Will the converted file lose quality?
Yes, slightly, if the source and destination are both lossy formats. Going from H.264 to H.264 at the same bitrate adds a small amount of generation loss. Going from H.264 to a lossless codec preserves the existing quality but doesn't restore what was lost on the first encode.
Can I convert between any two formats?
Most common pairs (MP4 <-> WebM, MP3 <-> AAC, WAV <-> FLAC) work in any modern browser. Exotic codecs (ProRes, FFV1, JPEG 2000) may require FFmpeg.wasm and run slowly.
What's the maximum file size I can process?
Practical limit is your browser's available memory (typically 2-4 GB). 30-minute 1080p videos process fine. 2-hour 4K source files may crash the tab; use a desktop tool for those.
Is the Audio Merge free to use?
Yes. 100% free, no signup, no payment, no API key. The site is funded by display ads around the tool, not inside the merge workflow.
Are my inputs saved anywhere?
No. All inputs stay in your browser tab. Closing the tab discards them. The site uses Google Analytics for traffic measurement (anonymized), but the analytics never see the files you load into the tool.
Can I use the Audio Merge on my phone?
Yes. The tool is responsive and tested on iOS Safari, Android Chrome, and major desktop browsers. Touch targets meet Apple's 44pt and Google's 48dp minimum.
Does the Audio Merge work offline?
Yes. Once the page has loaded, it works without internet. The merge runs entirely on your device.
How do I report a bug or suggest improvement to the Audio Merge?
Email hi@3tej.com with the URL of this page and a description of what you saw vs expected. We typically respond within 72 hours.
Can I share results from the Audio Merge?
Download the merged file and share it as you would any other audio file. The page doesn't generate shareable URLs, because your files stay in your browser only.
