Yes. It is free with no signup, no rate limit, and no paywall on advanced features.

Does it store my inputs?

No. Every calculation runs in your browser using JavaScript. The numbers and text you type never leave your device; we do not log or transmit form values.

Where do the formulas come from?

Formulas come from the canonical reference for the domain (the relevant tax authority, standards body, or peer-reviewed source). See /methodology for the full sourcing process and per-page citations.

What is Remove Duplicate Lines?

Remove Duplicate Lines strips exact-duplicate lines from a pasted list while keeping each first occurrence in its original position. It returns the result instantly, without sending any data to a server.

Remove Duplicate Lines

Strip exact duplicate lines from any list. Keeps the first occurrence in its original position.

Browser-only · Instant · Free forever · Works offline · No signup

TLDR

Walks the input line by line. Each line is kept the first time it appears; later duplicates are dropped. Order of first occurrence is preserved.

Runs entirely in your browser. No upload, no signup, no logging. Output is for personal or commercial use; we don't claim any rights.

How to use this tool

Paste your list. Drop in the email list, URL list, tag list, or any line-per-item content.
Press Dedupe. Duplicates disappear; first occurrences stay in their original positions.
Copy the unique list. Paste back into your spreadsheet, mailing tool, or wherever you need a clean list.
Lowercase first for case-insensitive dedupe. If 'apple' and 'Apple' should match, run through the lowercase converter before deduping.
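
If lowercasing the whole list is too blunt because you want to keep the spelling of the first occurrence, one option is to compare on a lowercased key while emitting the untouched line. This is a minimal sketch, not the tool's own code; the function name dedupeIgnoreCase is hypothetical.

```javascript
// Hypothetical helper: case-insensitive dedupe that keeps the first
// occurrence exactly as it was typed.
function dedupeIgnoreCase(input) {
  const seen = new Set();
  const output = [];
  for (const line of input.split(/\r?\n/)) {
    const key = line.toLowerCase(); // compare on the lowercased form only
    if (!seen.has(key)) {
      seen.add(key);
      output.push(line); // emit the original spelling
    }
  }
  return output.join('\n');
}

console.log(dedupeIgnoreCase('Apple\napple\nBanana\nAPPLE'));
```

Here 'Apple' and 'Banana' survive, because 'apple' and 'APPLE' share a lowercased key with the first line.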

Real-world scenarios where this tool helps

Email list cleaning

Paste a list of addresses from multiple sources and get a unique list out.

URL deduplication

Combine link lists from several pages and dedupe before crawling.

Tag list cleanup

Tag exports pasted from a CMS often contain near-duplicates; this tool catches the exact ones.

Log triage

Reduce a noisy log to its distinct lines for pattern review.

Vocabulary lists

Build a unique word list by pasting word-per-line input and deduplicating. Counts are not kept, so tally frequencies before deduping if you need them.

What this tool does

Splits the input on LF and CRLF line endings.
Keeps each unique line the first time it appears.
Drops every later occurrence of an exact match (case-sensitive, whitespace-sensitive).
Preserves the original order of first occurrences.
Joins the survivors back with LF.

What it does NOT do

Does not match case-insensitively. 'apple' and 'Apple' are different lines.
Does not normalize whitespace. 'apple ' (trailing space) and 'apple' are different lines.
Does not sort the output - order matches first-appearance order.
Does not detect fuzzy duplicates - typos slip through.
Does not save anything.

About duplicate line removal

Deduplicating a list sits behind a surprising amount of everyday plumbing: SQL SELECT DISTINCT, Unix sort -u, Python set(), and spreadsheet "Remove duplicates" all do the same job at different layers. This page is the browser-only version. You paste a list, the tool keeps one copy of each unique line in the order it first appeared, and the output is yours to copy. No upload, no API call, no length cap.

The dedupe key is the raw line string. Two lines are the same only if every character matches exactly, including case, accents, leading and trailing whitespace, and zero-width characters. That is deliberate: loose matching is destructive, so we leave it as a separate step the user controls.

How the algorithm works

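function dedupeLines(input) {
    const seen = new Set(), output = [];         // lines already kept, and the result
    for (const line of input.split(/\r?\n/)) {   // split on LF or CRLF
        if (!seen.has(line)) {                   // first occurrence only
            seen.add(line); output.push(line);
        }
    }
    return output.join('\n');                    // rejoin with LF
}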

Cost is O(n) time and O(k) memory, where n is input lines and k is unique lines. Set lookup is amortised constant time because modern engines back Set with a hash table. A 100,000 line list dedupes in well under a second; the bottleneck is textarea repaint, not the matching loop.
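
To check the timing claim on your own machine, a rough sketch like the one below builds a synthetic 100,000 line input and times the matching loop. The data shape (1,000 distinct values, each repeated) and the variable names are assumptions for illustration; the elapsed time will vary by device and excludes any textarea repaint.

```javascript
// Build 100,000 lines containing 1,000 distinct values.
const benchLines = [];
for (let i = 0; i < 100000; i++) {
  benchLines.push('item-' + (i % 1000));
}
const benchInput = benchLines.join('\n');

// Time only the split-and-match loop.
const benchStart = Date.now();
const benchSeen = new Set();
const benchOutput = [];
for (const line of benchInput.split(/\r?\n/)) {
  if (!benchSeen.has(line)) {
    benchSeen.add(line);
    benchOutput.push(line);
  }
}
const benchElapsedMs = Date.now() - benchStart;

console.log(benchOutput.length + ' unique lines in ' + benchElapsedMs + ' ms');
```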

Worked example

Input (10 lines, mixed case, 6 unique):

alice@example.com
bob@example.com
alice@example.com
carol@example.com
Alice@example.com
bob@example.com
dave@example.com
alice@example.com
eve@example.com
carol@example.com

Line 1 is new, kept. Line 2 is new, kept. Line 3 matches line 1 byte for byte, dropped. Line 4 is new, kept. Line 5 (Alice with uppercase A) differs by one byte and is kept as a separate row. Lines 6, 8 and 10 match earlier ones and are dropped. Lines 7 and 9 are new.

Result (6 lines): alice@example.com, bob@example.com, carol@example.com, Alice@example.com, dave@example.com, eve@example.com. The case-different Alice survived. Lowercase the list first if that is wrong for your use case.
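
The same walk-through can be reproduced in a few lines of JavaScript. This is a sketch of the exact-match loop applied to the ten example lines; the names uniqueLinesOf, sample and unique are illustrative, not taken from the tool's source.

```javascript
// Exact-match, order-preserving dedupe returning an array of lines.
function uniqueLinesOf(input) {
  const seen = new Set();
  const output = [];
  for (const line of input.split(/\r?\n/)) {
    if (!seen.has(line)) {
      seen.add(line);
      output.push(line);
    }
  }
  return output;
}

const sample = [
  'alice@example.com',
  'bob@example.com',
  'alice@example.com',
  'carol@example.com',
  'Alice@example.com',
  'bob@example.com',
  'dave@example.com',
  'alice@example.com',
  'eve@example.com',
  'carol@example.com',
].join('\n');

const unique = uniqueLinesOf(sample);
console.log(unique.length); // 6
console.log(unique.join(', '));
```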

Dedupe options across common tools

Tool	Syntax	Case	Order
This page	paste and click	sensitive	preserved
Unix sort	`sort -u file`	sensitive	sorted (locale collation)
awk one-liner	`awk '!s[$0]++'`	sensitive	preserved
Python	`list(dict.fromkeys(x))`	sensitive	preserved (3.7+)
Excel	Data, Remove Duplicates	insensitive default	preserved
SQL	`SELECT DISTINCT col`	collation dep.	not guaranteed
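
For comparison, here is a JavaScript counterpart to the one-liners in the table. A Set iterates in insertion order, so spreading one back into an array gives a case-sensitive, order-preserving dedupe. The variable names are illustrative.

```javascript
const tableLines = ['b', 'a', 'b', 'A', 'a'];

// A Set keeps only the first insertion of each value and iterates in
// insertion order, so the spread preserves first-occurrence order.
const tableUnique = [...new Set(tableLines)];

console.log(tableUnique); // [ 'b', 'a', 'A' ]
```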

Common mistakes and pitfalls

Assuming case-insensitive dedupe. Lowercase your list first if case should not matter. For email addresses, the domain is case-insensitive and the local part is technically case-sensitive under the SMTP specification, although most mail providers treat it as case-insensitive in practice; pick a rule before deduping.
Ignoring trailing whitespace. Spreadsheet paste, Word copy, and many CMS exports carry trailing spaces that are invisible to the eye but break exact-match dedupe. Trim each line first if survivors look duplicated.
Treating dedupe as deduplication-by-meaning. New York City, NYC, and New York, NY are three different strings and all survive. Aliases need a lookup table, not a hash set.
Pasting CSV rows and treating them as lines. A quoted field with an embedded newline is one logical row but multiple physical lines, so the row gets split and duplicates pass through.
Mixing Unicode normalisation forms. e + combining acute accent (NFD) and é precomposed (NFC) look identical but hash differently. If two sources contribute the same word and the tool keeps both, run them through a normaliser.
Forgetting the order rule. First occurrence wins. If you sort your input before pasting, the kept copy will be whichever sorted first. If you pasted in chronological order, the earliest event wins.
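
The Unicode normalisation pitfall above is easy to check for yourself. The sketch below compares a decomposed and a precomposed spelling of the same word, then shows a possible pre-pass that normalises every line to NFC before deduping. The helper name normaliseLines is hypothetical; String.prototype.normalize is a standard JavaScript method.

```javascript
const decomposed = 'cafe\u0301';  // "e" followed by a combining acute accent (NFD)
const precomposed = 'caf\u00e9';  // single precomposed "é" (NFC)

console.log(decomposed === precomposed);                  // false
console.log(decomposed.normalize('NFC') === precomposed); // true

// Hypothetical pre-pass: normalise every line to NFC before deduping.
function normaliseLines(input) {
  return input
    .split(/\r?\n/)
    .map((line) => line.normalize('NFC'))
    .join('\n');
}
```

After this pre-pass the two spellings become identical strings, so an exact-match dedupe keeps only one.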

Frequently asked questions

Is the match case-sensitive?

Yes. The dedupe key is the literal line string, so 'apple' and 'Apple' are different lines. If you want case-insensitive dedupe, run the input through a lowercase converter first, then paste the result back here.

What about trailing whitespace?

It counts. 'apple ' (with a trailing space) and 'apple' are different lines. Spreadsheet exports often add invisible trailing spaces, so trim first if your duplicates are not getting caught.
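
If you would rather trim in code than by hand, a pre-pass along these lines is one way to do it. It is a sketch only; trimLineEnds is a hypothetical name, and swapping trimEnd() for trim() would also strip leading whitespace.

```javascript
// Hypothetical pre-pass: strip trailing whitespace from each line
// so that 'apple ' and 'apple' become the same string.
function trimLineEnds(input) {
  return input
    .split(/\r?\n/)
    .map((line) => line.trimEnd())
    .join('\n');
}

console.log(JSON.stringify(trimLineEnds('apple \napple\t\napple')));
// "apple\napple\napple"
```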

Does it sort the output?

No. The tool preserves the order of first occurrence. The line that appears earliest in the input survives; later duplicates are dropped. To sort the result, paste into a spreadsheet column or use a follow-up sort tool.
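
If you prefer to sort in code, a follow-up step could look like the sketch below. It uses the default Array sort, which orders by UTF-16 code unit, so uppercase letters come before lowercase ones. The name sortLines is hypothetical; passing a comparator such as (a, b) => a.localeCompare(b) would give a locale-aware order instead.

```javascript
// Hypothetical follow-up step: sort already-deduped lines.
function sortLines(input) {
  return input.split(/\r?\n/).sort().join('\n');
}

console.log(JSON.stringify(sortLines('pear\nApple\nfig')));
// "Apple\nfig\npear"
```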

Will it work on very long lists?

Yes. The algorithm is O(n) with a hash set: it touches each line once and looks up duplicates in constant time. Lists of 100,000+ lines run in well under a second on a modern laptop, limited mostly by textarea paint speed in the browser.

Can I dedupe by partial match?

No. This tool matches whole lines exactly. For fuzzy or partial matches (collapsing 'New York City' and 'NYC' to one row, or matching by the first 50 characters), use a script with regex or a fuzzy-match library.
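
As a starting point for the prefix case mentioned above, the sketch below treats two lines as duplicates when their first few characters match. The function name dedupeByPrefix and its prefixLength parameter are hypothetical, and the length is counted in UTF-16 code units, so it is only a rough fit for text containing emoji or other astral characters.

```javascript
// Hypothetical helper: lines count as duplicates when their first
// prefixLength UTF-16 code units are identical. The first full line wins.
function dedupeByPrefix(input, prefixLength) {
  const seen = new Set();
  const output = [];
  for (const line of input.split(/\r?\n/)) {
    const key = line.slice(0, prefixLength);
    if (!seen.has(key)) {
      seen.add(key);
      output.push(line);
    }
  }
  return output.join('\n');
}

console.log(dedupeByPrefix('report-2024-a\nreport-2024-b\nreport-2025-a', 11));
```

With a prefix length of 11, the first two lines share the key 'report-2024', so only 'report-2024-a' and 'report-2025-a' remain.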

What if my list has CSV-style embedded newlines?

Embedded newlines inside quoted CSV fields will break the line-based logic, because each physical line break is treated as a row boundary. Parse with a real CSV reader (Excel, Google Sheets, Python csv module) first, dedupe in the parsed form, then export back.
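
To see why the split matters, the sketch below separates CSV text into records while treating line breaks inside double-quoted fields as part of the record. It is an illustration under simple assumptions (double quotes as the only quoting character, "" as the escaped quote) and not a replacement for a real CSV reader; the name splitCsvRecords is hypothetical.

```javascript
// Hypothetical helper: split CSV text into records, keeping line breaks
// that fall inside double-quoted fields.
function splitCsvRecords(text) {
  const records = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '"') {
      inQuotes = !inQuotes; // an escaped "" toggles twice, leaving the state unchanged
      current += ch;
    } else if (!inQuotes && (ch === '\n' || ch === '\r')) {
      if (ch === '\r' && text[i + 1] === '\n') i++; // treat CRLF as one break
      records.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current !== '') records.push(current);
  return records;
}

const csv = '1,"line one\nline two"\n2,plain\n1,"line one\nline two"';
const rows = splitCsvRecords(csv);

console.log(csv.split(/\r?\n/).length); // 5 physical lines
console.log(rows.length);               // 3 logical records
console.log([...new Set(rows)].length); // 2 unique records
```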

Will it handle Unicode and emoji correctly?

Yes. The matcher uses JavaScript string equality, which compares UTF-16 code units exactly, so two lines match only when their underlying character sequences are identical. Emoji, accented characters, and non-Latin scripts (Cyrillic, CJK, Arabic, Devanagari) all dedupe correctly under that rule. Normalisation forms (NFC vs NFD) can produce visually identical strings that compare as different, so paste from one source when possible, or normalise the list first.