About the CSV-to-JSON converter
The converter parses delimited text (comma, tab, semicolon, or pipe) following the RFC 4180 grammar and emits a JSON array in which each row becomes either an object keyed by header names or an array of column values. It runs entirely in the browser, so spreadsheets containing personal or commercial data never leave the device.
CSV is the lowest-common-denominator interchange format. Nearly every spreadsheet, database, and ETL pipeline reads it. JSON is the native structure for modern web APIs, NoSQL stores, and JavaScript front ends. Converting one to the other is a routine data-prep step before posting to a REST endpoint, importing into MongoDB or DynamoDB, seeding a fixture file, or feeding a JavaScript chart library. This tool gives you a deterministic, RFC 4180-compliant translation that you can paste straight into Postman, a curl body, or a fixture file with no escaping surprises.
How it works
The parser walks the input character by character, tracking whether the current position is inside a quoted field. Outside a quote, delimiter and newline characters split fields and rows; inside a quote, they are treated as literal data until a closing quote is seen. A doubled quote inside a quoted field becomes a single quote in the output. This single-pass state machine is what differentiates an RFC 4180 parser from a naive line.split(',') approach that silently corrupts any value containing a comma.
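The state machine described above can be sketched in a few dozen lines. This is a minimal illustration, not the tool's actual source: the function name parseCsv, its parameters, and the handling of stray quotes in unquoted fields are assumptions made for the example.

```javascript
// Minimal sketch of a quote-aware, single-pass CSV parser.
// parseCsv and its parameter names are illustrative, not the tool's actual API.
function parseCsv(text, delimiter = ",") {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  let dirty = false; // true once the current record has consumed any character

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    dirty = true;

    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // doubled quote becomes one literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch; // delimiters and newlines are literal data here
      }
    } else if (ch === '"' && field === "") {
      inQuotes = true; // opening quote at the start of a field
    } else if (ch === delimiter) {
      row.push(field);
      field = "";
    } else if (ch === "\r" || ch === "\n") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one terminator
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
      dirty = false;
    } else {
      field += ch;
    }
  }

  // Flush the last record when the input does not end with a row terminator.
  if (dirty) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}
```

For comparison, '"Smith, John",42'.split(",") returns three pieces ('"Smith', ' John"', and '42'), whereas parseCsv('"Smith, John",42') returns one row with the two fields "Smith, John" and "42".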
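When the Auto delimiter is chosen, the script counts comma, semicolon, tab, and pipe occurrences in the first line and picks the most frequent. This is a simple frequency heuristic; more elaborate detectors, such as Python's csv.Sniffer, sample several lines and check that the delimiter count is consistent across them. A first line whose quoted fields contain many delimiter characters can mislead a frequency count, so choose the delimiter explicitly if detection looks wrong. The output writer then emits JSON.stringify with a 2-space indent, a common pretty-print convention for human-readable JSON. Whitespace between tokens is insignificant in JSON, so any conforming parser accepts the indented output.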
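A first-line frequency count of this kind might look like the sketch below. The name detectDelimiter, the candidate order, and the tie-breaking rule (first candidate wins, comma as fallback) are assumptions for illustration; the sketch counts raw characters and does not skip quoted fields.

```javascript
// Illustrative sketch: pick the candidate delimiter that occurs most often in the first line.
// detectDelimiter is a hypothetical name; the tool's own function may differ.
function detectDelimiter(text, candidates = [",", ";", "\t", "|"]) {
  // Take everything up to the first line break (CRLF, LF, or CR).
  const firstLine = text.split(/\r\n|\n|\r/, 1)[0];
  let best = candidates[0];
  let bestCount = 0;
  for (const candidate of candidates) {
    const count = firstLine.split(candidate).length - 1;
    if (count > bestCount) {
      best = candidate;
      bestCount = count;
    }
  }
  return best; // falls back to the first candidate (comma) when none occurs
}

// Pretty-printing with a 2-space indent uses the third argument of JSON.stringify.
const pretty = JSON.stringify([{ a: "1" }], null, 2);
```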
RFC 4180 grammar (informal):
file = record *(CRLF record) [CRLF]
record = field *(COMMA field)
field = (escaped / non-escaped)
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
non-escaped = *TEXTDATA
Output:
if header: [{col1: v, col2: v, ...}, ...]
else: [[v, v, v, ...], ...]
- Header mode: first row becomes object keys for every subsequent row.
- Array mode: every row becomes an array, preserving original column order.
- Quote escaping: "" inside a quoted field produces a single " character.
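The two output shapes can be sketched as follows, assuming the rows have already been parsed into arrays of strings. rowsToJson is a hypothetical helper, and its treatment of short and long rows (missing keys, dropped surplus) is one possible choice rather than confirmed tool behavior.

```javascript
// Sketch of the two output shapes; rowsToJson is a hypothetical helper.
// `rows` is an array of string arrays, as a quote-aware parser would return.
function rowsToJson(rows, hasHeader) {
  if (!hasHeader) {
    return JSON.stringify(rows, null, 2); // array mode: rows pass through unchanged
  }
  // Header mode: the first row supplies the keys for every later row.
  const [header = [], ...records] = rows;
  const objects = records.map((record) =>
    Object.fromEntries(header.map((key, i) => [key, record[i]]))
  );
  return JSON.stringify(objects, null, 2);
}
```

Because keys are taken from the header by position, column order in each object follows the header row.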
Worked example
Convert this CSV with header to JSON:
name,age,city
"Smith, John",42,Boston
"Doe, Jane",37,"New York, NY"
"O""Connor",29,Dublin
- Detect header: first row reads as ["name", "age", "city"].
- Parse row 2: quoted "Smith, John" preserved; 42 and Boston unquoted.
- Parse row 3: the name and city fields are quoted because each contains a comma; 37 is unquoted.
- Parse row 4: doubled "" inside the quoted field unescapes to a single ".
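Assuming header mode and the string-only output described on this page, the steps above should produce the value below. It is written out by hand here rather than generated by the tool, and the variable names are illustrative.

```javascript
// Expected result of the worked example, written out by hand
// (header mode, all values kept as strings).
const expected = [
  { name: "Smith, John", age: "42", city: "Boston" },
  { name: "Doe, Jane", age: "37", city: "New York, NY" },
  { name: 'O"Connor', age: "29", city: "Dublin" },
];

// Pretty-print with a 2-space indent; the embedded quote is serialized as \" in JSON.
const json = JSON.stringify(expected, null, 2);
console.log(json);
```

Note that the CSV escape ("" inside a quoted field) and the JSON escape (\") are different encodings of the same single quote character.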
RFC 4180 standard reference
| Rule | Specification | Example |
|---|---|---|
| Row terminator | CRLF (\r\n) | Lines end with \r\n (LF-only is widely accepted) |
| Field separator | Comma (,) | a,b,c |
| Optional header | First line if present | name,age,city |
| Quoted field | Wrap in double quotes | "value with, comma" |
| Embedded comma | Must be inside quoted field | "Smith, John" |
| Embedded newline | Must be inside quoted field | "line 1\r\nline 2" |
| Embedded quote | Escape by doubling | "He said ""hi""" |
| Whitespace | Significant; preserved verbatim | " a "," b " keeps surrounding spaces |
| Empty field | Empty string between separators | a,,c (middle is empty) |
| Character set | US-ASCII baseline; UTF-8 common | UTF-8 files may start with a byte-order mark (BOM) |
Common pitfalls
- Mixed line endings. CSV files saved on Windows use \r\n; macOS and Linux use \n; old Macs used \r. The parser accepts any of the three, but a downstream consumer might not.
- Locale-specific delimiters. European exports often use semicolon as the delimiter because the comma is the decimal separator. Choose the matching delimiter from the dropdown; if a semicolon file is parsed with comma, a decimal such as 3,14 is split into two fields.
- Numeric coercion. The output keeps all values as strings to preserve leading zeros, ZIP codes, and phone numbers. If you need real numbers, post-process the JSON.
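- UTF-8 BOM. Some Excel exports prepend a byte-order mark to UTF-8 files. If you do not strip it, the first column name might appear as "\uFEFFname": it looks like "name" but begins with an invisible U+FEFF character, so lookups by the plain key fail.
- Inconsistent column counts. A row with fewer columns than the header produces undefined values, which JSON.stringify omits, so those keys are missing from that row's object; a row with extra columns drops the surplus (or shifts keys). Validate row width before pasting large files.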
- Embedded newlines in spreadsheets. Excel and Google Sheets can save cells with line breaks inside quoted fields. The parser handles them; a custom regex split usually does not.
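Several of these pitfalls can be handled with small helpers before or after conversion. The functions below are hypothetical sketches, not part of the tool: stripBom removes a leading byte-order mark, findRaggedRows reports rows whose width differs from the header, and coerceNumber is a deliberately conservative post-processing step that leaves anything with a leading zero as a string.

```javascript
// Hypothetical pre- and post-processing helpers; none of these names come from the tool.

// Remove a leading byte-order mark (U+FEFF once decoded) so it does not leak into the first key.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Return the 1-based numbers of data rows whose width differs from the header row's.
// rows[0] is taken to be the header, so rows[1] is data row 1.
function findRaggedRows(rows) {
  const width = rows.length > 0 ? rows[0].length : 0;
  const bad = [];
  rows.forEach((row, index) => {
    if (index > 0 && row.length !== width) bad.push(index);
  });
  return bad;
}

// Convert a string to a number only when it is a plain decimal without a leading zero,
// so values such as "02134" (a ZIP code) stay strings.
function coerceNumber(value) {
  return /^-?(0|[1-9]\d*)(\.\d+)?$/.test(value) ? Number(value) : value;
}
```

Very long digit strings, such as account numbers beyond 15 or 16 digits, lose precision when converted to a JavaScript number, so it may be safer to apply coerceNumber only to columns known to be numeric.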
Frequently asked questions
Does the converter follow RFC 4180?
Yes. The parser implements the RFC 4180 grammar: fields separated by commas, rows terminated by CRLF (optional after the last row), fields containing commas or newlines wrapped in double quotes, and embedded double quotes escaped by doubling. It also accepts LF-only line endings and preserves whitespace around fields, which the standard treats as part of the field.
What if my CSV has no header row?
Turn off the 'header' option. The tool then outputs each row as a JSON array instead of an object, preserving column order. If you want objects anyway, add a synthetic header row (col1,col2,...) as the first line of the input and leave the header option on.
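Adding a synthetic header by hand is tedious for wide files, so it can be generated. The helper below is a hypothetical sketch: addSyntheticHeader is not a tool feature, and the caller is assumed to know the column count.

```javascript
// Hypothetical helper: prepend a synthetic header row (col1, col2, ...) to headerless CSV text.
function addSyntheticHeader(csvText, columnCount, delimiter = ",") {
  const keys = Array.from({ length: columnCount }, (_, i) => `col${i + 1}`);
  return keys.join(delimiter) + "\n" + csvText;
}
```

For example, addSyntheticHeader("1,2,3", 3) returns the text "col1,col2,col3" followed by a newline and the original row.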
How does it handle commas inside a field?
A field that contains commas, newlines, or double quotes must be wrapped in double quotes per RFC 4180. The parser respects that wrapping: "Smith, John",42 produces the two values ["Smith, John", "42"], not three columns.
Can I use semicolon or tab as a delimiter?
Yes. The delimiter dropdown accepts comma (default), semicolon (common in European locales where comma is the decimal separator), tab (TSV), and pipe. The grammar is unchanged; only the field separator differs.
Sources
- IETF RFC 4180, Common Format and MIME Type for Comma-Separated Values (CSV) Files (October 2005).
- ECMA-404, The JSON Data Interchange Syntax.
- Microsoft Excel and Google Sheets CSV export documentation.
