JSONL Format Specification

Complete technical specification for JSON Lines (JSONL) format including MIME types, file extensions, standards, and implementation details.

1. Format Overview

JSON Lines (JSONL) is a text format where each line is a valid JSON object. It's designed for streaming large datasets and processing records one at a time without loading the entire file into memory. For a general introduction to JSONL, see our What is JSONL? guide.

Key Characteristics

  • Each line contains exactly one JSON object
  • Lines are separated by newline characters (\n or \r\n)
  • No trailing commas or array wrapping
  • UTF-8 encoding recommended
  • Streamable and memory-efficient

2. Syntax Specification

Basic Structure

example.jsonl
{"name": "John", "age": 30}
{"name": "Jane", "age": 25}
{"name": "Bob", "age": 35}

Formal Grammar

jsonl-file = *(json-object newline)
json-object = json-value
json-value = object / array / string / number / boolean / null
newline = %x0A / %x0D %x0A

Unix/Linux (LF)

\n (0x0A)

Windows (CRLF)

\r\n (0x0D 0x0A)
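To illustrate the two line endings, here is a minimal sketch that splits a raw byte payload into records and accepts both LF and CRLF. The function name split_records is an assumption made for this example, and the payload is assumed to be UTF-8.

```python
def split_records(payload: bytes) -> list[str]:
    """Split a JSONL payload into record strings, accepting LF or CRLF."""
    text = payload.decode("utf-8")
    records = []
    for line in text.split("\n"):
        # A CRLF ending leaves a trailing "\r" once we split on "\n".
        line = line.rstrip("\r")
        if line:  # the final newline produces one empty trailing piece
            records.append(line)
    return records


unix_payload = b'{"name": "John"}\n{"name": "Jane"}\n'
windows_payload = b'{"name": "John"}\r\n{"name": "Jane"}\r\n'

unix_records = split_records(unix_payload)
windows_records = split_records(windows_payload)
```

The sketch splits on "\n" explicitly rather than calling str.splitlines(), because splitlines() also breaks on characters such as U+2028 that may appear unescaped inside a JSON string.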

3. MIME Types & File Extensions

MIME Type              File Extensions   Description
application/jsonl      .jsonl            Primary MIME type for JSONL files
application/x-jsonl    .jsonl            Alternative MIME type
text/jsonl             .jsonl            Text-based MIME type
application/ndjson     .ndjson           Newline Delimited JSON
application/x-ndjson   .ndjson           Alternative NDJSON MIME type

Recommended Usage

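Use application/jsonl as the primary MIME type and .jsonl as the file extension. To our knowledge none of these MIME types is currently registered with IANA, so consumers should be prepared to accept the alternatives listed in the table as well.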
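As an illustration of the table above, a small helper could map a file extension to a Content-Type header value. This is a sketch only: the name content_type_for, the choice of one MIME type per extension, and the application/octet-stream fallback are assumptions for this example, not part of any standard.

```python
from pathlib import Path

# One MIME type per extension, taken from the table above.
MIME_BY_EXTENSION = {
    ".jsonl": "application/jsonl",
    ".ndjson": "application/ndjson",
}


def content_type_for(filename: str) -> str:
    """Return a Content-Type header value for a JSONL or NDJSON file name."""
    extension = Path(filename).suffix.lower()
    mime = MIME_BY_EXTENSION.get(extension)
    if mime is None:
        # Unknown extension: fall back to a generic binary type.
        return "application/octet-stream"
    return f"{mime}; charset=utf-8"


jsonl_header = content_type_for("events.jsonl")
ndjson_header = content_type_for("EXPORT.NDJSON")
other_header = content_type_for("notes.txt")
```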

4. Standards & RFCs

JSON Standard

  • RFC 8259: The JavaScript Object Notation (JSON) Data Interchange Format (obsoletes RFC 7159)
  • ECMA-404: The JSON Data Interchange Syntax
  • ISO/IEC 21778: Information technology — JSON data interchange format

Line Delimited Formats

  • RFC 7464: JavaScript Object Notation (JSON) Text Sequences
  • NDJSON: Newline Delimited JSON specification
  • JSONL: JSON Lines community standard

Compliance

Each line of a JSONL file must comply with the JSON specification (RFC 8259, which obsoletes RFC 7159), with the additional constraint that each line must be a complete, valid JSON object.

5. Validation Rules

Valid JSONL

valid.jsonl
{"id": 1, "name": "Alice"}
{"id": 2, "name": "Bob"}
{"id": 3, "name": "Charlie"}

Invalid JSONL

invalid.jsonl
{"id": 1, "name": "Alice"},  // Trailing comma
{"id": 2, "name": "Bob"}
[{"id": 3, "name": "Charlie"}]  // Array wrapper

Common Validation Errors

  • Trailing commas after a record or inside a JSON object
  • Array wrapping around the entire file
  • A single JSON object split across multiple lines (for example, pretty-printed records)
  • Invalid JSON syntax in individual lines
  • Missing newline characters between records
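The checks above can be sketched as a small validator that reports the line number and reason for every bad line. The function name validate_jsonl is hypothetical, and two behaviors are assumptions of this sketch: blank lines are skipped silently, and any line that is not a JSON object is reported, following the rule stated in this document.

```python
import json


def validate_jsonl(text: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) pairs for every invalid line."""
    errors = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate CRLF line endings
        if not line.strip():
            continue  # assumption: blank lines are ignored, not reported
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(value, dict):
            errors.append((lineno, f"expected object, got {type(value).__name__}"))
    return errors


sample = (
    '{"id": 1, "name": "Alice"},\n'
    '{"id": 2, "name": "Bob"}\n'
    '[{"id": 3, "name": "Charlie"}]\n'
)
problems = validate_jsonl(sample)
bad_lines = [lineno for lineno, _ in problems]
```

Note that Python's json.loads accepts NaN and Infinity by default, which strict JSON does not, so a stricter validator would need to reject those separately.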

6. Character Encoding

UTF-8 (Recommended)

Full Unicode support, web standard, most compatible

UTF-16

Supported but not recommended for web use

ASCII

Limited character set, not recommended

Encoding Detection

encoding-example
# HTTP Header
Content-Type: application/jsonl; charset=utf-8

# BOM (Byte Order Mark) - avoid when writing; RFC 8259 says not to add one, though parsers may ignore it
EF BB BF {"name": "Test"}
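A reader that wants to tolerate a leading BOM can decode with Python's "utf-8-sig" codec, which behaves like "utf-8" but drops an initial EF BB BF. The sketch below assumes the payload is UTF-8; the function name decode_jsonl_bytes is made up for this example.

```python
import codecs
import json


def decode_jsonl_bytes(payload: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM if one is present."""
    return payload.decode("utf-8-sig")


with_bom = codecs.BOM_UTF8 + b'{"name": "Test"}\n'
without_bom = b'{"name": "Test"}\n'

decoded_with_bom = decode_jsonl_bytes(with_bom)
decoded_without_bom = decode_jsonl_bytes(without_bom)
record = json.loads(decoded_with_bom)
```

Decoding the same bytes with plain "utf-8" would keep the BOM as a U+FEFF character at the start of the first line, and json.loads rejects a string that starts with it.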

7. Examples & Use Cases

Log Files

logs.jsonl
{"timestamp": "2024-01-15T10:30:00Z", "level": "INFO", "message": "User logged in", "userId": 123}
{"timestamp": "2024-01-15T10:31:15Z", "level": "ERROR", "message": "Database connection failed", "error": "Connection timeout"}
{"timestamp": "2024-01-15T10:32:00Z", "level": "INFO", "message": "User logged out", "userId": 123}
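Because every log entry is a self-contained line, a log file like the one above can be filtered one record at a time. The following is a minimal sketch, assuming each record carries a "level" field as in the example; iter_level is a hypothetical helper name, and io.StringIO stands in for an open file.

```python
import io
import json
from typing import Iterable, Iterator


def iter_level(lines: Iterable[str], level: str) -> Iterator[dict]:
    """Yield only the log records whose "level" field matches."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("level") == level:
            yield record


log_file = io.StringIO(
    '{"timestamp": "2024-01-15T10:30:00Z", "level": "INFO", "message": "User logged in", "userId": 123}\n'
    '{"timestamp": "2024-01-15T10:31:15Z", "level": "ERROR", "message": "Database connection failed", "error": "Connection timeout"}\n'
    '{"timestamp": "2024-01-15T10:32:00Z", "level": "INFO", "message": "User logged out", "userId": 123}\n'
)
error_records = list(iter_level(log_file, "ERROR"))
```

Since iter_level is a generator, only one line is parsed and held at a time, so the same code works on a large file passed in place of the StringIO object.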

Data Streaming

events.jsonl
{"event": "page_view", "url": "/products", "userId": 456, "timestamp": 1705312200}
{"event": "add_to_cart", "productId": "ABC123", "quantity": 2, "userId": 456, "timestamp": 1705312250}
{"event": "purchase", "orderId": "ORD789", "total": 99.99, "userId": 456, "timestamp": 1705312300}

Machine Learning Datasets

training-data.jsonl
{"text": "This product is amazing!", "label": "positive", "confidence": 0.95}
{"text": "Terrible quality, would not recommend", "label": "negative", "confidence": 0.88}
{"text": "It's okay, nothing special", "label": "neutral", "confidence": 0.72}

8. Implementation Guidelines

Reading JSONL

Python

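import json

def process(record):
    print(record)  # placeholder; replace with your own handling

with open('data.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        data = json.loads(line)
        process(data)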

JavaScript

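const fs = require('fs');
const readline = require('readline');

// Placeholder handler; replace with your own logic.
// It is not named `process` because Node's global `process` object is not a function.
function handleRecord(record) {
  console.log(record);
}

const fileStream = fs.createReadStream('data.jsonl', { encoding: 'utf8' });
const rl = readline.createInterface({
  input: fileStream,
  crlfDelay: Infinity // treat \r\n as a single line break
});

rl.on('line', (line) => {
  if (line.trim() === '') return; // skip blank lines
  const data = JSON.parse(line);
  handleRecord(data);
});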

Writing JSONL

Python

import json

data = [{"name": "Alice"}, {"name": "Bob"}]

# Write one JSON object per line, each terminated by a newline
with open('output.jsonl', 'w', encoding='utf-8') as f:
    for record in data:
        f.write(json.dumps(record) + '\n')

JavaScript

const fs = require('fs');

const data = [{"name": "Alice"}, {"name": "Bob"}];

// Terminate every record, including the last one, with a newline
const jsonl = data
  .map(record => JSON.stringify(record) + '\n')
  .join('');

fs.writeFileSync('output.jsonl', jsonl, 'utf8');

Best Practices

  • Always validate each line as valid JSON before processing
  • Use streaming for large files to avoid memory issues
  • Handle encoding properly (UTF-8 recommended)
  • Include proper error handling for malformed lines
  • Consider compression for large datasets

For more detailed best practices, see our JSONL Best Practices guide.
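As one way to combine the streaming and compression recommendations, the sketch below writes and reads gzip-compressed JSONL with Python's standard gzip module. The helper names write_jsonl_gz and read_jsonl_gz and the .jsonl.gz file name are assumptions for this example.

```python
import gzip
import json
import os
import tempfile
from typing import Iterable, Iterator


def write_jsonl_gz(path: str, records: Iterable[dict]) -> None:
    """Write records as gzip-compressed JSONL, one object per line."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl_gz(path: str) -> Iterator[dict]:
    """Stream records back without decompressing the whole file at once."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield json.loads(line)


# Round-trip two records through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.jsonl.gz")
    write_jsonl_gz(path, [{"name": "Alice"}, {"name": "Bob"}])
    round_tripped = list(read_jsonl_gz(path))
```

Opening the file in text mode ("wt" and "rt") lets gzip handle the UTF-8 encoding, so the rest of the code is the same as for an uncompressed file.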

This specification is based on community standards and RFC 8259 (JSON), which obsoletes RFC 7159.