JSONite: High-Performance Embedded Database for Semi-Structured Data


The JSON Performance Crisis

JSON is Everywhere:

  • Web APIs, IoT, logs, configurations
  • Semi-structured, flexible, human-readable

But Current Solutions Fail:

  • Large Databases: MongoDB and PostgreSQL's JSONB work, but require a separate database server
  • Embedded Databases: RocksDB and PoloDB lack ACID and SQL support
  • Serialization to String: JSON is serialized to a string and stored in SQLite

Serialized JSON with SQL example

insert into http_request_log (ip, headers)
values ('127.0.0.1', '{
    "Content-Type": "application/octet-stream",
    "X-Forwarded-For": "100.64.0.1"
}');
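
To make the cost of this approach concrete, here is a minimal Python sketch using the standard sqlite3 module. The table and column names come from the example above; the two-TEXT-column schema and the in-memory database are assumptions for illustration. Reading a single header means fetching and parsing the whole serialized document.

```python
import json
import sqlite3

# In-memory database; the two-TEXT-column schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE http_request_log (ip TEXT, headers TEXT)")

headers = {
    "Content-Type": "application/octet-stream",
    "X-Forwarded-For": "100.64.0.1",
}

# The JSON document is stored as an opaque string.
conn.execute(
    "INSERT INTO http_request_log (ip, headers) VALUES (?, ?)",
    ("127.0.0.1", json.dumps(headers)),
)

# To read one field, the application fetches and parses the entire string.
(raw,) = conn.execute(
    "SELECT headers FROM http_request_log WHERE ip = ?", ("127.0.0.1",)
).fetchone()
forwarded_for = json.loads(raw)["X-Forwarded-For"]
print(forwarded_for)
```

The database sees only a string here, so it cannot index or partially read the document on its own.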

Introducing JSONite

Best of Both Worlds:

  • SQLite-based
  • Native JSON optimization

Key Advantages:

  • ACID compliance
  • SQL simplicity
  • Serverless C library
  • Lightning-fast JSON access

Smart Key Optimization

Key Sorting by Length:

{
  "id": 1,
  "address": {...},
  "name": "John",
  "email": "john@example.com"
}

Sorted as:

{
  "id"       (2 chars)
  "name"     (4 chars)
  "email"    (5 chars)
  "address"  (7 chars)
}

Binary search on length → Fast lookups
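
A minimal Python sketch of the idea, not JSONite's actual on-disk layout: keys are ordered by length first, so most comparisons during the binary search are cheap integer comparisons. Breaking ties between equal-length keys by their text is an assumption, and `build_index` and `find_key` are hypothetical names.

```python
from bisect import bisect_left


def build_index(keys):
    """Return (length, key) pairs sorted by length, then by key text."""
    return sorted((len(k), k) for k in keys)


def find_key(index, target):
    """Binary-search the index; return the position of target, or -1 if absent."""
    probe = (len(target), target)
    # Tuples compare by length first; key text is compared only on a length tie.
    i = bisect_left(index, probe)
    if i < len(index) and index[i] == probe:
        return i
    return -1


index = build_index(["id", "address", "name", "email"])
print([k for _, k in index])     # ['id', 'name', 'email', 'address']
print(find_key(index, "email"))  # 2
print(find_key(index, "phone"))  # -1
```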


Handling Massive Data: Smart TOAST

The Oversized-Attribute Storage Technique

  • Standard approach: arbitrary chunking
  • JSONite's innovation: Data-Type Aware TOAST

Intelligent Chunking:

  • Arrays split between elements
  • Objects split between key-value pairs
  • Text falls back to fixed chunks
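
The chunking rules above can be sketched in Python as follows. This is an illustration under assumptions, not JSONite's implementation: `chunk_value` and `_pack` are hypothetical names, size is measured as the length of each item's JSON encoding (characters for text), and an item larger than the limit is kept whole in its own chunk.

```python
import json


def _pack(items, max_size, rebuild):
    """Group items into chunks without ever splitting a single item."""
    chunks, current, used = [], [], 0
    for item in items:
        size = len(json.dumps(item))
        # Start a new chunk when this item would overflow the current one.
        if current and used + size > max_size:
            chunks.append(rebuild(current))
            current, used = [], 0
        current.append(item)
        used += size
    if current:
        chunks.append(rebuild(current))
    return chunks


def chunk_value(value, max_size):
    """Split a JSON value into chunks, cutting only at structural boundaries."""
    if isinstance(value, list):
        return _pack(value, max_size, list)            # cut between elements
    if isinstance(value, dict):
        return _pack(list(value.items()), max_size, dict)  # cut between pairs
    if isinstance(value, str) and value:
        # Text has no internal structure, so fall back to fixed-size pieces.
        return [value[i:i + max_size] for i in range(0, len(value), max_size)]
    return [value]


print(chunk_value(list(range(10)), 4))
print(chunk_value({"a": 1, "b": 2, "c": 3}, 10))
print(chunk_value("abcdefghij", 4))
```

Because every chunk of an array or object is itself a valid array or object, a reader can decode one chunk without seeing its neighbours.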

Enables "Slice Detoasting":

  • $.logs[1000000:1000010] fetches only 10 elements
  • Not the entire multi-gigabyte array

Smart Chunking Example

{
    "id": 1,
    "title": "some text",
    "html": <pointer to TOAST of 200k text>,
    "photos": [<pointer to TOAST of binary data>],
    "crawl_logs": [<pointer to TOAST of array of texts>]
}

Query Power

Full SQL + JSON Support:

  • PostgreSQL-compatible JSONB path operators
  • GIN indexes for instant search

SELECT *
FROM accounts
WHERE data @> '{"status": "active"}';
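
For readers unfamiliar with the `@>` containment operator, the following Python sketch models its meaning in simplified form, following PostgreSQL's documented rules for objects, arrays, and scalars. The function name `contains` is hypothetical, the top-level special case where an array contains a bare scalar is omitted, and whether JSONite matches PostgreSQL in every edge case is not stated in this deck.

```python
def contains(doc, pattern):
    """Simplified model of JSONB containment: does doc @> pattern hold?"""
    if isinstance(pattern, dict):
        # Every key in the pattern must exist in doc with a contained value.
        return isinstance(doc, dict) and all(
            key in doc and contains(doc[key], value)
            for key, value in pattern.items()
        )
    if isinstance(pattern, list):
        # Every pattern element must be contained in some element of doc.
        return isinstance(doc, list) and all(
            any(contains(item, wanted) for item in doc)
            for wanted in pattern
        )
    # Scalars must match exactly; keep JSON booleans distinct from numbers.
    return isinstance(doc, bool) == isinstance(pattern, bool) and doc == pattern


account = {"id": 7, "status": "active", "tags": ["a", "b"]}
print(contains(account, {"status": "active"}))  # True
print(contains(account, {"status": "closed"}))  # False
print(contains(account, {"tags": ["b"]}))       # True
```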

Performance Validation: Benchmark Datasets

Three Specialized Workloads:

  1. YCSB-Style Read Benchmark

    • Yahoo! Cloud Serving Benchmark
    • 1M JSON documents (1 KB to 100 KB each)

  2. TPC-C Inspired Update Benchmark

    • Transaction Processing Performance Council
    • 100K transactional JSON records
    • Frequent small field updates

  3. Large-Array Slice Benchmark

    • Multi-gigabyte JSON documents
    • Massive arrays (10M+ elements)

Comparison Targets: SQLite JSONB vs MongoDB vs PostgreSQL vs JSONite


JSONite: The Future of Embedded Data Storage

Why It Matters Today:

  • Edge Computing: Lightweight, handles sensor data efficiently
  • Modern Apps: SQL power + JSON flexibility, no schema migrations

The Vision:

  • Open source implementation
  • Community-driven development
  • Becoming the default choice for embedded JSON storage
  • Bridging SQL reliability with NoSQL flexibility

Thank You

Questions?

CHEN Yongyuan
2025-11-01