rsonpath - SIMD-powered JSONPath
Experimental JSONPath engine for querying massive streamed datasets.
The rsonpath crate provides a JSONPath parser and a query execution engine rq,
which utilizes SIMD instructions to provide massive throughput improvements over conventional engines.
Benchmarks of rsonpath against a reference no-SIMD engine on the
Pison dataset. NOTE: Scale is logarithmic!
Usage
To run a JSONPath query on a file, execute:
$ rq '$..a.b' ./file.json
If the file is omitted, the engine reads standard input. JSON can also be passed inline:
$ rq '$..a.b' --json '{"c":{"a":{"b":42}}}'
42
For details, consult rq --help or the rsonbook.
Results
The result of running a query is a sequence of matched values, delimited by newlines.
Alternatively, passing --result count returns only the number of matches, which might be much faster.
For other result modes consult the --help usage page.
Installation
See Releases for precompiled binaries for
all first-class support targets.
cargo
The easiest way to install is via cargo.
$ cargo install rsonpath
...
Native CPU optimizations
If maximum speed is paramount, you should install rsonpath with native CPU instructions support.
This will result in a binary that is not portable and might work incorrectly on any other machine,
but will squeeze out every last bit of throughput.
To do this, run the following cargo install variant:
$ RUSTFLAGS="-C target-cpu=native" cargo install rsonpath
...
Check out the relevant chapter in the rsonbook.
Query language
The project is actively developed and currently supports only a subset of the JSONPath query language.
A query is a sequence of segments, each containing one or more selectors.
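As a rough illustration of "a sequence of segments, each containing one or more selectors", the sketch below models the query `$..a.b` with plain Rust enums. The type and function names (`Selector`, `Segment`, `example_query`) are hypothetical and chosen only for this example; they are not the types exposed by the rsonpath crates, and the root `$` is left implicit.

```rust
/// Hypothetical selector model, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Selector {
    /// Selects a member by name, e.g. `.a` or `['a']`.
    Name(String),
    /// Selects every child, e.g. `.*` or `[*]`.
    Wildcard,
    /// Selects an array element by non-negative index, e.g. `[3]`.
    Index(u64),
}

/// Hypothetical segment model: each segment holds one or more selectors.
#[derive(Debug, Clone, PartialEq)]
pub enum Segment {
    /// Applies its selectors to direct children.
    Child(Vec<Selector>),
    /// Applies its selectors to all descendants.
    Descendant(Vec<Selector>),
}

/// Builds the model of `$..a.b`: a descendant segment followed by a child segment.
pub fn example_query() -> Vec<Segment> {
    vec![
        Segment::Descendant(vec![Selector::Name("a".to_string())]),
        Segment::Child(vec![Selector::Name("b".to_string())]),
    ]
}
```

In this model, a "multiple" segment from the table below would simply carry more than one entry in its selector vector.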
Supported segments
| Segment | Syntax | Supported | Since | Tracking Issue |
|---|---|---|---|---|
| Child segment (single) | [<selector>] | ✔️ | v0.1.0 | |
| Child segment (multiple) | [<selector1>,...,<selectorN>] | ❌ | | |
| Descendant segment (single) | ..[<selector>] | ✔️ | v0.1.0 | |
| Descendant segment (multiple) | ..[<selector1>,...,<selectorN>] | ❌ | | |
Supported selectors
| Selector | Syntax | Supported | Since | Tracking Issue |
|---|---|---|---|---|
| Root | $ | ✔️ | v0.1.0 | |
| Name | .<member>, [<member>] | ✔️ | v0.1.0 | |
| Wildcard | .*, ..*, [*] | ✔️ | v0.4.0 | |
| Index (array index) | [<index>] | ✔️ | v0.5.0 | |
| Index (array index from end) | [-<index>] | ❌ | | |
| Array slice (forward, positive bounds) | [<start>:<end>:<step>] | ✔️ | v0.9.0 | #152 |
| Array slice (forward, arbitrary bounds) | [<start>:<end>:<step>] | ❌ | | |
| Array slice (backward, arbitrary bounds) | [<start>:<end>:-<step>] | ❌ | | |
| Filters - existential tests | [?<path>] | ❌ | | #154 |
| Filters - const atom comparisons | [?<path> <binop> <atom>] | ❌ | | #156 |
| Filters - logical expressions | &&, \|\|, ! | ❌ | | |
| Filters - nesting | [?<expr>[?<expr>]...] | ❌ | | |
| Filters - arbitrary comparisons | [?<path> <binop> <path>] | ❌ | | |
| Filters - function extensions | [?func(<path>)] | ❌ | | |
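To make the "forward, positive bounds" slice row concrete, here is a small sketch of which indices `[<start>:<end>:<step>]` selects under the usual JSONPath slice semantics when all three values are non-negative. The function name `forward_slice_indices` is made up for this example, and the code is not taken from the engine.

```rust
/// Indices selected by a forward slice `[start:end:step]` with non-negative
/// bounds on an array of length `len`. Illustrative only.
pub fn forward_slice_indices(len: usize, start: usize, end: usize, step: usize) -> Vec<usize> {
    if step == 0 {
        // A step of zero selects nothing.
        return Vec::new();
    }
    // The end bound is exclusive and is clamped to the array length.
    // If `start` is past the clamped end, the range is empty.
    (start..end.min(len)).step_by(step).collect()
}
```

For an array of five elements, `[1:4:2]` would select indices 1 and 3, and an `end` past the array length is treated as the array length.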
Supported platforms
The crate is continuously built and tested for all Tier 1 Rust targets.
Pre-built binaries are also available for some Tier 2 targets, but without testing.
Currently, these are MUSL targets; if you require other binaries, create an issue.
SIMD is available on x86 and ARM (64-bit) platforms.
| Target triple | nosimd build | SIMD support | Continuous testing | Tracking issues |
|---|---|---|---|---|
| aarch64-apple-darwin | ✔️ | ✔️ | ✔️ | |
| aarch64-pc-windows-msvc | ✔️ | ✔️ | ✔️ | |
| aarch64-unknown-linux-gnu | ✔️ | ✔️ | ✔️ | |
| i686-pc-windows-msvc | ✔️ | ✔️ | ✔️ | |
| i686-unknown-linux-gnu | ✔️ | ✔️ | ✔️ | |
| x86_64-pc-windows-gnu | ✔️ | ✔️ | ✔️ | |
| x86_64-pc-windows-msvc | ✔️ | ✔️ | ✔️ | |
| x86_64-unknown-linux-gnu | ✔️ | ✔️ | ✔️ | |
| aarch64-unknown-linux-musl | ✔️ | ✔️ | ❌ | |
| i686-unknown-linux-musl | ✔️ | ✔️ | ❌ | |
| x86_64-unknown-linux-musl | ✔️ | ✔️ | ❌ | |
SIMD support
SIMD support is enabled on a module-by-module basis. Generally, any CPU released in the past
decade supports AVX2, which enables all available optimizations. On ARM, we support NEON.
Older CPUs with SSE2 or higher get partial support. You can check what exactly is enabled
with rq --version; check the SIMD support field:
$ rq --version
rq 0.9.1
Commit SHA: c024e1bab89610455537b77aed249d2a05a81ed6
Features: default,simd
Opt level: 3
Target triple: x86_64-unknown-linux-gnu
Codegen flags: link-arg=-fuse-ld=lld
SIMD support: avx2;fast_quotes;fast_popcnt
The fast_quotes capability depends on the pclmulqdq instruction (on x86) or the aes feature (ARM),
and fast_popcnt on the popcnt instruction (always available on ARM).
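If you want to check independently which of the CPU features named above your machine reports, the Rust standard library has runtime detection macros. The sketch below uses only `std`; the function name `detected_features` is made up for this example, and this is not necessarily how rq itself decides what to enable.

```rust
/// Returns the subset of CPU features mentioned in this section that the
/// current machine reports at runtime. Illustrative only.
pub fn detected_features() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut found = Vec::new();

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("sse2") {
            found.push("sse2");
        }
        if std::is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
        // Needed for the fast_quotes capability on x86.
        if std::is_x86_feature_detected!("pclmulqdq") {
            found.push("pclmulqdq");
        }
        // Needed for the fast_popcnt capability on x86.
        if std::is_x86_feature_detected!("popcnt") {
            found.push("popcnt");
        }
    }

    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            found.push("neon");
        }
        // Needed for the fast_quotes capability on ARM.
        if std::arch::is_aarch64_feature_detected!("aes") {
            found.push("aes");
        }
    }

    found
}
```

On other architectures the function returns an empty list, which corresponds to the nosimd build column in the table above.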
Caveats and limitations
JSONPath
Not all selectors are supported; see the support tables above.
Duplicate keys
The engine assumes that every object in the input JSON has no duplicate keys.
Behavior on duplicate keys is not guaranteed to be stable, but currently
the engine will simply match the first such key.
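The difference between "first key wins" and the "last key wins" behavior that map-based JSON parsers often show can be sketched with plain Rust collections. This is only an analogy over a list of key-value pairs, with made-up function names; it does not use or reproduce the engine's code.

```rust
use std::collections::HashMap;

/// Returns the value of the first member with the given key,
/// mirroring the current first-match behavior described above.
pub fn first_match<'a>(members: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    members.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

/// Collects members into a map first; later duplicates overwrite earlier ones,
/// so the last member with the given key wins.
pub fn last_wins<'a>(members: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    let map: HashMap<&str, &str> = members.iter().copied().collect();
    map.get(key).copied()
}
```

For the members `[("key", "value"), ("key", "other value")]`, `first_match` yields `"value"` while `last_wins` yields `"other value"`, which is why results on such inputs can differ between tools.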
$ rq '$.key' --json '{"key":"value","key":"other value"}'
"value"
Unicode
The engine does not parse unicode escape sequences in member names.
This means that a key "a" is different from a key "\u0061", even though semantically they represent the same string.
This is actually as-designed with respect to the current JSONPath spec.
Parsing unicode sequences is costly, so the support for this was postponed
in favour of high performance. This is tracked as #117.
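The effect can be seen without the engine: an escaped and an unescaped spelling of a key differ as raw text and only become equal after decoding. The sketch below is a deliberately minimal decoder written for this example (the name `decode_unicode_escapes` is hypothetical); it handles only `\uXXXX` escapes outside the surrogate range and ignores other JSON escapes, including an escaped backslash before `u`.

```rust
/// Decodes `\uXXXX` escapes in `raw`, copying everything else unchanged.
/// Returns `None` on a truncated or invalid escape. Illustrative only.
pub fn decode_unicode_escapes(raw: &str) -> Option<String> {
    let mut out = String::new();
    let mut rest = raw;
    while let Some(pos) = rest.find("\\u") {
        // Copy the text before the escape verbatim.
        out.push_str(&rest[..pos]);
        // Read exactly four hex digits after `\u`.
        let hex = rest.get(pos + 2..pos + 6)?;
        let code = u32::from_str_radix(hex, 16).ok()?;
        // Surrogate code points are rejected here; pairs are not combined.
        out.push(char::from_u32(code)?);
        rest = &rest[pos + 6..];
    }
    out.push_str(rest);
    Some(out)
}
```

Comparing the raw spellings `\u0061` and `a` byte for byte reports a mismatch, which is the comparison the engine currently performs; comparing after `decode_unicode_escapes` reports a match, at the cost of the extra decoding pass.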
Contributing
The gist is: fork, implement, make a PR back here. More details are in the CONTRIBUTING doc.
Build & test
The dev workflow utilizes just.
Use the included Justfile. It will automatically install Rust for you using the rustup tool if it detects there is no Cargo in your environment.
$ just build
...
$ just test
...
Benchmarks
Benchmarks for rsonpath are located in the benchmark crate of this repository.
The easiest way to run all the benchmarks is just bench within the directory crates/rsonpath-benchmarks. For details, look at the README in this directory.
Background
We have a paper on rsonpath to be published at ASPLOS '24! You can read it
here.
This project was conceived as my thesis. You can read it for details on the theoretical
background on the engine and details of its implementation.
We also have a short talk from ASPLOS 2024 about rsonpath!
https://gienieczko.com/asplos-2024-talk.mp4
(excuse the audio quality; the sound in the source video was corrupted and we had to salvage it)
Dependencies
Showing direct dependencies, for full graph see below.
cargo tree --package rsonpath --edges normal --depth 1
rsonpath v0.10.0 (/home/mat/src/rsonpath/crates/rsonpath)
├── clap v4.5.58
├── color-eyre v0.6.5
├── eyre v0.6.12
├── log v0.4.29
├── rsonpath-lib v0.10.0 (/home/mat/src/rsonpath/crates/rsonpath-lib)
├── rsonpath-syntax v0.4.0 (/home/mat/src/rsonpath/crates/rsonpath-syntax)
└── simple_logger v5.1.0
[build-dependencies]
├── rustflags v0.1.7
├── vergen v9.1.0
│   [build-dependencies]
└── vergen-git2 v9.1.0
    [build-dependencies]
cargo tree --package rsonpath-lib --edges normal --depth 1
rsonpath-lib v0.10.0 (/home/mat/src/rsonpath/crates/rsonpath-lib)
├── cfg-if v1.0.4
├── log v0.4.29
├── memmap2 v0.9.10
├── rsonpath-syntax v0.4.0 (/home/mat/src/rsonpath/crates/rsonpath-syntax)
├── serde v1.0.228
├── smallvec v1.15.1
├── static_assertions v1.1.0
├── thiserror v2.0.18
└── vector-map v1.0.2
Justification
- clap - standard crate to provide the CLI.
- color-eyre, eyre - more accessible error messages for the parser.
- log, simple-logger - diagnostic logs during compilation and execution.
- cfg-if - used to support SIMD and no-SIMD versions.
- memmap2 - for fast reading of source files via a memory map instead of buffered copies.
- nom - for parser implementation.
- smallvec - crucial for small-stack performance.
- static_assertions - additional reliability by some constant assumptions validated at compile time.
- thiserror - idiomatic Error implementations.
- vector_map - used in the query compiler for measurably better performance.
Full dependency tree
cargo tree --package rsonpath --edges normal
rsonpath v0.10.0 (/home/mat/src/rsonpath/crates/rsonpath)
├── clap v4.5.58
│   ├── clap_builder v4.5.58
│   │   ├── anstream v0.6.21
│   │   │   ├── anstyle v1.0.13
│   │   │   ├── anstyle-parse v0.2.7
│   │   │   │   └── utf8parse v0.2.2
│   │   │   ├── anstyle-query v1.1.5
│   │   │   │   └── windows-sys v0.61.2
│   │   │   │       └── windows-link v0.2.1
│   │   │   ├── anstyle-wincon v3.0.11
│   │   │   │   ├── anstyle v1.0.13
│   │   │   │   ├── once_cell_polyfill v1.70.2
│   │   │   │   └── windows-sys v0.61.2 (*)
│   │   │   ├── colorchoice v1.0.4
│   │   │   ├── is_terminal_polyfill v1.70.2
│   │   │   └── utf8parse v0.2.2
│   │   ├── anstyle v1.0.13
│   │   ├── clap_lex v1.0.0
│   │   ├── strsim v0.11.1
│   │   └── terminal_size v0.4.3
│   │       ├── rustix v1.1.3
│   │       │   ├── bitflags v2.11.0
│   │       │   ├── errno v0.3.14
│   │       │   │   ├── libc v0.2.182
│   │       │   │   └── windows-sys v0.61.2 (*)
│   │       │   ├── libc v0.2.182
│   │       │   ├── linux-raw-sys v0.11.0
│   │       │   └── windows-sys v0.61.2 (*)
│   │       └── windows-sys v0.60.2
│   │           └── windows-targets v0.53.5
│   │               ├── windows-link v0.2.1
│   │               ├── windows_aarch64_gnullvm v0.53.1
│   │               ├── windows_aarch64_msvc v0.53.1
│   │               ├── windows_i686_gnu v0.53.1
│   │               ├── windows_i686_gnullvm v0.53.1
│   │               ├── windows_i686_msvc v0.53.1
│   │               ├── windows_x86_64_gnu v0.53.1
│   │               ├── windows_x86_64_gnullvm v0.53.1
│   │               └── windows_x86_64_msvc v0.53.1
│   └── clap_derive v4.5.55 (proc-macro)
│       ├── heck v0.5.0
│       ├── proc-macro2 v1.0.106
│       │   └── unicode-ident v1.0.23
│       ├── quote v1.0.44
│       │   └── proc-macro2 v1.0.106 (*)
│       └── syn v2.0.116
│           ├── proc-macro2 v1.0.106 (*)
│           ├── quote v1.0.44 (*)
│           └── unicode-ident v1.0.23
├── color-eyre v0.6.5
│   ├── backtrace v0.3.76
│   │   ├── addr2line v0.25.1
│   │   │   └── gimli v0.32.3
│   │   ├── cfg-if v1.0.4
│   │   ├── libc v0.2.182
│   │   ├── miniz_oxide v0.8.9
│   │   │   └── adler2 v2.0.1
│   │   ├── object v0.37.3
│   │   │   └── memchr v2.8.0
│   │   ├── rustc-demangle v0.1.27
│   │   └── windows-link v0.2.1
│   ├── eyre v0.6.12
│   │   ├── indenter v0.3.4
│   │   └── once_cell v1.21.3
│   ├── indenter v0.3.4
│   ├── once_cell v1.21.3
│   └── owo-colors v4.2.3
├── eyre v0.6.12 (*)
├── log v0.4.29
├── rsonpath-lib v0.10.0 (/home/mat/src/rsonpath/crates/rsonpath-lib)
│   ├── cfg-if v1.0.4
│   ├── log v0.4.29
│   ├── memmap2 v0.9.10
│   │   └── libc v0.2.182
│   ├── rsonpath-syntax v0.4.0 (/home/mat/src/rsonpath/crates/rsonpath-syntax)
│   │   ├── nom v8.0.0
│   │   │   └── memchr v2.8.0
│   │   ├── owo-colors v4.2.3
│   │   ├── thiserror v2.0.18
│   │   │   └── thiserror-impl v2.0.18 (proc-macro)
│   │   │       ├── proc-macro2 v1.0.106 (*)
│   │   │       ├── quote v1.0.44 (*)
│   │   │       └── syn v2.0.116 (*)
│   │   └── unicode-width v0.2.2
│   ├── smallvec v1.15.1
│   ├── static_assertions v1.1.0
│   ├── thiserror v2.0.18 (*)
│   └── vector-map v1.0.2
├── rsonpath-syntax v0.4.0 (/home/mat/src/rsonpath/crates/rsonpath-syntax) (*)
└── simple_logger v5.1.0
    ├── colored v3.1.1
    │   └── windows-sys v0.61.2 (*)
    ├── log v0.4.29
    ├── time v0.3.47
    │   ├── deranged v0.5.6
    │   │   └── powerfmt v0.2.0
    │   ├── itoa v1.0.17
    │   ├── libc v0.2.182
    │   ├── num-conv v0.2.0
    │   ├── num_threads v0.1.7
    │   │   └── libc v0.2.182
    │   ├── powerfmt v0.2.0
    │   ├── time-core v0.1.8
    │   └── time-macros v0.2.27 (proc-macro)
    │       ├── num-conv v0.2.0
    │       └── time-core v0.1.8
    └── windows-sys v0.61.2 (*)
[build-dependencies]
├── rustflags v0.1.7
├── vergen v9.1.0
│   ├── anyhow v1.0.101
│   ├── cargo_metadata v0.23.1
│   │   ├── camino v1.2.2
│   │   │   ├── serde_core v1.0.228
│   │   │   └── serde_derive v1.0.228 (proc-macro)
│   │   │       ├── proc-macro2 v1.0.106 (*)
│   │   │       ├── quote v1.0.44 (*)
│   │   │       └── syn v2.0.116 (*)
│   │   ├── cargo-platform v0.3.2
│   │   │   ├── serde v1.0.228
│   │   │   │   ├── serde_core v1.0.228 (*)
│   │   │   │   └── serde_derive v1.0.228 (proc-macro) (*)
│   │   │   └── serde_core v1.0.228 (*)
│   │   ├── semver v1.0.27
│   │   │   ├── serde v1.0.228 (*)
│   │   │   └── serde_core v1.0.228 (*)
│   │   ├── serde v1.0.228 (*)
│   │   ├── serde_json v1.0.149
│   │   │   ├── itoa v1.0.17
│   │   │   ├── memchr v2.8.0
│   │   │   ├── serde v1.0.228 (*)
│   │   │   ├── serde_core v1.0.228 (*)
│   │   │   └── zmij v1.0.21
│   │   └── thiserror v2.0.18 (*)
│   ├── derive_builder v0.20.2
│   │   └── derive_builder_macro v0.20.2 (proc-macro)
│   │       ├── derive_builder_core v0.20.2
│   │       │   ├── darling v0.20.11
│   │       │   │   ├── darling_core v0.20.11
│   │       │   │   │   ├── fnv v1.0.7
│   │       │   │   │   ├── ident_case v1.0.1
│   │       │   │   │   ├── proc-macro2 v1.0.106 (*)
│   │       │   │   │   ├── quote v1.0.44 (*)
│   │       │   │   │   ├── strsim v0.11.1
│   │       │   │   │   └── syn v2.0.116 (*)
│   │       │   │   └── darling_macro v0.20.11 (proc-macro)
│   │       │   │       ├── darling_core v0.20.11 (*)
│   │       │   │       ├── quote v1.0.44 (*)
│   │       │   │       └── syn v2.0.116 (*)
│   │       │   ├── proc-macro2 v1.0.106 (*)
│   │       │   ├── quote v1.0.44 (*)
│   │       │   └── syn v2.0.116 (*)
│   │       └── syn v2.0.116 (*)
│   ├── regex v1.12.3
│   │   ├── aho-corasick v1.1.4
│   │   │   └── memchr v2.8.0
│   │   ├── memchr v2.8.0
│   │   ├── regex-automata v0.4.14
│   │   │   ├── aho-corasick v1.1.4 (*)
│   │   │   ├── memchr v2.8.0
│   │   │   └── regex-syntax v0.8.9
│   │   └── regex-syntax v0.8.9
│   ├── rustc_version v0.4.1
│   │   └── semver v1.0.27 (*)
│   └── vergen-lib v9.1.0
│       ├── anyhow v1.0.101
│       └── derive_builder v0.20.2 (*)
│       [build-dependencies]
│       └── rustversion v1.0.22 (proc-macro)
│   [build-dependencies]
│   └── rustversion v1.0.22 (proc-macro)
└── vergen-git2 v9.1.0
    ├── anyhow v1.0.101
    ├── derive_builder v0.20.2 (*)
    ├── git2 v0.20.4
    │   ├── bitflags v2.11.0
    │   ├── libc v0.2.182
    │   ├── libgit2-sys v0.18.3+1.9.2
    │   │   ├── libc v0.2.182
    │   │   └── libz-sys v1.1.23
    │   │       └── libc v0.2.182
    │   │       [build-dependencies]
    │   │       ├── cc v1.2.56
    │   │       │   ├── find-msvc-tools v0.1.9
    │   │       │   ├── jobserver v0.1.34
    │   │       │   │   ├── getrandom v0.3.4
    │   │       │   │   │   ├── cfg-if v1.0.4
    │   │       │   │   │   ├── libc v0.2.182
    │   │       │   │   │   ├── r-efi v5.3.0
    │   │       │   │   │   └── wasip2 v1.0.2+wasi-0.2.9
    │   │       │   │   │       └── wit-bindgen v0.51.0
    │   │       │   │   └── libc v0.2.182
    │   │       │   ├── libc v0.2.182
    │   │       │   └── shlex v1.3.0
    │   │       ├── pkg-config v0.3.32
    │   │       └── vcpkg v0.2.15
    │   │   [build-dependencies]
    │   │   ├── cc v1.2.56 (*)
    │   │   └── pkg-config v0.3.32
    │   ├── log v0.4.29
    │   └── url v2.5.8
    │       ├── form_urlencoded v1.2.2
    │       │   └── percent-encoding v2.3.2
    │       ├── idna v1.1.0
    │       │   ├── idna_adapter v1.2.1
    │       │   │   ├── icu_normalizer v2.1.1
    │       │   │   │   ├── icu_collections v2.1.1
    │       │   │   │   │   ├── displaydoc v0.2.5 (proc-macro)
    │       │   │   │   │   │   ├── proc-macro2 v1.0.106 (*)
    │       │   │   │   │   │   ├── quote v1.0.44 (*)
    │       │   │   │   │   │   └── syn v2.0.116 (*)
    │       │   │   │   │   ├── potential_utf v0.1.4
    │       │   │   │   │   │   └── zerovec v0.11.5
    │       │   │   │   │   │       ├── yoke v0.8.1
    │       │   │   │   │   │       │   ├── stable_deref_trait v1.2.1
    │       │   │   │   │   │       │   ├── yoke-derive v0.8.1 (proc-macro)
    │       │   │   │   │   │       │   │   ├── proc-macro2 v1.0.106 (*)
    │       │   │   │   │   │       │   │   ├── quote v1.0.44 (*)
    │       │   │   │   │   │       │   │   ├── syn v2.0.116 (*)
    │       │   │   │   │   │       │   │   └── synstructure v0.13.2
    │       │   │   │   │   │       │   │       ├── proc-macro2 v1.0.106 (*)
    │       │   │   │   │   │       │   │       ├── quote v1.0.44 (*)
    │       │   │   │   │   │       │   │       └── syn v2.0.116 (*)
    │       │   │   │   │   │       │   └── zerofrom v0.1.6
    │       │   │   │   │   │       │       └── zerofrom-derive v0.1.6 (proc-macro)
    │       │   │   │   │   │       │           ├── proc-macro2 v1.0.106 (*)
    │       │   │   │   │   │       │           ├── quote v1.0.44 (*)
    │       │   │   │   │   │       │           ├── syn v2.0.116 (*)
    │       │   │   │   │   │       │           └── synstructure v0.13.2 (*)
    │       │   │   │   │   │       ├── zerofrom v0.1.6 (*)
    │       │   │   │   │   │       └── zerovec-derive v0.11.2 (proc-macro)
    │       │   │   │   │   │           ├── proc-macro2 v1.0.106 (*)
    │       │   │   │   │   │           ├── quote v1.0.44 (*)
    │       │   │   │   │   │           └── syn v2.0.116 (*)
    │       │   │   │   │   ├── yoke v0.8.1 (*)
    │       │   │   │   │   ├── zerofrom v0.1.6 (*)
    │       │   │   │   │   └── zerovec v0.11.5 (*)
    │       │   │   │   ├── icu_normalizer_data v2.1.1
    │       │   │   │   ├── icu_provider v2.1.1
    │       │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
    │       │   │   │   │   ├── icu_locale_core v2.1.1
    │       │   │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
    │       │   │   │   │   │   ├── litemap v0.8.1
    │       │   │   │   │   │   ├── tinystr v0.8.2
    │       │   │   │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
    │       │   │   │   │   │   │   └── zerovec v0.11.5 (*)
    │       │   │   │   │   │   ├── writeable v0.6.2
    │       │   │   │   │   │   └── zerovec v0.11.5 (*)
    │       │   │   │   │   ├── writeable v0.6.2
    │       │   │   │   │   ├── yoke v0.8.1 (*)
    │       │   │   │   │   ├── zerofrom v0.1.6 (*)
    │       │   │   │   │   ├── zerotrie v0.2.3
    │       │   │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
    │       │   │   │   │   │   ├── yoke v0.8.1 (*)
    │       │   │   │   │   │   └── zerofrom v0.1.6 (*)
    │       │   │   │   │   └── zerovec v0.11.5 (*)
    │       │   │   │   ├── smallvec v1.15.1
    │       │   │   │   └── zerovec v0.11.5 (*)
    │       │   │   └── icu_properties v2.1.2
    │       │   │       ├── icu_collections v2.1.1 (*)
    │       │   │       ├── icu_locale_core v2.1.1 (*)
    │       │   │       ├── icu_properties_data v2.1.2
    │       │   │       ├── icu_provider v2.1.1 (*)
    │       │   │       ├── zerotrie v0.2.3 (*)
    │       │   │       └── zerovec v0.11.5 (*)
    │       │   ├── smallvec v1.15.1
    │       │   └── utf8_iter v1.0.4
    │       └── percent-encoding v2.3.2
    ├── time v0.3.47
    │   ├── deranged v0.5.6 (*)
    │   ├── itoa v1.0.17
    │   ├── libc v0.2.182
    │   ├── num-conv v0.2.0
    │   ├── num_threads v0.1.7 (*)
    │   ├── powerfmt v0.2.0
    │   └── time-core v0.1.8
    ├── vergen v9.1.0 (*)
    └── vergen-lib v9.1.0 (*)
    [build-dependencies]
    └── rustversion v1.0.22 (proc-macro)