Flavio Junqueira
fpj
Distributed Systems, ZooKeeper, BookKeeper, Kafka, @apache. In a previous life: Yahoo! Research, Microsoft Research, @confluentinc, and Dell.
Top Repositories
This is a code example that complements the material in the ZooKeeper O'Reilly book.
Pravega - Streaming as a new software defined storage primitive
A high performance replicated log service.
Mirror of Apache BookKeeper
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
F3: The Open-Source Data File Format for the Future
Repositories (80)
This is a code example that complements the material in the ZooKeeper O'Reilly book.
Mirror of Apache BookKeeper
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
F3: The Open-Source Data File Format for the Future
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
Official Java implementation of Apache Arrow
An extensible, state-of-the-art columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LF AI & Data, part of the Linux Foundation.
TLC is a model checker for specifications written in TLA+. The TLA+ Toolbox is an IDE for TLA+.
A benchmark for serverless analytic databases.
Open Control Plane for Tables in Data Lakehouse
Mirror of Apache Spark
A composable and fully extensible C++ execution engine library for data management systems.
Apache DataFusion SQL Query Engine
Pravega - Streaming as a new software defined storage primitive
Apache Iceberg
LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as Delta Lake, Apache Hudi, and Apache Iceberg.
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Code and YAML files for Getting Started with Kubernetes video course on Pluralsight
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Confluent Platform system tests
MVT (Mobile Verification Toolkit) helps with conducting forensics of mobile devices in order to find signs of a potential compromise.
Parsing and analysis of Vertica, Hive, and Presto SQL.
A high performance replicated log service.