Repositories (34)
A C++ columnar query engine implementing all 22 TPC-H benchmark queries using a custom DSL
A MkDocs plugin that adds a "Copy to LLM" button to your documentation, making it easy to copy code blocks and entire pages in formats optimized for Large Language Models (LLMs).
OpenTofu lets you declaratively manage your cloud infrastructure.
The Sphinx documentation generator
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, downsampling, and interpolation
Recap is a dead simple data catalog for engineers
Databricks SDK for Python (Beta)
🎨 Diagram as Code for prototyping cloud system architectures
All the bufo emojis you could possibly ask for
Magic to help Spark pipelines upgrade
Delta Lake examples
No description provided.
Golang database/sql driver for Databricks SQL.
Databricks SDK for Go
Scala examples for learning to use Spark
Self-contained examples using Apache Spark with the functional features of Java 8
Self-contained examples of Apache Spark streaming integrated with Apache Kafka.
Developing Spark External Data Sources using the V2 API
A native Rust library for Delta Lake, with bindings into Python and Ruby.
Databricks Terraform Provider
No description provided.
Object-Oriented Programming in Python
CLI tool for advanced Databricks jobs management.
🍺 The missing package manager for macOS (or Linux)
No description provided.
A non-validating SQL parser module for Python
Faker is a Python package that generates fake data for you.
Pattern Matching for Python 3.7+ in a simple, yet powerful, extensible manner.
A SQL linter and auto-formatter for Humans