Piyush Dubey
piyushdubey
Coding etc.
Languages
Repos
93
Stars
0
Forks
1
Top Language
Java
Repositories
93
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming.
No description provided.
Tools of the trade
dotfiles for Mac
A comprehensive learning roadmap for mastering the core disciplines necessary for successful solo algorithmic trading. This repository serves as a structured template, guiding users through essential topics in software engineering, data science, machine learning, and finance. It combines certifications, curated resources, and hands-on projects.
Let Claude manage your tastytrade portfolio.
A Model Context Protocol server for building an investor agent
🦜🔗 Build context-aware reasoning applications
Apache Iceberg
No description provided.
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Fun Iceberg Tools
A native Delta implementation for integration with any query engine
Apache Hive
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
DuckDB is an analytical in-process SQL database management system
🍻 Default formulae for the missing package manager for macOS (or Linux)
PyIceberg
Simple Windows desktop application for viewing & querying Apache Parquet files
Virtual whiteboard for sketching hand-drawn like diagrams
My blog space
Experiments with Data Formats [Iceberg, Delta, Hudi]
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Upserts, Deletes And Incremental Processing on Big Data.
No description provided.
No description provided.
A cross platform way to express data transformation, relational algebra, standardized record expression and plans.
Apache Spark - A unified analytics engine for large-scale data processing
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs