339 results for “topic:data-ingestion”
SeaTunnel is a multimodal, high-performance, distributed, massive data integration tool.
ingestr is a CLI tool to seamlessly copy data between any two databases with a single command.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Concurrent and multi-stage data ingestion and data processing with Elixir
Pravega - Streaming as a new software-defined storage primitive
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
Copy to/from Parquet in S3, Azure Blob Storage, Google Cloud Storage, http(s) stores, local files, or the standard input stream from within PostgreSQL
The Supabase of the AI era. A modular, open-source backend for building AI-native software - designed for knowledge, not static data.
Orbital automates integration between data sources (APIs, Databases, Queues and Functions): BFFs, API composition, and ETL pipelines that adapt as your specs change.
Use SQL to build ELT pipelines on a data lakehouse.
Apache Paimon Rust: the Rust implementation of Apache Paimon.
The Data Engineering Book: a data engineering book written in Thai, by Thais, for Thais.
Apache Spark examples exclusively in Java
Build complete API integrations with YAML and SQL. Rapid development without vendor lock-in and per-row costs.
Claude Skills for connecting Claude.ai to local Weaviate vector databases - manage collections, ingest data, and query with RAG
Enables custom tracing of Java applications in Dynatrace
Sample code for the AWS Big Data Blog post "Building a scalable streaming data processor with Amazon Kinesis Data Streams on AWS Fargate"
Download and warehouse historical trading data
OpenKit Java Reference Implementation
The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and egress.
Enables custom tracing of Python applications in Dynatrace
Oracle AI Data Platform Workbench Samples
Describes technical concepts of Dynatrace OneAgent SDK
Enables custom tracing of .NET applications in Dynatrace
Enables custom tracing of Node.js applications in Dynatrace
Convert documents and images to high-quality Markdown using Vision LLMs. Built for RAG ingestion pipelines.
A robust, configuration-driven ETL and data import framework for Laravel. Handles CSV/Excel streaming, queues, validation, and relationships.