150 results for “topic:data-catalog”
The Metadata Platform for your Data and AI Stack
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Intake is a lightweight package for finding, investigating, loading and disseminating data.
📙 Awesome Data Catalogs and Observability Platforms.
🐳 The stupidly simple CLI workspace for your data warehouse.
Marmot helps teams discover, understand, and leverage their data with powerful search and lineage visualisation tools. It's designed to make data accessible for everyone.
Work with your web service, database, and streaming schemas in a single format.
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and DataHub.
Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog.
An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
The GenAI-powered toolkit for automated data intelligence.
Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data.
Reference Architectures for Datalakes on AWS
Sample code with integration between Data Catalog and RDBMS data sources.
End-to-end DataOps platform deployed by Terraform.
The Google Earth Engine data catalog in CSV format
Supercharged Replication for Developers
Registry of data portals, catalogs, and data repositories, including a dataset of data catalogs and a catalog description standard
Data catalog for everything in your company
The documentation repository is part of the Corporate Linked Data Catalog - short: COLID - application.
National Data Archive (NADA) is an open source data cataloging system that serves as a portal for researchers to browse, search, compare, apply for access, and download relevant census or survey information. It was originally developed to support the establishment of national survey data archives.
Open-source metadata collector based on ODD Specification
Open-source agentic schema CLI. Optimised for Claude Code, Gemini, Codex and Copilot. Skills included.
A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, DataHub, ODD and the like.
Sample code with integration between Data Catalog and BI data sources.