19 results for “topic:cloud-data-warehouse”
A hybrid data pipeline architecture combining Microsoft Fabric, Azure Cloud, and Power BI. The pipeline is engineered for real-time data ingestion, multi-layered processing, and analytics, delivering business-critical insights.
Data engineering practice, including building data pipelines (ELT) from a variety of sources.
Building an ETL pipeline that extracts data from S3 and stages it in Redshift.
This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.
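The staging-to-star-schema step typically boils down to an `INSERT ... SELECT` that joins staging tables into a fact table. A minimal sketch, with all table and column names hypothetical rather than taken from the actual project:

```python
# Hypothetical INSERT ... SELECT populating a fact table from Redshift
# staging tables; songplays, staging_events, and staging_songs are
# illustrative names, not the project's actual schema.
songplay_insert = """
INSERT INTO songplays (start_time, user_id, song_id, artist_id, session_id)
SELECT e.ts, e.user_id, s.song_id, s.artist_id, e.session_id
FROM staging_events e
JOIN staging_songs s
  ON e.song_title = s.title AND e.artist_name = s.artist_name
WHERE e.page = 'NextSong';
"""

print(songplay_insert.strip())
```

Dimension tables follow the same pattern, each selecting a distinct slice of the staging data.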
Summary/Notes of Snowflake cloud data warehouse. (Complete ✅)
Hands-on project covering Snowflake data loading with custom file formats, validation modes, error handling, string length limits, TRUNCATECOLUMNS, and analyzing load history using account_usage.load_history.
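For illustration, a load of this shape might pair `TRUNCATECOLUMNS` with a query against the account-usage view; stage, table, and file-format names below are hypothetical:

```python
# TRUNCATECOLUMNS = TRUE silently trims strings that exceed the target
# column's declared length instead of failing the load.
copy_sql = """
COPY INTO customers
FROM @my_stage/customers.csv
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
TRUNCATECOLUMNS = TRUE;
"""

# Load history for the table from the shared ACCOUNT_USAGE schema.
history_sql = """
SELECT table_name, status, row_count, error_count
FROM snowflake.account_usage.load_history
WHERE table_name = 'CUSTOMERS'
ORDER BY last_load_time DESC;
"""

print(copy_sql.strip())
print(history_sql.strip())
```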
Hands-on project showcasing Snowflake data loading with error handling using VALIDATION_MODE, ON_ERROR = CONTINUE, ON_ERROR = SKIP_FILE, and ON_ERROR = SKIP_FILE_% while ingesting CSV files from AWS S3.
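The ON_ERROR policies differ only in the clause appended to the COPY statement, so they can be parameterized. A sketch, with hypothetical table and stage names:

```python
def copy_with_on_error(table: str, stage_path: str, on_error: str) -> str:
    """Build a COPY INTO statement with the given ON_ERROR policy.

    Policies include CONTINUE (load good rows, skip bad ones),
    SKIP_FILE (drop the whole file on any error), SKIP_FILE_<n>
    (drop a file after n errors), and ABORT_STATEMENT (the default).
    """
    return (
        f"COPY INTO {table} FROM {stage_path} "
        f"FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1) "
        f"ON_ERROR = {on_error};"
    )

for policy in ("CONTINUE", "SKIP_FILE", "SKIP_FILE_5"):
    print(copy_with_on_error("orders", "@s3_stage/orders/", policy))
```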
End-to-end pipeline analysing Yelp reviews using AWS S3, Snowflake, Python UDFs, and advanced SQL-based sentiment analysis.
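A toy lexicon-based scorer of the kind a Snowflake Python UDF might wrap; the word lists are illustrative, not the project's actual model:

```python
# Hypothetical positive/negative lexicons for a simple word-count scorer.
POSITIVE = {"great", "excellent", "delicious", "friendly"}
NEGATIVE = {"bad", "terrible", "slow", "rude"}

def sentiment(text: str) -> int:
    """Return (#positive - #negative) word hits for a review."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(sentiment("great food but terrible slow service"))  # → -1
```

Registered as a UDF, a function like this lets sentiment be computed per row in plain SQL over the staged reviews.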
This project demonstrates data sampling techniques in Snowflake. It covers loading datasets from S3, applying RANDOM and SYSTEM sampling methods to extract subsets, validating the sampled data, and speeding up analysis by working on representative subsets.
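Row-level sampling of the kind Snowflake's SAMPLE clause performs (each row kept independently with probability p) can be simulated in a few lines; the seed and data are illustrative:

```python
import random

def sample_rows(rows, pct, seed=42):
    """Keep each row independently with probability pct/100,
    analogous to row-level (Bernoulli-style) table sampling."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [r for r in rows if rng.random() < pct / 100]

rows = list(range(10_000))
subset = sample_rows(rows, 10)
print(len(subset))  # roughly 1,000 rows
```

SYSTEM sampling instead selects whole storage blocks, which is faster on large tables but less uniform.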
The objective of this task is to create and configure a new virtual warehouse in Snowflake. Warehouses are crucial for query execution and data processing, as they provide the compute resources required to run SQL statements.
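The warehouse DDL itself is short; a sketch with a hypothetical name and sizing, where AUTO_SUSPEND/AUTO_RESUME pause idle compute and resume it on the next query:

```python
# Illustrative CREATE WAREHOUSE statement; analytics_wh and the sizing
# parameters are hypothetical choices, not the task's required values.
create_wh = """
CREATE WAREHOUSE IF NOT EXISTS analytics_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;
"""

print(create_wh.strip())
```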
NDA-safe migration framework: Python quality gate + Snowflake-ready canonical ODS
Moved, cleaned, and transformed JSON data stored in S3 into Redshift.
This project demonstrates Snowflake Streams for change data capture. It covers creating streams to track INSERT, UPDATE, and DELETE operations on tables, loading data from S3, querying captured changes, and managing stream objects for real-time data monitoring.
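The core pattern is a stream on a source table plus a query over its change metadata; table and stream names below are hypothetical:

```python
# A stream records row-level changes to its source table.
create_stream = "CREATE OR REPLACE STREAM orders_stream ON TABLE orders;"

# METADATA$ACTION distinguishes INSERT from DELETE rows; an UPDATE
# appears as a DELETE/INSERT pair with METADATA$ISUPDATE = TRUE.
read_changes = """
SELECT *, METADATA$ACTION, METADATA$ISUPDATE
FROM orders_stream
WHERE METADATA$ACTION = 'INSERT';
"""

print(create_stream)
print(read_changes.strip())
```

Consuming the stream in a DML statement advances its offset, so each change is processed once.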
This project explores Snowflake’s Time Travel feature, including querying historical data using offsets, retention periods, and query IDs. It demonstrates restoring previous table states after updates, managing retention settings, and recovering data efficiently.
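The three access paths covered above (offset, query ID, retention) each map to a short statement; the table name and `<query_id>` placeholder are hypothetical:

```python
# Query the table's state one hour ago via a relative offset (seconds).
at_offset = "SELECT * FROM orders AT (OFFSET => -3600);"

# Query the state just before a specific statement ran.
before_query = "SELECT * FROM orders BEFORE (STATEMENT => '<query_id>');"

# Extend how far back Time Travel can reach for this table.
set_retention = "ALTER TABLE orders SET DATA_RETENTION_TIME_IN_DAYS = 7;"

print(at_offset, before_query, set_retention, sep="\n")
```

Restoring a prior state is then a matter of `CREATE TABLE ... AS` over one of these historical selects.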
This project explores Snowflake’s table types, including Permanent, Temporary, Transient, and External tables. It demonstrates creating tables, loading data from S3 stages, querying and validating data, and understanding differences in persistence, retention, and Time Travel support.
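The persistence differences show up directly in the DDL keyword; a sketch with hypothetical table names, where permanent tables get full Time Travel and Fail-safe, transient tables skip Fail-safe, and temporary tables live only for the session:

```python
ddl = {
    "permanent": "CREATE TABLE sales (id INT, amount NUMBER);",
    "transient": "CREATE TRANSIENT TABLE sales_tmp (id INT, amount NUMBER);",
    "temporary": "CREATE TEMPORARY TABLE sales_session (id INT, amount NUMBER);",
}

for kind, stmt in ddl.items():
    print(f"{kind:>9}: {stmt}")
```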
This project demonstrates how to use Snowflake stages for loading data from Amazon S3 into Snowflake tables. It also covers applying transformations during loading and selecting only specific columns from the source data.
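Transform-on-load works by wrapping a SELECT over the stage inside the COPY; `$1`, `$3` are positional references into the staged file, and the stage and table names are hypothetical:

```python
# COPY INTO that picks only two columns from the staged CSV and
# upper-cases one of them during the load.
copy_transform = """
COPY INTO customers (id, full_name)
FROM (
  SELECT $1, UPPER($3)
  FROM @s3_stage/customers/
)
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
"""

print(copy_transform.strip())
```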
Automating Data Workflows in Snowflake with Task Scheduling & Management.
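A scheduled task in this style pairs a CRON schedule with the SQL to run; task, warehouse, and table names below are hypothetical, and note that tasks are created suspended, so a RESUME is needed to start the schedule:

```python
create_task = """
CREATE OR REPLACE TASK refresh_daily
  WAREHOUSE = analytics_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  INSERT INTO daily_summary SELECT CURRENT_DATE, COUNT(*) FROM orders;
"""

resume_task = "ALTER TASK refresh_daily RESUME;"

print(create_task.strip())
print(resume_task)
```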
Building a cloud data warehouse with AWS Redshift.
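The staging step in a Redshift warehouse of this kind is usually a COPY from S3; the bucket, IAM role ARN, and table name below are hypothetical placeholders:

```python
# Bulk-load JSON log files from S3 into a Redshift staging table.
copy_events = """
COPY staging_events
FROM 's3://my-bucket/log_data/'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
FORMAT AS JSON 'auto'
REGION 'us-west-2';
"""

print(copy_events.strip())
```

`JSON 'auto'` maps JSON keys to column names; a JSONPaths file can be supplied instead when the shapes don't line up.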
This project demonstrates Snowflake table cloning and swapping techniques. It covers creating original and cloned tables, loading data from S3, verifying cloned data, and performing table swaps to efficiently exchange data between staging and production tables.
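The clone-and-swap pattern is two statements; table names below are hypothetical:

```python
# Zero-copy clone: shares the original's micro-partitions until either
# table is modified, so it is instant and initially storage-free.
clone_sql = "CREATE OR REPLACE TABLE orders_clone CLONE orders;"

# SWAP atomically exchanges the two tables' contents and metadata,
# a common way to promote a validated staging table to production.
swap_sql = "ALTER TABLE orders_staging SWAP WITH orders;"

print(clone_sql)
print(swap_sql)
```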