GitHunt
KA

kameshsampath/snowpark-widf-lambda-loader-demo

Keyless ETL demo: Load data from S3 to Snowflake using AWS Lambda with Workload Identity Federation (WIDF) โ€” no passwords, no secrets, just IAM trust.

๐Ÿ” Snowpark WIDF Lambda Loader Demo

Keyless ETL: AWS Lambda โ†’ Snowflake

Demonstrates Snowflake Workload Identity Federation (WIDF) with AWS Lambda. No passwords, no secrets, no key pairs - just IAM trust.

Why WIDF?

Traditional WIDF
Credentials Password or key pair stored in Secrets Manager None - IAM role is the identity
Rotation Manual or automated rotation required Nothing to rotate
Leak risk Secrets can be exposed in logs, env vars No secrets to leak
Audit "Which service used this password?" "IAM role X accessed at time Y"
Setup Create user + password + store secret Create user + trust IAM ARN
Cost Secrets Manager charges per secret/access Free

When to Use This

Snowflake offers multiple ways to load data from S3. Choose based on your needs:

Approach Pattern Best For
External Stage + Snowpipe Snowflake pulls Bulk file ingestion, raw landing zones
Lambda + WIDF You push Smart ingestion with pre-load logic

Lambda + WIDF shines when you need:

  • Data quality gates - Validate/reject files before loading
  • Real-time enrichment - Call APIs during ingestion
  • Conditional routing - Route to different tables based on content
  • Event context - Capture S3 metadata (who uploaded, when, from where)
  • Zero credential management - WIDF handles auth

Add Dynamic Tables for Instant Analytics

Lambda + WIDF gets data in. Dynamic Tables turn it into insights - automatically. Define your transformation query once; Snowflake handles refresh scheduling, dependency tracking, and incremental processing.

flowchart LR
    S3[๐Ÿ“ฆ S3] -->|trigger| Lambda[โšก Lambda + WIDF]
    Lambda -->|push| RAW[RAW_DATA]
    RAW -->|auto refresh| DT[DAILY_SUMMARY<br/>Dynamic Table]
Loading

The setup script creates DAILY_SUMMARY that auto-refreshes as data lands:

Question Answer
How many files landed today? files_loaded
What's our event volume? total_events by action
Daily purchase total? total_amount

Note

External Stage is great for bulk ingestion. Lambda + WIDF + Dynamic Tables delivers smart pipelines with built-in analytics.

How It Works

When a JSON file lands in S3, Lambda loads it into Snowflake using WIDF for authentication. The Lambda's IAM role is trusted directly by Snowflake - no credentials to manage or rotate.

sequenceDiagram
    participant S3 as ๐Ÿ“ฆ S3 Bucket
    participant Lambda as โšก AWS Lambda
    participant SF as โ„๏ธ Snowflake
    participant Table as ๐Ÿ“Š RAW_DATA

    Note over Lambda, SF: No Secrets (WIDF)
    S3->>Lambda: 1. S3 Event (new JSON file)
    Lambda->>Lambda: 2. Get IAM Role Credentials
    Lambda->>SF: 3. Connect with WORKLOAD_IDENTITY
    Note right of Lambda: authenticator: WORKLOAD_IDENTITY
    SF->>SF: 4. Validate IAM Role ARN
    SF->>Lambda: 5. Return Session
    Lambda->>S3: 6. Read JSON Data
    S3-->>Lambda: JSON Records
    Lambda->>Table: 7. write_pandas()
    Table-->>Lambda: Success
Loading

๐Ÿ“‹ Prerequisites

Snowflake

  1. Snowflake Account - Sign up for a free trial

  2. Snowflake CLI - Install and configure connection

Tip

Use OAuth for local applications for passwordless Snowflake CLI authentication. Set authenticator = "OAUTH_AUTHORIZATION_CODE" in your connection config.

Tools

Tool Purpose Installation
Docker Container builds Install
AWS CLI AWS operations Install
AWS SAM CLI Lambda deployment Install
Task Task runner Install

Python Environment (choose one)

Option Tool Installation
A uv (recommended) Install
B Python venv Built-in with Python 3.11+

Using venv instead of uv:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

Tip

Run task to see all available commands.

๐Ÿš€ Setup

1. Configure Environment

cp env.example .env

Edit .env with your AWS and Snowflake account details.

2. Verify Configuration

task default

3. Check Prerequisites

# Verify AWS CLI connectivity
task aws:check

# Verify SAM CLI installed
task aws:check-sam

๐ŸŽฌ Demo Walkthrough

Quick Demo (One Command Each)

# Part I: Deploy without WIDF โ†’ Watch it fail
task demo:fail

# Part II: Configure WIDF โ†’ Watch it succeed  
task demo:success

# Cleanup when done
task demo:clean

Part I: Deploy Without WIDF โ†’ Watch It Fail โŒ

1. Setup Snowflake Database

task snow:setup

Creates database, schema, role, and RAW_DATA table.

2. Deploy AWS Resources

task aws:deploy

Deploys Lambda function, S3 bucket with trigger, and IAM role.

3. Get the Lambda Role ARN

task aws:role-arn

Important

Save this ARN - you'll need it for Part II.

4. Upload Test Data

task aws:test

5. Watch the Logs

task aws:logs

Expected error: Snowflake doesn't recognize the Lambda's IAM role yet.

Connecting to Snowflake using WORKLOAD_IDENTITY
   Account: your_account
   User: LAMBDA_LOADER_BOT
Found credentials in environment variables.
Lambda failed: 250001 (08001): Failed to connect to DB. 
   Unable to find an associated Snowflake user for the verified 
   AWS Caller Identity with ARN 'arn:aws:sts::123456789012:assumed-role/snowpark-widf-lambda-role-dev/...'

Note

This error confirms WIDF is working on the AWS side - Lambda's IAM identity was verified.
Snowflake just doesn't have a user that trusts this ARN yet.


Part II: Configure WIDF โ†’ Watch It Succeed โœ…

1. Create WIDF Service User

task snow:lambda-wid

This creates a SERVICE user that trusts the Lambda's IAM role:

CREATE USER LAMBDA_LOADER_BOT
  TYPE = SERVICE
  WORKLOAD_IDENTITY = (
    TYPE = AWS
    ARN = 'arn:aws:iam::123456789012:role/snowpark-widf-lambda-role-dev'
  );

2. Upload Test Data Again

task aws:test

3. Watch the Logs

task aws:logs

Expected: Successful connection and data load via WIDF.

4. Verify Data

# Raw landing data
task snow:query

# Real-time analytics (Dynamic Table)
task snow:query-summary

๐Ÿ”‘ The Key

connection_params = {
    "account": "...",
    "authenticator": "WORKLOAD_IDENTITY",   # โ† WIDF keyless auth
    "workload_identity_provider": "AWS",
    # NO password, NO secret key, NO key pair
}

๐Ÿ“– Commands

Demo

Command Description
task demo:fail Part I - deploy without WIDF, show failure
task demo:success Part II - configure WIDF, show success
task demo:clean Cleanup all resources

Snowflake Commands

Command Description
task snow:setup Create DB/schema/table/dynamic table
task snow:lambda-wid Create WIDF service user
task snow:query Query RAW_DATA (landing)
task snow:query-summary Query DAILY_SUMMARY (analytics)
task snow:cleanup Remove Snowflake resources

AWS

Command Description
task aws:check Verify AWS CLI connectivity
task aws:check-sam Check SAM CLI installed
task aws:deploy Deploy Lambda + S3
task aws:test Upload test data
task aws:logs Tail CloudWatch logs
task aws:logs-recent Show recent logs (last 30 min)
task aws:role-arn Get Lambda Role ARN
task aws:debug Debug CloudFormation errors
task aws:clean-stack Delete failed stack
task aws:empty-bucket Empty S3 bucket
task aws:clean Remove AWS resources

Other

Command Description
task default Show configuration
task test Run integration tests
task deploy Full deployment (AWS + WIDF)
task clean:all Cleanup everything

๐Ÿ”ง Troubleshooting

AWS Deployment Errors

Stack stuck in ROLLBACK_COMPLETE:

task aws:clean-stack
task aws:deploy

Note

"s3:PutBucketPublicAccessBlock" denied?
This is typically caused by an organization-level Service Control Policy (SCP) that prevents users from modifying S3 public access settings. Add SAM_S3_BUCKET to .env with an existing bucket you have write access to.

Snowflake WIDF Errors

Verify WIDF user exists:

snow sql -q "SHOW USERS LIKE 'LAMBDA_LOADER_BOT';"

Check WIDF configuration:

snow sql -q "SHOW USER WORKLOAD IDENTITY AUTHENTICATION METHODS FOR USER LAMBDA_LOADER_BOT;"

๐Ÿงน Cleanup

Remove AWS resources only

task aws:clean

Remove Snowflake resources only

task snow:cleanup

Clean everything

task clean:all

๐Ÿ“š References

License

Copyright (c) Kamesh Sampath. All rights reserved. Licensed under the Apache 2.0 license.

kameshsampath/snowpark-widf-lambda-loader-demo | GitHunt