GitHunt
RY

ryanmcdowell/dataflow-bigquery-dynamic-destinations

An example pipeline for dynamically routing events from Pub/Sub to different BigQuery tables based on a message attribute.

Dataflow BigQuery Dynamic Destinations

Occasionally you may have data of different domains being ingested into a single topic which
requires dynamically routing to the proper output table in BigQuery. One way to accomplish this is
to use a message attribute on the header of the Pub/Sub message to indicate the table which the
message should be routed to. The routing of the messages to the correct table can be accomplished in
a single pipeline by using the BigQueryIO transform's dynamic destination capabilities. In this
pipeline, a SerializableFunction is used to extract the table attribute from the Pub/Sub message
and then subsequently route to the proper table destination.

Pipeline

PubsubToBigQueryDynamicDestinations -
A pipeline which consumes JSON messages from Pub/Sub and outputs records to BigQuery tables using a
Pub/Sub message attribute to determine the proper table to route the message to.

Getting Started

Requirements

  • Java 8
  • Maven 3

Building the Project

Build the entire project using the maven compile command.

mvn clean && mvn compile

Languages

Java100.0%

Contributors

Apache License 2.0
Created August 18, 2018
Updated May 21, 2023