# Kafka → ADLS Gen2 Exporter
This project is a Spring Boot application that consumes Kafka messages in batches and writes them to Azure Data Lake Storage Gen2 using SAS authentication.
Each Kafka poll is aggregated and written to a new, uniquely named ADLS file, ensuring no overwrites and clean date-based partitioning.
## 🚀 Features

- Batch Kafka consumption (`@KafkaListener` with `batch = true`)
- Concatenation of all messages from a single poll
- ADLS Gen2 writes using a SAS token
- One file created per poll (never overwritten)
- Date-based partitioning: `/kafka-export/date=YYYY-MM-DD/messages-<uuid>.log`
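The date-partitioned file path in the last bullet can be built with plain JDK time and UUID utilities. A minimal sketch, assuming the naming shown in the output example below; the method name `buildPath` is illustrative, not from this project:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.UUID;

public class PathExample {
    // Builds a path like kafka-export/date=2025-12-14/messages-20251214-8ab3e0c6.log
    static String buildPath(String basePath, LocalDate date) {
        String day = date.format(DateTimeFormatter.ISO_LOCAL_DATE);       // YYYY-MM-DD
        String compact = date.format(DateTimeFormatter.BASIC_ISO_DATE);   // YYYYMMDD
        // A fresh UUID fragment per poll makes every file name unique,
        // so an earlier poll's file is never overwritten.
        return basePath + "/date=" + day + "/messages-" + compact + "-"
                + UUID.randomUUID().toString().substring(0, 8) + ".log";
    }

    public static void main(String[] args) {
        System.out.println(buildPath("kafka-export", LocalDate.of(2025, 12, 14)));
    }
}
```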
## 🧩 Technologies
- Spring Boot 3
- Spring Kafka
- Azure Storage File DataLake SDK
- Azure SAS Token Authentication
## ⚙️ Configuration

Set your values in `application.yml`:
```yaml
spring:
  kafka:
    bootstrap-servers: your-kafka:9092
    listener:
      type: batch

app:
  kafka:
    topic: my-topic
  adls:
    account-name: <account>
    filesystem: <container>
    base-path: kafka-export
    sas-token: "<sas-token-without-question-mark>"
```

## ▶️ Running the Project

```shell
mvn spring-boot:run
```

The application will:
- Poll messages from Kafka
- Aggregate them
- Create a unique file under the partitioned date directory
- Upload the content to ADLS Gen2 via DFS endpoint
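The upload step maps to a handful of calls in the Azure Storage File DataLake SDK. The sketch below shows one plausible shape, assuming the `azure-storage-file-datalake` dependency is on the classpath; the class name `AdlsUploader` and its constructor arguments are illustrative, not from this project, and exact method overloads vary by SDK version:

```java
import com.azure.storage.file.datalake.DataLakeFileClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class AdlsUploader {
    private final DataLakeFileSystemClient fileSystem;

    public AdlsUploader(String accountName, String fileSystemName, String sasToken) {
        // SAS authentication against the DFS endpoint, matching the application.yml settings.
        DataLakeServiceClient service = new DataLakeServiceClientBuilder()
                .endpoint("https://" + accountName + ".dfs.core.windows.net")
                .sasToken(sasToken)
                .buildClient();
        this.fileSystem = service.getFileSystemClient(fileSystemName);
    }

    public void upload(String path, String content) {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        // createFile(path) does not overwrite by default; the app relies on
        // unique file names per poll, so no overwrite ever happens.
        DataLakeFileClient file = fileSystem.createFile(path);
        file.append(new ByteArrayInputStream(bytes), 0, bytes.length);
        file.flush(bytes.length, true);
    }
}
```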
## 📁 Output Example

```
kafka-export/
└── date=2025-12-14/
    ├── messages-20251214-8ab3e0c6.log
    ├── messages-20251214-c1ff294d.log
    └── messages-20251214-f92a7e13.log
```
Each file corresponds to one Kafka poll.
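The one-file-per-poll behavior boils down to joining the whole batch into a single payload before the upload. A minimal sketch of that aggregation step; the class and method names are illustrative, not from this project:

```java
import java.util.List;

public class BatchAggregator {
    // Joins every record value from one poll into a single newline-delimited
    // payload, which then becomes the body of exactly one ADLS file.
    static String aggregate(List<String> records) {
        return String.join("\n", records);
    }

    public static void main(String[] args) {
        System.out.println(aggregate(List.of("msg-1", "msg-2", "msg-3")));
    }
}
```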
## 📄 License
MIT