2

I plan to use Amazon MSK and I want to dump consumer logs to S3, but I don't see any option for this. Do I need to write my own consumer, or is there a way to send Amazon MSK consumer output to S3 directly?

Robin Moffatt
Navin Kumar

2 Answers

3

There is no direct way to do this from MSK. You can use an external consumer or, preferably, run Kafka Connect on an EC2 instance within the same VPC as MSK.

Either way, you need to consider high availability and data transfer costs. For HA, run consumers in different AZs. For costs, use Apache Kafka 2.4.1 on MSK, which allows consumers to fetch data from the closest replica.
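As a rough illustration of the closest-replica point, here is a minimal sketch of the consumer-side settings. It assumes a librdkafka-based client such as `confluent-kafka` (which accepts these property names), and it assumes the cluster configuration sets `replica.selector.class` to `org.apache.kafka.common.replica.RackAwareReplicaSelector`, since the consumer setting alone has no effect without it. The function name, broker address, group name, and AZ ID are hypothetical placeholders.

```python
def rack_aware_consumer_config(bootstrap_servers, group_id, az_id):
    """Build consumer settings that ask brokers to serve fetches from a same-AZ replica.

    az_id is assumed to match the broker.rack value of the brokers in the
    consumer's Availability Zone (for example an AZ ID such as "use1-az1").
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # client.rack is compared with broker.rack by the rack-aware replica selector.
        "client.rack": az_id,
        # Commit offsets manually, only after a batch has been written to S3.
        "enable.auto.commit": False,
        "auto.offset.reset": "earliest",
    }


# Hypothetical values, for illustration only.
config = rack_aware_consumer_config(
    "b-1.example.kafka.us-east-1.amazonaws.com:9092",
    "s3-dumper",
    "use1-az1",
)
```

If you use `confluent-kafka`, the resulting dictionary could be passed to `confluent_kafka.Consumer(config)`. Running one such consumer per AZ, each with its own `client.rack`, covers both the HA and the cost concern.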

herbertgoto
3

Kafka Connect is generally the best (easiest/scalable/portable/resilient) way to get data between Kafka and systems down (and up) stream such as S3. Learn more about Kafka Connect here and in this talk here.

Since MSK doesn't provide Kafka Connect, one option is to run your own Kafka Connect worker (which connects to MSK) and use the S3 sink connector (tutorial).
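To make that concrete, here is a minimal sketch of registering the S3 sink connector through the Kafka Connect REST API. It assumes the Confluent S3 sink connector is installed on your worker, that the worker's REST endpoint is reachable (port 8083 is the default), and that the worker has AWS credentials allowing writes to the bucket. The helper names, connector name, topic, bucket, and region are hypothetical placeholders.

```python
import json
import urllib.request


def s3_sink_connector_payload(name, topics, bucket, region, flush_size=1000):
    """Build the JSON body for POST /connectors on a Kafka Connect worker."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "tasks.max": "1",
            "topics": ",".join(topics),
            "s3.bucket.name": bucket,
            "s3.region": region,
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            # Write records as JSON; other format classes ship with the connector.
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            # Number of records written per S3 object before it is committed.
            "flush.size": str(flush_size),
        },
    }


def register_connector(connect_url, payload):
    """POST the connector definition to the worker and return its JSON reply."""
    request = urllib.request.Request(
        connect_url.rstrip("/") + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical values, for illustration only.
payload = s3_sink_connector_payload("msk-s3-sink", ["app-logs"], "my-log-bucket", "us-east-1")
# register_connector("http://localhost:8083", payload)  # needs a running worker
```

The worker itself points at MSK through `bootstrap.servers` in its own properties file, so nothing MSK-specific appears in the connector definition.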

It's worth being aware that other cloud providers offer complete solutions that include not only managed Apache Kafka but also managed Kafka Connect, such as Confluent Cloud, as shown in this blog here.

(Image: managed connectors for Kafka Connect in the cloud)


Disclaimer: I work for Confluent :-)

Robin Moffatt