
I am currently working on a setup with 6 Kafka brokers. Data is being pushed into my topic from two producers at a rate of about 4000 messages per second, and I have 5 consumers for this topic working as a group. What should be the ideal number of partitions for my Kafka topic?

Please feel free to tell me if any change is required in brokers/consumers/producers as well.

TobiSH

1 Answer


In general, more partitions means more throughput. However, there are other considerations too, such as the limits of the hardware you are running on and whether you are using compression. There is good information from Confluent here that gives you insight into a rough calculation you can use to arrive at a number of partitions:

A rough formula for picking the number of partitions is based on throughput. You measure the throughput that you can achieve on a single partition for production (call it p) and consumption (call it c). Let’s say your target throughput is t. Then you need to have at least max(t/p, t/c) partitions. The per-partition throughput that one can achieve on the producer depends on configurations such as the batching size, compression codec, type of acknowledgement, replication factor, etc.
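To make the formula concrete, here is a minimal sketch of that calculation in Python. The per-partition throughput figures below are hypothetical placeholders; you would substitute the numbers you measure on your own cluster:

```python
import math

def min_partitions(target_tput: float,
                   per_partition_produce: float,
                   per_partition_consume: float) -> int:
    """Minimum partition count per the rough formula: max(t/p, t/c),
    rounded up to a whole number of partitions."""
    return math.ceil(max(target_tput / per_partition_produce,
                         target_tput / per_partition_consume))

# Example: target 4000 msg/s (as in the question), ASSUMING a measured
# 1500 msg/s per partition on produce and 800 msg/s on consume.
print(min_partitions(4000, 1500, 800))
```

With those assumed rates the consumer side dominates (4000/800 = 5), so you would need at least 5 partitions, which also happens to match the 5 consumers in the group.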

Moreover, for the consumer:

The consumer throughput is often application dependent since it corresponds to how fast the consumer logic can process each message

So the best way is to measure and benchmark for your own use case.
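For that measurement, Kafka ships with perf-test tools you can point at a single-partition test topic to obtain p and c. A sketch of the invocations (the topic name, record count/size, and bootstrap address are placeholders for your environment):

```shell
# Measure per-partition produce throughput (p) on a 1-partition topic
kafka-producer-perf-test.sh \
  --topic perf-test-1p \
  --num-records 1000000 \
  --record-size 100 \
  --throughput -1 \
  --producer-props bootstrap.servers=localhost:9092

# Measure per-partition consume throughput (c) on the same topic
kafka-consumer-perf-test.sh \
  --topic perf-test-1p \
  --messages 1000000 \
  --bootstrap-server localhost:9092
```

Plug the reported msg/s figures into max(t/p, t/c) to get your partition count.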

Shailendra