Going through the Kafka documentation and various other resources, I understand that messages in Kafka are organized into topics. Also, a topic can be broken down into partitions, and each partition can be hosted on a different server. This gives redundancy and scalability.

I am not sure what the word 'broken' means here. Does it mean that if the messages added to a topic are, say, '1 2 3 4 5 6 7', then after breaking the topic into partitions, each partition holds only a subset of the whole topic? For example, one partition has '1 2 3', another has '4 5 6', and yet another has just '7'. Or does it mean that every single partition has '1 2 3 4 5 6 7', i.e. the partitions are exact replicas?

Mandroid

1 Answer

a topic can be broken down into partitions and each partition can be hosted on a different server. This gives redundancy and scalability

The statement above refers to the fact that Kafka topics are usually divided into a number of partitions. Partitions allow you to parallelize a topic by splitting its data across different brokers. If a topic has only one partition, the data resides on a single broker and is read sequentially. If, let's say, the number of partitions is 3, the topic's data is split across the 3 partitions, each carrying a different set of events. You can then read the topic with 3 parallel processes, each reading from one partition. The more partitions you have, the more scalability you can achieve. So yes, each partition holds only a subset of the data; partitions are not exact replicas of each other. Redundancy comes separately, from replication: each partition can be copied to other brokers according to the topic's replication factor.
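
To make the "subset, not replica" point concrete, here is a minimal Python sketch that spreads the question's messages '1 2 3 4 5 6 7' over 3 partitions. It does not use Kafka at all: the function name assign_round_robin is hypothetical, and plain round-robin is a simplifying assumption, not Kafka's actual partitioner (which, as far as I know, hashes the message key when one is present). The exact grouping in a real cluster would therefore likely differ, but the key property is the same: every message lands in exactly one partition.

```python
from collections import defaultdict


def assign_round_robin(messages, num_partitions):
    """Hypothetical helper: hand out messages to partitions in turn.

    This is a simplification for illustration, not Kafka's real partitioner.
    """
    partitions = defaultdict(list)
    for index, message in enumerate(messages):
        # Each message goes to exactly one partition; nothing is duplicated.
        partitions[index % num_partitions].append(message)
    return dict(partitions)


messages = [1, 2, 3, 4, 5, 6, 7]
partitions = assign_round_robin(messages, num_partitions=3)

for partition_id, contents in sorted(partitions.items()):
    print(f"partition {partition_id}: {contents}")
# partition 0: [1, 4, 7]
# partition 1: [2, 5]
# partition 2: [3, 6]
```

Note that order is preserved within each partition (1 before 4 before 7), but there is no ordering across partitions, which is why three consumers can read them independently in parallel.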

Nishu Tayal