0
Broker 1:

+-------------------+
|      Topic 1      |
|    Partition 0    |
|                   |
|                   |
|     Topic 2       |
|   Partition 1     |
+-------------------+
Broker 2:

+-------------------+
|      Topic 1      |
|    Partition 2    |
|                   |
|                   |
|     Topic 2       |
|   Partition 0     |
+-------------------+
Broker 3:

+-------------------+
|      Topic 1      |
|    Partition 1    |
|                   |
|      Topic 1      |
|    Partition 2    |
|                   |
+-------------------+

So does Broker 1 Topic 1 Partition 1 contain the same data as Broker 3 Topic 1 Partition 1, while Broker 3 Topic 1 Partition 1 contains different data from Broker 3 Topic 1 Partition 2?

OneCricketeer
J.J. Beam

2 Answers

3

A replication factor must be specified to create a topic. It defines the number of copies of each of the topic's partitions in the Kafka cluster.

Each partition in a topic has a leader and, if the replication factor is greater than one, one or more follower replicas. When a message is sent to a partition, it first arrives at the leader (the broker that is the partition leader). The followers then periodically send fetch requests to the leader to replicate the messages. Replicas that hold the same messages as the leader are called in-sync replicas (ISR). These are also the candidates to become partition leader if the leader broker fails (fail-over).
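The leader/follower flow described above can be sketched with a toy model. This is not Kafka code: `Replica`, `fetch_from`, and `in_sync` are hypothetical names, and the "log equals the leader's log" rule is a simplification (real Kafka decides ISR membership with a time-based lag threshold).

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    broker_id: int
    log: list = field(default_factory=list)

    def fetch_from(self, leader: "Replica") -> None:
        # A follower asks the leader for every record past its own log end.
        self.log.extend(leader.log[len(self.log):])


def in_sync(leader: Replica, followers: list) -> list:
    # Toy rule (assumption): a replica is in sync when its log equals the leader's.
    return [leader.broker_id] + [f.broker_id for f in followers if f.log == leader.log]


leader = Replica(broker_id=1)
follower_a = Replica(broker_id=2)
follower_b = Replica(broker_id=3)

leader.log.extend(["m0", "m1", "m2"])   # producer writes land on the leader first
follower_a.fetch_from(leader)           # follower_a has replicated everything
follower_b.log.extend(leader.log[:1])   # follower_b is lagging behind

isr = in_sync(leader, [follower_a, follower_b])
print(isr)
```

Once `follower_b.fetch_from(leader)` runs, its log matches the leader's and it would rejoin the in-sync set in this model.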

If you set the producer setting acks=all, the producer gets an acknowledgement only when all in-sync replicas have received the record. In addition, by setting min.insync.replicas to a value greater than one, you can guarantee that every acknowledged record exists on at least one replica besides the leader in the Kafka cluster.
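How acks=all and min.insync.replicas interact can be illustrated with another toy model. The function name, the exception class, and the "one list per in-sync replica" layout are assumptions made for this sketch, not Kafka's API; it only mirrors the rule that a write is rejected when the in-sync set is smaller than the configured minimum.

```python
class NotEnoughReplicas(Exception):
    """Raised by this toy model when the ISR is smaller than min.insync.replicas."""


def append_with_acks_all(record, isr_logs, min_insync_replicas):
    # isr_logs: one list per in-sync replica, leader first (hypothetical layout).
    if len(isr_logs) < min_insync_replicas:
        raise NotEnoughReplicas(
            f"ISR size {len(isr_logs)} < min.insync.replicas {min_insync_replicas}"
        )
    # With acks=all the acknowledgement is sent only after every ISR member has the record.
    for log in isr_logs:
        log.append(record)
    return len(isr_logs)  # number of copies at acknowledgement time


leader_log, follower_log = [], []
copies = append_with_acks_all("m0", [leader_log, follower_log], min_insync_replicas=2)

# If the follower drops out of the ISR, the same write is refused instead of acknowledged.
try:
    append_with_acks_all("m1", [leader_log], min_insync_replicas=2)
    rejected = False
except NotEnoughReplicas:
    rejected = True
```

In this sketch the second write fails rather than being acknowledged with a single copy, which is the durability trade-off the setting is meant to express.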

So, if two brokers are in-sync replicas for a topic-partition, then they have the same messages; otherwise they don't.

H.Ç.T
0

The short answer is YES. The same partition is identical on every broker that holds a replica of it. Different partitions contain different messages.

However, Kafka is a moving system, so the replicas are not perfectly aligned at every moment. How closely they match depends on the producer's acks value, network throughput and partitioning, and many other factors.
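To see why different partitions of a topic hold different messages, here is a minimal sketch of key-based partitioning. `choose_partition` is a hypothetical helper and CRC32 is a stand-in hash chosen so the example needs only the standard library; Kafka's default partitioner uses a murmur2 hash of the key.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in hash (assumption): CRC32 instead of Kafka's murmur2.
    return zlib.crc32(key) % num_partitions


num_partitions = 3
partitions = {p: [] for p in range(num_partitions)}

for key in [b"user-1", b"user-2", b"user-3", b"user-1"]:
    # Each record is written to exactly one partition of the topic.
    partitions[choose_partition(key, num_partitions)].append(key)

for partition_id, records in partitions.items():
    print(partition_id, records)
```

Records with the same key always land in the same partition, and no record is written to more than one partition; replication then copies each partition as a whole to other brokers.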

AdiB