I have a Kafka cluster, and log.dirs=/data/kafka in server.properties points to the data directory. My data partition keeps filling up because of these logs, which take up a large part of it (I mean the binary log files in the topic directories, such as 000000000000000.log). I read about this parameter in the documentation ("log.dirs: The directories in which the log data is kept. If not set, the value in log.dir is used"), but I do not fully understand what it means yet.

Can these files be deleted, and which retention settings should be configured? Is it also recommended to keep them separate from the data directory? Thanks.

NoamiA

1 Answer

A Kafka topic is a logical grouping of one or more Kafka partitions. Each Kafka partition is essentially a set of (log) files on disk, so the data you publish to Kafka is stored in these log files.

log.dirs tells Kafka where to create these files. So whenever a new partition is created (by increasing the partition count of an existing topic or by creating a new topic altogether), you will see new files in log.dirs.
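To see which topics are actually filling the disk, you can add up the file sizes under log.dirs. Below is a minimal Python sketch, assuming the usual layout in which each partition gets its own directory named `<topic>-<partition>`. The function name `topic_sizes` is made up for this illustration and is not part of Kafka. It only reads file sizes and never modifies anything.

```python
import os
from collections import defaultdict


def topic_sizes(log_dir):
    """Return {topic: total bytes} summed over partition directories in log_dir.

    Assumes partition directories are named <topic>-<partition>;
    anything else (for example meta.properties) is skipped.
    """
    sizes = defaultdict(int)
    for entry in os.scandir(log_dir):
        if not entry.is_dir():
            continue
        topic, sep, partition = entry.name.rpartition("-")
        if not sep or not partition.isdigit():
            continue  # not shaped like <topic>-<partition>
        for root, _dirs, files in os.walk(entry.path):
            for name in files:
                # Counts .log segments as well as their index files.
                sizes[topic] += os.path.getsize(os.path.join(root, name))
    return dict(sizes)
```

With the setup from the question you would call `topic_sizes("/data/kafka")` on a broker and sort the result by size to find the largest topics.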

You should not delete the data from this folder manually. Use log.retention.hours to configure how long Kafka should keep your data.
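As a rough illustration, a retention setup in server.properties could look like the fragment below. The values are only examples chosen for this sketch, not recommendations; pick them based on how much history your consumers need and how much disk you have. Note that log.retention.bytes applies per partition, not to the whole disk.

```properties
# server.properties (broker-wide defaults; the values below are illustrative)
log.dirs=/data/kafka
# Delete segments older than 3 days (the default is 168 hours, i.e. 7 days)
log.retention.hours=72
# Also cap each partition at about 1 GiB; -1 (the default) means no size limit
log.retention.bytes=1073741824
# Retention removes whole segments, so smaller segments free space sooner
log.segment.bytes=268435456
```

Kafka deletes old segments by itself once either limit is exceeded, so there is no need to touch the files by hand.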

Rishabh Sharma
  • Thanks, that helps. I have another question: in a cluster, does only the leader of a given partition get new files in log.dirs? – NoamiA Aug 25 '20 at 07:01
  • Not necessarily. If you have replication set up, the leader as well as the followers will have new files in log.dirs. – Rishabh Sharma Aug 25 '20 at 07:04