Retention Policy
Kafka provides different ways to manage how long messages are retained. There are two main types of retention policies:
- Time-Based Retention (`log.retention.ms`): This policy specifies how long Kafka retains messages before they become eligible for deletion. For example, you might configure Kafka to keep messages for 7 days. After this period, the messages are eligible for deletion, even if they haven't reached the segment size limit. Kafka deletes whole log segments, so a message is actually removed once the newest message in its segment has passed the limit.
- Size-Based Retention (`log.retention.bytes`): This policy limits how much disk space the log can use. The limit applies to each partition, not to the topic as a whole, so a topic with several partitions can use several times the configured value. When the total size of a partition's log segments exceeds the limit, the oldest segments are deleted to free up space, which keeps disk usage bounded. The default value of -1 means no size limit.
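To make the two policies concrete, here is a minimal Python sketch of how segment-level retention could be modelled. It is a simplified illustration, not Kafka's actual code: the `Segment` class, the `apply_retention` function, and the sample sizes and timestamps are all hypothetical, and the special handling of the active segment is ignored.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    size_bytes: int
    newest_timestamp_ms: int  # timestamp of the newest message in the segment


def apply_retention(segments, now_ms, retention_ms=-1, retention_bytes=-1):
    """Return the segments that survive; a value of -1 disables that limit.

    `segments` must be ordered oldest first (hypothetical model of one partition).
    """
    kept = list(segments)

    # Time-based: drop leading segments whose newest message is too old.
    if retention_ms >= 0:
        while kept and now_ms - kept[0].newest_timestamp_ms > retention_ms:
            kept.pop(0)

    # Size-based: drop the oldest segments while a whole segment fits in the excess.
    if retention_bytes >= 0:
        excess = sum(s.size_bytes for s in kept) - retention_bytes
        while kept and excess - kept[0].size_bytes >= 0:
            excess -= kept[0].size_bytes
            kept.pop(0)

    return kept


DAY_MS = 24 * 60 * 60 * 1000
now = 10 * DAY_MS
log = [
    Segment(0, 100, 1 * DAY_MS),
    Segment(50, 100, 2 * DAY_MS),
    Segment(100, 100, 5 * DAY_MS),
    Segment(150, 100, 8 * DAY_MS),
    Segment(200, 100, 10 * DAY_MS),
]

survivors = apply_retention(log, now, retention_ms=7 * DAY_MS, retention_bytes=200)
print([s.base_offset for s in survivors])  # [150, 200]
```

In this example the 7-day limit removes the two oldest segments, and the 200-byte limit then removes one more. The sketch applies both limits together, so whichever one is hit first triggers deletion.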
Log Cleaner
The log cleaner is a set of background threads in Kafka that removes messages that are no longer needed, according to each topic's cleanup policy (`cleanup.policy`). It handles:
- Compaction: In addition to retention based on time or size, Kafka supports log compaction, enabled with `cleanup.policy=compact`. Log compaction ensures that Kafka retains at least the latest message for each key, even if that message is older than the retention period, and discards earlier messages with the same key. This is useful when you want to keep the latest state of each record rather than its full history.
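The effect of compaction can be sketched in a few lines of Python. This is an illustrative model only: the `compact` function and the sample records are hypothetical, and the sketch ignores details of a real broker such as segment boundaries and delete markers.

```python
def compact(records):
    """Keep only the newest record per key, preserving offset order.

    `records` is a list of (offset, key, value) tuples in offset order.
    """
    latest_offset = {}
    for offset, key, _value in records:
        latest_offset[key] = offset  # later offsets overwrite earlier ones
    return [r for r in records if latest_offset[r[1]] == r[0]]


records = [
    (0, "user-1", "alice@old.example"),
    (1, "user-2", "bob@old.example"),
    (2, "user-1", "alice@new.example"),
    (3, "user-3", "carol@example"),
    (4, "user-2", "bob@new.example"),
]

compacted = compact(records)
for offset, key, value in compacted:
    print(offset, key, value)
```

After compaction only offsets 2, 3, and 4 remain: one record per key, each holding that key's most recent value. The surviving records keep their original offsets.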