When I followed my earlier guide, End-to-End Stream: Cassandra 5.0.5 CDC → Kafka 4.0 (SASL/SSL) → Spark 3.5.6 → Delta Lake, everything ran smoothly. But after a few days, I noticed /var/log/syslog had ballooned to tens of gigabytes.
This post explains why it happens, and how to fix it by reconfiguring Debezium’s logging.
Symptoms
- /var/log/syslog grows to 9–20 GB in just a few days
- Rotated logs (syslog.1) are even bigger
- Examining the log shows endless DEBUG lines like:
DEBUG io.debezium.connector.cassandra.CommitLogIdxParser - Polling for completeness...
DEBUG io.debezium.connector.base.ChangeEventQueue - checking for more records...
DEBUG io.debezium.connector.base.ChangeEventQueue - no records available...
By default, Debezium Cassandra’s bundled Logback config logs at DEBUG level. Since it’s started under systemd with java -jar, all output is sent to journald/rsyslog → /var/log/syslog.
The result: log storm.
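To confirm that DEBUG lines really dominate the file, a rough per-level line count is usually enough. The sketch below is only an illustration: count_levels is a hypothetical helper name, and it assumes the level appears as a standalone word on each line, as in the sample lines above.

```shell
# Print how many lines of a log file mention each log level.
# count_levels is a hypothetical helper, not part of Debezium or rsyslog.
count_levels() {
  for level in DEBUG INFO WARN ERROR; do
    # grep -c prints the number of matching lines; -w matches whole words only.
    # "|| true" keeps going when a level has zero matches (grep exits 1).
    printf '%s %s\n' "$level" "$(grep -cw -- "$level" "$1" || true)"
  done
}

# Example (reading syslog usually needs root or adm group membership):
# sudo sh -c '. ./count_levels.sh; count_levels /var/log/syslog'
```

If DEBUG outnumbers everything else by orders of magnitude, the log level is the problem rather than rotation.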
Why Logrotate Didn’t Save You
logrotate still runs every day, but Debezium can emit gigabytes of logs per day. By the time rotation occurs at midnight, the log file is already enormous.
So this is not a logrotate failure; it is simply too much noise going into syslog.
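To put a number on "gigabytes per day", one option is to sample the file size twice and extrapolate. This is a minimal sketch; file_size and bytes_per_day are hypothetical helper names, and the 60-second sampling window in the example is an arbitrary choice.

```shell
# Print the size of a file in bytes (portable: does not rely on GNU stat).
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Extrapolate a byte delta ($1) observed over a number of seconds ($2)
# to bytes per day, using integer arithmetic.
bytes_per_day() {
  echo $(( $1 * 86400 / $2 ))
}

# Example (commented out because it sleeps for a minute):
# before=$(file_size /var/log/syslog)
# sleep 60
# after=$(file_size /var/log/syslog)
# bytes_per_day $(( after - before )) 60
```

For instance, 1,000,000 bytes of growth in 60 seconds works out to 1,440,000,000 bytes per day, roughly 1.4 GB, which a daily rotation cannot keep small.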
Solution: Give Debezium Its Own Logs
The fix is to create a custom Logback configuration for Debezium:
- Lower log level from DEBUG → INFO/WARN
- Redirect logs to a dedicated file with its own rotation
- Keep /var/log/syslog small and useful
- Create a custom Logback config
Save to /opt/debezium/conf/logback.xml:
<configuration>
  <!-- File appender with daily rotation -->
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>/opt/debezium/logs/debezium.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>/opt/debezium/logs/debezium.%d{yyyy-MM-dd}.log.gz</fileNamePattern>
      <maxHistory>7</maxHistory>
    </rollingPolicy>
    <encoder>
      <pattern>%d{yyyy-MM-dd HH:mm:ss} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <!-- Reduce verbosity -->
  <logger name="io.debezium" level="INFO"/>
  <logger name="org.apache.kafka" level="WARN"/>
  <logger name="com.datastax" level="INFO"/>
  <root level="INFO">
    <appender-ref ref="FILE"/>
  </root>
</configuration>
- Create the log directory
sudo mkdir -p /opt/debezium/logs
sudo chown cassandra:cassandra /opt/debezium/logs
- Update your run script
In /opt/debezium/bin/run.sh, change the final exec line to:
exec "$JAVA_BIN" $HEAP_OPTS "${JVM_OPTS[@]}" \
-Dlogback.configurationFile=/opt/debezium/conf/logback.xml \
-jar "$JAR_PATH" "$CONF_PATH"
- Restart the systemd service
sudo systemctl daemon-reload
sudo systemctl restart debezium-cassandra.service
Now Debezium logs are written to /opt/debezium/logs/debezium.log with daily rotation, and /var/log/syslog remains small.
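A quick sanity check after the restart can confirm that the run script carries the Logback flag and that the new log file no longer contains DEBUG lines. The sketch below uses two hypothetical helpers, has_logback_flag and debug_line_count; it assumes the log pattern from the configuration above, where the level appears surrounded by spaces.

```shell
# Succeed if the given run script passes a Logback config file to the JVM.
has_logback_flag() {
  grep -q -- '-Dlogback.configurationFile=' "$1"
}

# Print the number of DEBUG lines in a log file.
# 0 suggests the INFO/WARN levels took effect.
debug_line_count() {
  # grep -c prints 0 but exits 1 when nothing matches; "|| true" hides that.
  grep -c ' DEBUG ' "$1" || true
}

# Example:
# has_logback_flag /opt/debezium/bin/run.sh && echo "flag present"
# debug_line_count /opt/debezium/logs/debezium.log
```

If the count is not zero, the JVM is probably still picking up the bundled configuration, so the path given to -Dlogback.configurationFile is the first thing to recheck.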
Cleanup
Once logging is redirected, you can safely clear the oversized system logs:
sudo truncate -s 0 /var/log/syslog
sudo truncate -s 0 /var/log/syslog.1
Conclusion
The root cause of the bloated /var/log/syslog wasn’t rsyslog or logrotate, but Debezium Cassandra logging at DEBUG level.
By providing a custom Logback configuration you:
- Prevent syslog from ballooning
- Still capture Debezium logs with rotation
- Reduce verbosity to INFO/WARN
If you followed my Cassandra CDC → Kafka → Spark → Delta Lake guide, apply this fix immediately after deploying Debezium to avoid filling up your disk.