Fixing Huge /var/log/syslog Caused by Debezium Cassandra Connector

When I followed my earlier guide, “End-to-End Stream: Cassandra 5.0.5 CDC → Kafka 4.0 (SASL/SSL) → Spark 3.5.6 → Delta Lake”, everything ran smoothly at first. But after a few days, I noticed /var/log/syslog had ballooned to tens of gigabytes.

This post explains why that happens and how to fix it by reconfiguring Debezium’s logging.


Symptoms

  • /var/log/syslog grows to 9–20 GB in just a few days
  • Rotated logs (syslog.1) are even bigger
  • Examining the log shows endless DEBUG lines like:
DEBUG io.debezium.connector.cassandra.CommitLogIdxParser - Polling for completeness...
DEBUG io.debezium.connector.base.ChangeEventQueue - checking for more records...
DEBUG io.debezium.connector.base.ChangeEventQueue - no records available...

By default, Debezium Cassandra’s bundled Logback config logs at DEBUG level. Because the connector is started under systemd with java -jar, systemd captures everything it prints and passes it through journald/rsyslog into /var/log/syslog.

The result: log storm.
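
To confirm that Debezium really is the source of the noise before changing anything, it helps to count the matching lines. The helper below is a minimal sketch: the function name count_debezium_debug is my own, and it assumes the log lines contain the literal text "DEBUG io.debezium", as in the sample above.

```shell
#!/usr/bin/env bash
# Count DEBUG lines emitted by Debezium classes in a log file.
# Assumption: matching lines contain the literal text "DEBUG io.debezium",
# as in the sample output shown in this post.
count_debezium_debug() {
  local log_file="$1"
  # grep -c prints the number of matching lines. It exits with status 1
  # when the count is 0, so "|| true" keeps the helper safe under "set -e".
  grep -c 'DEBUG io\.debezium' "$log_file" || true
}
```

Run it as a user who can read the file, for example count_debezium_debug /var/log/syslog, and compare the result with the file sizes reported by sudo du -sh /var/log/syslog*.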


Why Logrotate Didn’t Save You

logrotate still runs every day, but Debezium can emit gigabytes of logs per day. By the time rotation occurs at midnight, the log file is already enormous.

So this is not a logrotate failure; there is simply too much noise going into syslog.
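
A quick back-of-the-envelope calculation makes the point. The sketch below extrapolates a daily volume from two size samples; the helper name bytes_per_day and the numbers in the example are illustrative, not measurements from my system.

```shell
# Extrapolate log growth to bytes per day from two size samples.
# Arguments: size_before size_after elapsed_seconds
# (all integers, elapsed_seconds must be greater than 0).
bytes_per_day() {
  echo $(( ($2 - $1) * 86400 / $3 ))
}

# To take real samples (stat -c %s is the GNU coreutils form):
#   before=$(stat -c %s /var/log/syslog); sleep 60
#   after=$(stat -c %s /var/log/syslog)
#   bytes_per_day "$before" "$after" 60
```

At a hypothetical 1 MB per minute, bytes_per_day 0 1000000 60 prints 1440000000, or roughly 1.4 GB per day, all of it landing in a single file until the next rotation.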


Solution: Give Debezium Its Own Logs

The fix is to create a custom Logback configuration for Debezium:

  • Lower log level from DEBUG → INFO/WARN
  • Redirect logs to a dedicated file with its own rotation
  • Keep /var/log/syslog small and useful

  1. Create a custom Logback config

Save to /opt/debezium/conf/logback.xml:

<configuration>
    <!-- File appender with daily rotation -->
    <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>/opt/debezium/logs/debezium.log</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>/opt/debezium/logs/debezium.%d{yyyy-MM-dd}.log.gz</fileNamePattern>
            <maxHistory>7</maxHistory>
        </rollingPolicy>
        <encoder>
            <pattern>%d{yyyy-MM-dd HH:mm:ss} %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <!-- Reduce verbosity -->
    <logger name="io.debezium" level="INFO"/>
    <logger name="org.apache.kafka" level="WARN"/>
    <logger name="com.datastax" level="INFO"/>

    <root level="INFO">
        <appender-ref ref="FILE"/>
    </root>
</configuration>

  2. Create the log directory

sudo mkdir -p /opt/debezium/logs
sudo chown cassandra:cassandra /opt/debezium/logs

  3. Update your run script

In /opt/debezium/bin/run.sh, change the final exec line to:

exec "$JAVA_BIN" $HEAP_OPTS "${JVM_OPTS[@]}" \
  -Dlogback.configurationFile=/opt/debezium/conf/logback.xml \
  -jar "$JAR_PATH" "$CONF_PATH"

  4. Restart the systemd service

sudo systemctl daemon-reload
sudo systemctl restart debezium-cassandra.service

Now Debezium logs are written to /opt/debezium/logs/debezium.log with daily rotation, and /var/log/syslog remains small.
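
It is worth verifying that the running JVM actually picked up the new flag. The following is a small sketch rather than a definitive check: the helper name uses_logback_config is my own, and it simply looks for the -Dlogback.configurationFile option in a command-line string.

```shell
# Succeed if a JVM command line sets logback.configurationFile to the
# expected path. Both arguments are plain strings.
uses_logback_config() {
  cmdline="$1"
  expected="$2"
  # Pad with spaces so the option matches as a whole word.
  case " $cmdline " in
    *" -Dlogback.configurationFile=$expected "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage against the running service (service name as used in this post):
#   pid=$(systemctl show -p MainPID --value debezium-cassandra.service)
#   uses_logback_config "$(ps -o args= -p "$pid")" \
#     /opt/debezium/conf/logback.xml && echo "custom config active"
```

Because run.sh ends with exec, the service's main PID should be the java process itself. After that, tail -n 20 /opt/debezium/logs/debezium.log should show fresh INFO lines, and no new Debezium DEBUG lines should appear in /var/log/syslog.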


Cleanup

Once logging is redirected, you can safely clear the oversized system logs:

sudo truncate -s 0 /var/log/syslog
sudo truncate -s 0 /var/log/syslog.1
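
Older rotations may also be oversized. As a rough sketch, the helper below (list_large_files is a name I made up for this post) lists files above a size threshold so you can decide what else to clear.

```shell
# List regular files under a directory that exceed a size threshold.
# The threshold is passed straight to find's -size test, for example
# +1000c (bytes, POSIX) or +1G (a GNU find extension).
list_large_files() {
  find "$1" -type f -size "$2"
}
```

For example, sudo find is not needed for a dry look at readable files: list_large_files /var/log +1G prints any file larger than 1 GiB on systems with GNU find.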

Conclusion

The root cause of the bloated /var/log/syslog wasn’t rsyslog or logrotate, but Debezium Cassandra logging at DEBUG level.

By providing a custom Logback configuration you:

  • Prevent syslog from ballooning
  • Still capture Debezium logs with rotation
  • Reduce verbosity to INFO/WARN

If you followed my Cassandra CDC → Kafka → Spark → Delta Lake guide, apply this fix immediately after deploying Debezium to avoid filling up your disk.
