Goal of this post
Document the exact debugging path we followed when our Delta table started showing more rows than Cassandra—step by step—until we proved it was caused by per-cell TTL on specific value columns. Only after proving it did we conclude this behavior matches our architecture (warehouse keeps all history).
Stack & context
ThingsBoard CE 4.x on Cassandra (thingsboard.ts_kv_cf) → CDC (Debezium) → Kafka → Spark 4.0 → Delta Lake. For audits we used cqlsh COPY (no Spark Cassandra connector in this env).
- Freeze the pipeline (no new writes)
We stopped ThingsBoard first to quiesce writes. With the source idle, Cassandra counts and the CDC tail stay stable while we compare.
- Symptom: counts diverge after a few days
- Cassandra (LOCAL_QUORUM): ≈ 1,298,381
- Delta (Spark): ≈ 1,301,300
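Both figures are just COUNT(*) on each side, along these lines (a sketch; cqlsh may need --request-timeout raised for a full-table count):
-- cqlsh
CONSISTENCY LOCAL_QUORUM;
SELECT COUNT(*) FROM thingsboard.ts_kv_cf;
-- spark-sql
SELECT COUNT(*) FROM ts_kv_cf;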
- Quick sanity check: is it just duplicates?
Are we double-writing rows in Delta?
-- spark-sql
WITH dup AS (
  SELECT entity_type, entity_id, key, ts, COUNT(*) AS c
  FROM ts_kv_cf
  GROUP BY entity_type, entity_id, key, ts
  HAVING COUNT(*) > 1
)
SELECT
  SUM(c - 1) AS extra_rows_due_to_dups,
  COUNT(*) AS duplicate_groups
FROM dup;
-- Result in our case: 0 duplicates
Duplicates ruled out. Move on.
- Establish ground truth: snapshot Cassandra, load into Spark
Export from Cassandra
-- cqlsh
CONSISTENCY LOCAL_QUORUM;
COPY thingsboard.ts_kv_cf (
entity_type, entity_id, key, partition, ts, str_v, long_v, dbl_v, bool_v, json_v
) TO '/tmp/ts_kv_cf_full.csv' WITH HEADER=TRUE;
Load snapshot in Spark
-- spark-sql
CREATE OR REPLACE TEMP VIEW cass_full (
entity_type STRING, entity_id STRING, key STRING, partition BIGINT, ts BIGINT,
str_v STRING, long_v BIGINT, dbl_v DOUBLE, bool_v BOOLEAN, json_v STRING
)
USING csv
OPTIONS (path '/tmp/ts_kv_cf_full.csv', header 'true', inferSchema 'false');
SELECT COUNT(*) AS cass_rows FROM cass_full; -- ~1,298,381
SELECT COUNT(*) AS delta_rows FROM ts_kv_cf; -- ~1,301,300
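Because the snapshot round-trips through CSV, a cheap extra check here (a sketch, not strictly required) is to confirm the join-key columns parsed cleanly before the anti-joins in the next step:
-- spark-sql
SELECT COUNT(*) AS bad_key_rows
FROM cass_full
WHERE entity_type IS NULL OR entity_id IS NULL OR key IS NULL OR ts IS NULL;
-- expect 0; anything else means the CSV export/parse needs a second look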
- Diff the two sets (who has what?)
-- Delta-only (present in Delta, missing in Cassandra)
WITH d AS (SELECT entity_type, entity_id, key, ts FROM ts_kv_cf),
c AS (SELECT entity_type, entity_id, key, ts FROM cass_full)
SELECT COUNT(*) AS delta_only
FROM d LEFT ANTI JOIN c
ON d.entity_type=c.entity_type AND d.entity_id=c.entity_id AND d.key=c.key AND d.ts=c.ts; -- 2,919
-- Cassandra-only
WITH d AS (SELECT entity_type, entity_id, key, ts FROM ts_kv_cf),
     c AS (SELECT entity_type, entity_id, key, ts FROM cass_full)
SELECT COUNT(*) AS cass_only
FROM c LEFT ANTI JOIN d
  ON c.entity_type=d.entity_type AND c.entity_id=d.entity_id AND c.key=d.key AND c.ts=d.ts; -- 0
All the drift is Delta-only: 2,919 rows.
- What exactly are those 2,919 rows?
CREATE OR REPLACE TEMP VIEW delta_only AS
SELECT d.*
FROM ts_kv_cf d
LEFT ANTI JOIN cass_full c
ON d.entity_type=c.entity_type AND d.entity_id=c.entity_id AND d.key=c.key AND d.ts=c.ts;
SELECT key, COUNT(*) AS cnt
FROM delta_only
GROUP BY key
ORDER BY cnt DESC
LIMIT 30;
Result (each exactly 417 rows):
cpuCount 417
totalMemory 417
memoryUsage 417
clusterMode 417
totalDiscSpace 417
discUsage 417
cpuUsage 417
Seven keys × 417 = 2,919 → the gap is entirely ThingsBoard system metrics (not business telemetry).
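One more slice worth looking at before heading to cqlsh (a sketch): how old is the newest Delta-only row per key? If something expires these rows at the source, even the newest missing row should be older than the retention window.
-- spark-sql
SELECT key,
       MAX(ts) AS newest_ts_ms,
       ROUND((unix_millis(current_timestamp()) - MAX(ts)) / 86400000.0, 1) AS newest_age_days
FROM delta_only
GROUP BY key
ORDER BY key;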
- Verify in Cassandra: do these keys use TTL (and which column has it)?
In ts_kv_cf, each value is stored in exactly one of five typed columns:
long_v (integers), dbl_v (doubles), str_v (strings), bool_v (booleans), json_v (JSON).
TTL in Cassandra is per cell. TTL(column) returns the remaining seconds for that cell; NULL means the cell has no TTL.
-- cqlsh
SELECT ts,
       WRITETIME(long_v) AS wt_us,
       TTL(long_v) AS ttl_l,
       TTL(dbl_v)  AS ttl_d,
       TTL(str_v)  AS ttl_s,
       TTL(bool_v) AS ttl_b,
       TTL(json_v) AS ttl_j,
       toUnixTimestamp(now()) AS now_ms
FROM thingsboard.ts_kv_cf
WHERE key='cpuUsage'
LIMIT 50 ALLOW FILTERING;
What we saw
- ttl_l (i.e., TTL(long_v)) had positive values (e.g., 17, 88, 150, …).
- ttl_d/ttl_s/ttl_b/ttl_j were NULL.
Conclusion: for cpuUsage, the value is in long_v, and that cell has TTL.
- Recover the original TTL (not just remaining)
TTL(column) is remaining seconds. Combine with WRITETIME(column) and now() to estimate original TTL:
orig_ttl_s ≈ TTL(column) + (now_ms/1000 − WRITETIME(column)/1e6)
Example (real row): WRITETIME(long_v)=1755345861005858, TTL(long_v)=17, now_ms=1755950644989 → ≈ 604,801 s ≈ 7 days (7×86,400 = 604,800).
Multiple rows clustered around ~7 days, confirming a consistent per-cell TTL.
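The same arithmetic as a throwaway query, with the literals from the example row above:
-- spark-sql
SELECT 17 + (1755950644989 / 1000.0 - 1755345861005858 / 1e6) AS orig_ttl_s; -- ≈ 604,801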
- Confirm table settings & CDC behavior
From DESCRIBE TABLE thingsboard.ts_kv_cf:
PRIMARY KEY ((entity_type, entity_id, key, partition), ts)
default_time_to_live = 0
cdc = true
- default_time_to_live = 0 → no table-wide TTL; the TTL we observed is set at write time on the value cell (ThingsBoard Save Timeseries node or metadata.TTL); a per-cell TTL write is sketched after this list.
- cdc = true → TTL expiry does not emit CDC delete events. Debezium/Kafka won’t see deletes when a cell ages out.
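For context, a per-cell TTL write in CQL looks like the sketch below (placeholder values; not the exact statement ThingsBoard generates). Only the cells touched by that statement carry the TTL.
-- cqlsh (illustration only; placeholder values)
INSERT INTO thingsboard.ts_kv_cf (entity_type, entity_id, key, partition, ts, long_v)
VALUES ('TENANT', 13814000-1dd2-11b2-8080-808080808080, 'cpuUsage', 1755216000000, 1755345861005, 42)
USING TTL 604800; -- 604,800 s = 7 days, applied to exactly these cells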
- The “aha”: why Delta > Cassandra
Only after the steps above did we accept the explanation:
- ThingsBoard writes those system metrics with per-cell TTL ≈ 7 days.
- When TTL expires, Cassandra silently stops returning those cells.
- CDC emits no delete on TTL expiry.
- Our Spark/Delta sink is append-only—we do not delete from the warehouse.
Therefore Delta keeps historical rows that Cassandra has already expired → over time Delta > Cassandra.
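This also gives a lake-side cross-check (a sketch; the key list and the 7-day window come from the findings above): Delta rows for those keys that are older than the TTL window should account for roughly the whole gap.
-- spark-sql
SELECT COUNT(*) AS rows_past_ttl_window
FROM ts_kv_cf
WHERE key IN ('cpuCount','totalMemory','memoryUsage','clusterMode',
              'totalDiscSpace','discUsage','cpuUsage')
  AND ts < unix_millis(current_timestamp()) - 7 * 86400 * 1000;
-- should land near the Delta-only count (2,919 when we measured)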
- So… is this a problem? Our decision
After proving the mechanics, we concluded that this is neither a bug nor a limitation; it reflects our design:
- Operational store (Cassandra): hot data with TTL on selected keys (system metrics) to control footprint.
- Data warehouse (Delta): append-only, keeps all history for audits/replay. We don’t delete in Spark/Delta.
- CDC reality: TTL expiry doesn’t produce deletes; the lake won’t mirror removals and that’s fine for us.
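One practical consequence for ongoing reconciliation (a sketch using the key list found above): when comparing row counts between the two stores, exclude the TTL'd system-metric keys so the check compares like with like.
-- spark-sql (apply the same filter to the Cassandra side / snapshot)
SELECT COUNT(*) AS delta_rows_excl_ttl_keys
FROM ts_kv_cf
WHERE key NOT IN ('cpuCount','totalMemory','memoryUsage','clusterMode',
                  'totalDiscSpace','discUsage','cpuUsage');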