chore(clickhouse): add ttl only drop parts to tables#2371
chore(clickhouse): add ttl only drop parts to tables#2371
Conversation
PR SummaryMedium Risk Overview Reviewed by Cursor Bugbot for commit 8e353e4. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
LGTM — the functional SQL is correct and the nits are already captured in inline comments.
Extended reasoning...
Overview
Single new ClickHouse migration file adding ttl_only_drop_parts = 1 to five local MergeTree tables (Up) and reverting (Down). No application logic, no schema changes, no data migrations.
Security risks
None. This is a table-settings-only migration with no user-facing data exposure or auth changes.
Level of scrutiny
Low. The change is mechanical and self-contained: five identical ALTER TABLE statements in each direction. The tables affected are internal metrics/events tables.
Other factors
Two nits were flagged by the bug hunter and are attached as inline comments: a comment that says 'partitions' where it should say 'parts', and a Down migration that uses MODIFY SETTING = 0 rather than RESET SETTING. Neither affects runtime behavior today (ClickHouse default for ttl_only_drop_parts is 0), so the functional correctness of the migration is not in question. The inline comments already communicate these issues clearly.
| -- Enable ttl_only_drop_parts on all local MergeTree tables. | ||
| -- Since all tables use PARTITION BY toDate(timestamp), TTL boundaries align | ||
| -- with partition boundaries. This avoids expensive part rewrites during TTL | ||
| -- cleanup -- ClickHouse will drop entire partitions instead. |
There was a problem hiding this comment.
🟡 The comment on lines 2-5 says "ClickHouse will drop entire partitions instead" but ttl_only_drop_parts = 1 drops entire parts (MergeTree physical data chunks), not partitions. The setting name itself uses "parts" — consider updating the comment to say "ClickHouse will drop entire parts instead" to accurately reflect the mechanism.
Extended reasoning...
What the comment says vs. what the setting does
The migration comment reads: "ClickHouse will drop entire partitions instead." However, ttl_only_drop_parts operates at the part level. In ClickHouse's MergeTree engine, a partition is a logical grouping of data (here, one calendar day via PARTITION BY toDate(timestamp)), while a part is a physical data chunk on disk. Multiple parts can exist within a single partition — especially before background merges have run or during active ingestion. The setting name itself (ttl_only_drop_parts) uses "parts," not "partitions."
How the mechanism works
With ttl_only_drop_parts = 1, ClickHouse checks each part individually: if all rows in that part satisfy the TTL condition, the entire part is dropped atomically without a rewrite. If even one row in a part has not expired, the part is rewritten (with expired rows removed). This is fundamentally different from dropping a whole partition.
Why the refutation partially holds — and why the comment is still imprecise
The refuting verifier argues that because TTL boundaries align with partition boundaries (daily partitions + daily TTL), all rows in every part of an expired partition will be expired, so every part in that partition gets dropped — effectively a partition drop. This is often true in practice for older, fully-merged data. However, it is not guaranteed: during active ingestion or before background merges complete, a single day-partition may contain multiple parts. Rows with the same date are spread across those parts, all rows in each part are still expired, so they still all get dropped — but the mechanism is still part-level, not partition-level. The comment should reflect the actual ClickHouse mechanism rather than the incidental practical outcome.
Concrete proof
Suppose partition 2026-04-01 contains two parts: part_A (10k rows) and part_B (5k rows), both with all rows having timestamp on 2026-04-01 (expired). With ttl_only_drop_parts = 1, ClickHouse drops part_A and part_B independently — two part-level drop operations, not one partition-level drop. The partition ceases to exist only because all its constituent parts were dropped. If part_B had even one non-expired row, only part_A would be dropped and part_B would be rewritten — no concept of "partition drop" applies here at all.
Impact and fix
The actual SQL statements are correct and functional; this is purely a comment inaccuracy with no runtime impact. The fix is a one-word change: replace "partitions" with "parts" in the comment, or rephrase to "ClickHouse will drop entire parts (and thus entire partitions once fully merged) instead of rewriting them."
| ALTER TABLE sandbox_metrics_gauge_local MODIFY SETTING ttl_only_drop_parts = 0; | ||
| ALTER TABLE team_metrics_gauge_local MODIFY SETTING ttl_only_drop_parts = 0; | ||
| ALTER TABLE team_metrics_sum_local MODIFY SETTING ttl_only_drop_parts = 0; | ||
| ALTER TABLE sandbox_events_local MODIFY SETTING ttl_only_drop_parts = 0; | ||
| ALTER TABLE sandbox_host_stats_local MODIFY SETTING ttl_only_drop_parts = 0; |
There was a problem hiding this comment.
🟡 The Down migration uses MODIFY SETTING ttl_only_drop_parts = 0 to roll back, but the semantically correct inverse of MODIFY SETTING is RESET SETTING, which removes the explicit entry from table metadata entirely. After running this Down migration, SHOW CREATE TABLE will display an explicit ttl_only_drop_parts = 0 that was never present before the migration ran, meaning the rollback does not fully restore the original table state — use ALTER TABLE ... RESET SETTING ttl_only_drop_parts instead.
Extended reasoning...
What the bug is and how it manifests
In ClickHouse, table settings can be in two states: explicitly set (stored in table metadata) or unset (inherited from the server default). The Up migration uses MODIFY SETTING ttl_only_drop_parts = 1 to write an explicit value into each table's metadata. The correct inverse is RESET SETTING ttl_only_drop_parts, which removes the stored entry and returns the table to its original 'unset' state. Instead, the Down migration uses MODIFY SETTING ttl_only_drop_parts = 0, which overwrites the explicit 1 with an explicit 0 — a different explicit value, not the absence of a value.
The specific code path that triggers it
Lines 13–17 of 20260413080000_ttl_only_drop_parts.sql run five ALTER TABLE ... MODIFY SETTING ttl_only_drop_parts = 0 statements. Each of those writes an explicit ttl_only_drop_parts = 0 into the respective table's metadata in the ClickHouse system catalog.
Why existing code doesn't prevent it
The migration file is plain SQL with no validation layer. There is no test that compares SHOW CREATE TABLE output before and after a round-trip Up+Down migration, so the metadata discrepancy goes undetected.
What the impact would be
Today the impact is purely cosmetic: the ClickHouse default for ttl_only_drop_parts is 0, so runtime TTL behavior is identical. However, after rolling back, SHOW CREATE TABLE will output SETTINGS ttl_only_drop_parts = 0 for all five tables even though that setting was never explicitly present before. This means the tables are not in their pre-migration state, which could confuse operators and tooling that inspects table metadata. If ClickHouse ever changes the server-level default (unlikely but possible), these tables would not inherit the new default because they have an explicit override pinned to 0.
How to fix it
Replace every MODIFY SETTING ttl_only_drop_parts = 0 in the -- +goose Down block with RESET SETTING ttl_only_drop_parts:
-- +goose Down
ALTER TABLE sandbox_metrics_gauge_local RESET SETTING ttl_only_drop_parts;
ALTER TABLE team_metrics_gauge_local RESET SETTING ttl_only_drop_parts;
ALTER TABLE team_metrics_sum_local RESET SETTING ttl_only_drop_parts;
ALTER TABLE sandbox_events_local RESET SETTING ttl_only_drop_parts;
ALTER TABLE sandbox_host_stats_local RESET SETTING ttl_only_drop_parts;Step-by-step proof
- Before migration:
SHOW CREATE TABLE sandbox_metrics_gauge_local— nottl_only_drop_partsappears in theSETTINGSclause. - Run Up migration:
SHOW CREATE TABLEnow showsSETTINGS ..., ttl_only_drop_parts = 1. - Run Down migration as written (
MODIFY SETTING = 0):SHOW CREATE TABLEshowsSETTINGS ..., ttl_only_drop_parts = 0— an explicit entry that did not exist in step 1. - Run Down migration with
RESET SETTINGinstead:SHOW CREATE TABLEshows nottl_only_drop_partsentry — identical to step 1, correctly restoring the original state.
This should decrease load on clickhouse as the data will be dropped once the whole partition is expired instead of per row