SPOC-484: TRANSDISCARD/SUB_DISABLE: switch from subtransaction replay to read-only dry-run logging#400
Conversation
…-run replay In TRANSDISCARD and SUB_DISABLE modes, the second (retry) pass replays the transaction as read-only. Each DML handler logs a row to spock.exception_log instead of attempting the operation. The record that originally caused the error carries the real error message; other records get an empty string. The failed record is identified by xact_action_counter (saved as failed_action in SpockExceptionLog), which is unique per DML within a transaction. In DISCARD mode, each DML still runs inside a subtransaction; only the failed ones are logged. Add dry_run_logging regression test covering TRANSDISCARD (absent table), DISCARD (truncated table), and SUB_DISABLE (deleted row), with INSERT_EXISTS conflict on drl_t1 to verify that dry-run modes leave spock.resolutions empty while DISCARD populates it.
📝 WalkthroughWalkthroughAdds a new Changes
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…producer Replace wait_slot_confirm_lsn with sync_event/wait_for_sync_event to eliminate replication race conditions when querying exception_log on the subscriber. Add a TRANSDISCARD sub-test that mixes DML with a replicate_ddl call inside a single failing transaction. This reproduces a duplicate DDL logging bug in handle_queued_ddl detected by CodeRabbit during review: the TRANSDISCARD/SUB_DISABLE branch logs via add_entry_to_exception_log, then falls through to the generic logging block which logs again. Remove test section numbering per project conventions.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/regress/sql/dry_run_logging.sql`:
- Around line 107-110: Update the test comment that currently reads "Expect
entries with duplicated DDL record" to reflect the intended behavior of no
duplicate DDL logging; modify the comment above the SELECT querying
spock.exception_log (the query using table_name, operation, (error_message <>
'') AS has_error, ddl_statement ordered by command_counter) to state that the
test expects NO duplicate DDL entries (i.e., a single DDL record per command).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fc49989d-5676-4a35-890d-8275efe21d33
⛔ Files ignored due to path filters (1)
tests/regress/expected/dry_run_logging.outis excluded by!**/*.out
📒 Files selected for processing (2)
src/spock_apply.ctests/regress/sql/dry_run_logging.sql
…ISABLE The TRANSDISCARD/SUB_DISABLE branch logged via add_entry_to_exception_log then fell through to the generic logging block which called should_log_exception and logged again, producing duplicate DDL entries. Merge both call sites into a single add_entry_to_exception_log after the if/else branches. The TRANSDISCARD/SUB_DISABLE branch now only extracts the SQL text; actual logging is done in the shared block. Add Assert(my_exception_log_index >= 0) before accessing exception_log_ptr since use_try_block guarantees the index was set in handle_begin. Bug reported by CodeRabbit during review.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
src/spock_apply.c (3)
1287-1308: Consider adding defensive assertion before accessingexception_log_ptr.The code at line 2302 includes
Assert(my_exception_log_index >= 0)before accessingexception_log_ptr, but the DML handlers (INSERT, UPDATE, DELETE) don't have this check. Whilemy_exception_log_indexshould be valid whenuse_try_blockis true (set duringhandle_begin), adding the assertion would provide consistent defensive validation.🛡️ Suggested defensive assertion
if (exception_behaviour == TRANSDISCARD || exception_behaviour == SUB_DISABLE) { + Assert(my_exception_log_index >= 0); + /* * TRANSDISCARD and SUB_DISABLE: skip the DML and log the🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` around lines 1287 - 1308, Add a defensive Assert that my_exception_log_index is valid before dereferencing exception_log_ptr in the DML handlers (the INSERT/UPDATE/DELETE branches where exception_log_ptr[...] is accessed and log_insert_exception(...) is called). Specifically, in the block using exception_log_ptr[my_exception_log_index] (and before setting exception_log_ptr[my_exception_log_index].local_tuple or reading .initial_error_message), insert Assert(my_exception_log_index >= 0) to mirror the check used in handle_begin/other code paths so the handlers (where use_try_block is expected) validate the index before use.
1299-1304: Optional: Extract repeated error_msg pattern to helper function.The same logic for computing
error_msgis repeated in four handlers (INSERT, UPDATE, DELETE, SQL):char *error_msg = (xact_action_counter == exception_log_ptr[my_exception_log_index].failed_action && exception_log_ptr[my_exception_log_index].initial_error_message[0]) ? exception_log_ptr[my_exception_log_index].initial_error_message : NULL;This could be extracted to a helper function to improve maintainability and ensure consistent behavior if the logic ever needs to change.
♻️ Suggested helper function
/* * Get the error message to log for the current action during dry-run replay. * Returns the initial error message only for the action that caused the * original failure; returns NULL for all other actions. */ static inline char * get_dry_run_error_msg(void) { Assert(my_exception_log_index >= 0); if (xact_action_counter == exception_log_ptr[my_exception_log_index].failed_action && exception_log_ptr[my_exception_log_index].initial_error_message[0] != '\0') return exception_log_ptr[my_exception_log_index].initial_error_message; return NULL; }Also applies to: 1477-1482, 1594-1599, 2307-2312
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` around lines 1299 - 1304, Extract the repeated error_msg computation into a small static helper (e.g. static inline char *get_dry_run_error_msg(void)) that reads my_exception_log_index, xact_action_counter and exception_log_ptr and returns exception_log_ptr[my_exception_log_index].initial_error_message when xact_action_counter == exception_log_ptr[my_exception_log_index].failed_action and the initial_error_message is non-empty, otherwise returns NULL; replace the duplicated ternary in the four handlers (the blocks around the places that set error_msg in INSERT/UPDATE/DELETE/SQL handlers) with a call to get_dry_run_error_msg() and keep the Assert(my_exception_log_index >= 0) in the helper.
871-878: Consider clearinginitial_error_messagefor consistency.Lines 840-843 (TRANSDISCARD path) clear
initial_error_message[0] = '\0', but lines 871-876 (SUB_DISABLE path) do not. While this is harmless since the worker exits afterspock_disable_subscription(), adding the clear would maintain consistency between the two paths.♻️ Suggested change for consistency
exception_log = &exception_log_ptr[my_exception_log_index]; exception_log->commit_lsn = InvalidXLogRecPtr; + exception_log->initial_error_message[0] = '\0'; MySpockWorker->restart_delay = 0;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` around lines 871 - 878, In the SUB_DISABLE path add the same clearing of the saved initial error string as done in the TRANSDISCARD path: after assigning exception_log = &exception_log_ptr[my_exception_log_index] and before calling spock_disable_subscription()/elog(ERROR...), set exception_log->initial_error_message[0] = '\0' so the SpockExceptionLog (exception_log) is consistent with the TRANSDISCARD handling; keep the rest (commit_lsn, MySpockWorker->restart_delay, spock_disable_subscription(), elog) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/spock_apply.c`:
- Around line 1287-1308: Add a defensive Assert that my_exception_log_index is
valid before dereferencing exception_log_ptr in the DML handlers (the
INSERT/UPDATE/DELETE branches where exception_log_ptr[...] is accessed and
log_insert_exception(...) is called). Specifically, in the block using
exception_log_ptr[my_exception_log_index] (and before setting
exception_log_ptr[my_exception_log_index].local_tuple or reading
.initial_error_message), insert Assert(my_exception_log_index >= 0) to mirror
the check used in handle_begin/other code paths so the handlers (where
use_try_block is expected) validate the index before use.
- Around line 1299-1304: Extract the repeated error_msg computation into a small
static helper (e.g. static inline char *get_dry_run_error_msg(void)) that reads
my_exception_log_index, xact_action_counter and exception_log_ptr and returns
exception_log_ptr[my_exception_log_index].initial_error_message when
xact_action_counter == exception_log_ptr[my_exception_log_index].failed_action
and the initial_error_message is non-empty, otherwise returns NULL; replace the
duplicated ternary in the four handlers (the blocks around the places that set
error_msg in INSERT/UPDATE/DELETE/SQL handlers) with a call to
get_dry_run_error_msg() and keep the Assert(my_exception_log_index >= 0) in the
helper.
- Around line 871-878: In the SUB_DISABLE path add the same clearing of the
saved initial error string as done in the TRANSDISCARD path: after assigning
exception_log = &exception_log_ptr[my_exception_log_index] and before calling
spock_disable_subscription()/elog(ERROR...), set
exception_log->initial_error_message[0] = '\0' so the SpockExceptionLog
(exception_log) is consistent with the TRANSDISCARD handling; keep the rest
(commit_lsn, MySpockWorker->restart_delay, spock_disable_subscription(), elog)
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 031139ab-7009-4644-ab84-545c7b0c6bb2
⛔ Files ignored due to path filters (1)
tests/regress/expected/dry_run_logging.outis excluded by!**/*.out
📒 Files selected for processing (2)
src/spock_apply.ctests/regress/sql/dry_run_logging.sql
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/regress/sql/dry_run_logging.sql
spock.exception_loginstead of attempting the operation. Only the record that originally caused the error carries the real error message; other records get an empty string.xact_action_counter(saved asfailed_actioninSpockExceptionLog), which is unique per DML within a transaction.dry_run_loggingregression test covering all three exception modes with different breakage methods (absent table, truncated table, deleted row) and an INSERT_EXISTS conflict to verify that dry-run modes leavespock.resolutionsempty while DISCARD populates it.