Skip to content

Spock 382: Add per-subscription conflict statistics (PG18+)#356

Merged
mason-sharp merged 7 commits intomainfrom
spock-382
Mar 24, 2026
Merged

Spock 382: Add per-subscription conflict statistics (PG18+)#356
mason-sharp merged 7 commits intomainfrom
spock-382

Conversation

@danolivo
Copy link
Contributor

@danolivo danolivo commented Feb 20, 2026

Add per-subscription conflict statistics for Spock using the PG18 custom pgstat kind infrastructure.

  • Register a custom pgstat kind (SPOCK_PGSTAT_KIND_LRCONFLICTS) via pgstat_register_kind() to track replication conflict counts per subscription, keyed by MyDatabaseId since spock IDs are database-scoped.
  • Track all 7 SpockConflictType values: INSERT_EXISTS, UPDATE_ORIGIN_DIFFERS, UPDATE_EXISTS, UPDATE_MISSING, DELETE_ORIGIN_DIFFERS, DELETE_MISSING, and the Spock-specific DELETE_EXISTS. A spock_conflict_stat_index() mapping handles the non-contiguous enum value of SPOCK_CT_DELETE_EXISTS (101).
  • Report conflicts from both spock_report_conflict() (covers INSERT_EXISTS, UPDATE_EXISTS/ORIGIN_DIFFERS, DELETE_ORIGIN_DIFFERS, DELETE_MISSING, DELETE_EXISTS) and directly from spock_apply_heap.c for UPDATE_MISSING (which bypasses the resolution path).
  • Expose stats via spock.get_subscription_stats(subid) and spock.reset_subscription_stats(subid) SQL functions, with extension upgrade support in spock--5.0.6--6.0.0-devel.sql.
  • Subscription create/drop hooks call spock_stat_create_subscription() / spock_stat_drop_subscription() for transactional stats lifecycle.
  • Add conflict_stat regression test (PG18-only, gated in Makefile) covering UPDATE_MISSING, INSERT_EXISTS, DELETE_MISSING counter verification and reset.
  • Add stat counter checks to existing tuple_origin test covering UPDATE_MISSING, DELETE_MISSING, DELETE_EXISTS, and INSERT_EXISTS, gated behind \if version check with tuple_origin_1.out alternative output for PG < 18.
  • Fix replication_set regression test for deterministic output.
  • Improve CI workflow with cluster startup failure diagnostics.

@danolivo danolivo self-assigned this Feb 20, 2026
@danolivo danolivo added the feature New feature label Feb 20, 2026
@danolivo danolivo requested a review from mason-sharp February 20, 2026 09:45
@coderabbitai
Copy link

coderabbitai bot commented Feb 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds pgstat-backed per-subscription Spock conflict statistics (PostgreSQL 18+): new header and C implementation, SQL functions to fetch/reset stats, integration points in apply/conflict/create/drop flows, regression tests gated for PG18, Makefile conditional test inclusion, and CI startup diagnostics.

Changes

Cohort / File(s) Summary
Build Configuration
Makefile
Introduce REGRESS_PG18 (set to conflict_stat when PGVER >= 18) and append it to REGRESS so the conflict_stat regression runs only on PG18+.
Conflict Types & Header
include/spock_conflict.h
Make SPOCK_CT_INSERT_EXISTS = 0, add SPOCK_CONFLICT_NUM_TYPES = 7, and add static inline spock_conflict_stat_index(...) to map enum values to contiguous counter indices.
New Public Header
include/spock_conflict_stat.h
Add PG18+ guarded declarations: Spock_Stat_StatSubEntry, Spock_Stat_PendingSubEntry, and exported APIs (spock_stat_register_conflict_stat, spock_stat_report_subscription_conflict, spock_stat_create_subscription, spock_stat_drop_subscription, spock_stat_fetch_stat_subscription).
Core Implementation (PG18+)
src/spock_conflict_stat.c
New implementation registering a custom pgstat kind, shared/pending per-subscription counters, flush/reset callbacks, lifecycle APIs, and SQL-callable functions (spock_get_subscription_stats, spock_reset_subscription_stats).
Integration Points
src/spock.c, src/spock_functions.c, src/spock_apply_heap.c, src/spock_conflict.c
Add PG18+ guarded includes and calls: register stats in _PG_init(), create/drop subscription stats in lifecycle functions, and report conflicts from apply/conflict paths (gated by PG_VERSION_NUM >= 180000).
SQL Function Declarations
sql/spock--6.0.0-devel.sql, sql/spock--5.0.6--6.0.0-devel.sql
Add spock.get_subscription_stats(...) RETURNS record and spock.reset_subscription_stats(subid oid DEFAULT NULL) RETURNS void (C-backed; PG18+ implemented, pre-PG18 call sites return FEATURE_NOT_SUPPORTED).
Regression Tests
tests/regress/sql/conflict_stat.sql, tests/regress/sql/replication_set.sql, tests/regress/sql/tuple_origin.sql
Add conflict_stat.sql exercising multiple conflict counters; update replication_set.sql ordering/exception-log queries and initial truncation; extend tuple_origin.sql to conditionally assert counters when PG supports conflict stats.
CI Workflow
.github/workflows/installcheck.yml
Make cluster startup step tolerant (id: start_cluster, continue-on-error: true) and add a conditional diagnostic step to collect container status/logs on failure.

Poem

🐇 I hopped to the stats with a twitch of my nose,
Counters now tally where each conflict goes.
PG18 lent me ledger and a neat little chart,
Subscriptions get counted, each tear and each start.
Thump-thump—metrics saved close to my heart.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 76.47% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the main change: adding per-subscription conflict statistics for PostgreSQL 18+, which aligns with the substantive changeset.
Description check ✅ Passed The PR description clearly describes the main feature (per-subscription conflict statistics for Spock), implementation approach (custom pgstat kind for PG18+), key components, SQL API functions, test coverage, and related fixes, all directly aligned with the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch spock-382

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/spock_apply_heap.c (1)

65-68: Add version guard for consistency with other files.
Three other files guard this include with #if PG_VERSION_NUM >= 180000, while src/spock_apply_heap.c does not. Since the header is internally guarded, this is not a build safety issue, but aligning the include guard here would maintain consistency.

🔧 Suggested guard
-#include "spock_conflict_stat.h"
+#if PG_VERSION_NUM >= 180000
+#include "spock_conflict_stat.h"
+#endif
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply_heap.c` around lines 65 - 68, Wrap the inclusion of the
Postgres-18-only header in spock_apply_heap.c with the same version guard used
elsewhere: add an `#if` PG_VERSION_NUM >= 180000 ... `#endif` around the `#include`
"spock_conflict_stat.h" (and any other includes that are PG‑18+ in this file) so
the file matches the pattern used in the other sources; keep the guard
conditional on PG_VERSION_NUM and leave other includes (e.g., spock_common.h,
spock_conflict.h, spock_executor.h) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_conflict_stat.c`:
- Around line 92-104: The function spock_stat_report_subscription_conflict
currently only uses Assert to validate the SpockConflictType `type`, which can
lead to an out-of-bounds write to `pending->conflict_count[type]` in non-debug
builds; add a runtime guard that checks `if (type < 0 || type >=
SPOCK_CONFLICT_NUM_TYPES)` and handle it safely (e.g., log a warning via elog or
process logger and return early or map unsupported types like
SPOCK_CT_DELETE_LATE to a safe bucket) before performing
`pending->conflict_count[type]++` to prevent buffer overflow.

---

Nitpick comments:
In `@src/spock_apply_heap.c`:
- Around line 65-68: Wrap the inclusion of the Postgres-18-only header in
spock_apply_heap.c with the same version guard used elsewhere: add an `#if`
PG_VERSION_NUM >= 180000 ... `#endif` around the `#include` "spock_conflict_stat.h"
(and any other includes that are PG‑18+ in this file) so the file matches the
pattern used in the other sources; keep the guard conditional on PG_VERSION_NUM
and leave other includes (e.g., spock_common.h, spock_conflict.h,
spock_executor.h) unchanged.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_conflict_stat.c`:
- Around line 113-124: spock_stat_create_subscription currently calls
pgstat_get_entry_ref(..., create=true) and discards the PgStat_EntryRef*,
leaking the refcount; capture the return value from pgstat_get_entry_ref (assign
to a PgStat_EntryRef *ref), call pgstat_reset_entry (as is), then call
pgstat_release_entry_ref(ref) to release the reference; keep all existing
identifiers (SPOCK_PGSTAT_KIND_LRCONFLICTS, MyDatabaseId, subid) and ensure the
release happens in the same function before returning.
- Around line 27-31: SPOCK_PGSTAT_KIND_LRCONFLICTS currently uses value 27 which
conflicts with pg_stat_plans; change the defined constant in
src/spock_conflict_stat.c to a safe temporary placeholder
PGSTAT_KIND_EXPERIMENTAL (24) for development, and for production pick and
document an unused ID from the allowed range (25,26,28–32) and update the macro
definition accordingly; after choosing a production ID, reserve it on the
PostgreSQL custom cumulative statistics registry per the project guidelines.

---

Duplicate comments:
In `@src/spock_conflict_stat.c`:
- Around line 92-108: spock_stat_report_subscription_conflict currently aborts
the apply worker via ereport(ERROR) when an unexpected SpockConflictType is
seen; change this to a non-fatal warning and early return (use elog(WARNING) or
remove the guard) so the stats path won't kill the replication transaction.
Locate the check in spock_stat_report_subscription_conflict that compares type
!= SPOCK_CT_UPDATE_MISSING and replace the ereport(ERROR, errcode(...)) with
elog(WARNING, ...) plus an immediate return (or simply drop the guard) so only
SPOCK_CT_UPDATE_MISSING proceeds to pgstat_prep_pending_entry and
pending->conflict_count[type]++.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/spock_conflict_stat.c`:
- Around line 27-32: The Kind ID defined as SPOCK_PGSTAT_KIND_LRCONFLICTS (value
28) may be already reserved — verify the PostgreSQL CustomCumulativeStats
registry (per the wiki) to find an unused kind ID, reserve it, and update the
`#define` SPOCK_PGSTAT_KIND_LRCONFLICTS to that verified value; also update or
remove the TODO comment in spock_conflict_stat.c to record the chosen/registered
ID and add a brief note or comment with the registry URL and reservation
confirmation.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
.github/workflows/installcheck.yml (2)

79-82: docker compose cp … /dev/stdout is slightly non-standard.

This works on Linux but relies on /dev/stdout being writable as a file path by the Docker CLI. A more portable and idiomatic alternative that avoids the filesystem-device trick:

♻️ Alternative
-            docker compose cp pgedge-$node:/home/pgedge/logfile.log /dev/stdout 2>/dev/null | tail -80 || echo "(not available)"
+            docker compose exec pgedge-$node cat /home/pgedge/logfile.log 2>/dev/null | tail -80 || echo "(not available)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/installcheck.yml around lines 79 - 82, Replace the
non-portable "docker compose cp pgedge-$node:/home/pgedge/logfile.log
/dev/stdout" pattern with a portable exec+cat approach: run "docker compose exec
-T pgedge-$node cat /home/pgedge/logfile.log | tail -80" (or use "docker cp
pgedge-$node:/home/pgedge/logfile.log - | tail -80" if you prefer docker cp) so
the logfile is streamed to stdout without relying on /dev/stdout; update the
loop where "docker compose cp pgedge-$node:/home/pgedge/logfile.log /dev/stdout
2>/dev/null | tail -80 || echo \"(not available)\"" appears to use the exec/cat
(or docker cp -) form and preserve the error fallback.

72-74: docker compose ps -a missing || true could truncate diagnostic output.

GitHub Actions uses bash -eo pipefail by default. If docker compose ps -a unexpectedly returns non-zero, the script exits before collecting any per-node logs — the most useful part of the diagnostic.

🛡️ Suggested fix
-          docker compose ps -a
+          docker compose ps -a || true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/installcheck.yml around lines 72 - 74, The docker compose
status command in the workflow step ("docker compose ps -a") can return non‑zero
and abort the job due to bash -eo pipefail; change the command to ignore
failures (append a tolerant operator such as "|| true") so the step never fails
and subsequent diagnostics (per-node log collection) still run; update the line
containing "docker compose ps -a" in the installcheck workflow to append the
failure-tolerant suffix.
src/spock_apply_heap.c (1)

1073-1082: The comment wording "successful pathway" is misleading in this error branch.

elog(ERROR) is unconditional immediately after this block, so "successful" is confusing. The actual intent — only count when the error is not caught by a try block (first-occurrence path) — is explained in the body comment but the subject phrase could be clearer.

🔧 Suggested comment rewording
-#if PG_VERSION_NUM >= 180000
-		if (!MyApplyWorker->use_try_block)
-			/*
-			 * To avoid duplicate messages, only report the conflict on the
-			 * successful pathway.  We skip counting when the update logic has
-			 * already failed because the conflict would be misleading.
-			 */
-			spock_stat_report_subscription_conflict(MyApplyWorker->subid,
-													SPOCK_CT_UPDATE_MISSING);
-#endif
+#if PG_VERSION_NUM >= 180000
+		if (!MyApplyWorker->use_try_block)
+		{
+			/*
+			 * Report the conflict only on the first-occurrence (non-try-block)
+			 * path to avoid counting the same event twice when a try block
+			 * catches the error and retries with use_try_block=true.
+			 */
+			spock_stat_report_subscription_conflict(MyApplyWorker->subid,
+													SPOCK_CT_UPDATE_MISSING);
+		}
+#endif
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply_heap.c` around lines 1073 - 1082, The comment "successful
pathway" is misleading because elog(ERROR) is unconditional; update the comment
above the spock_stat_report_subscription_conflict call to clarify that we only
increment the conflict counter when the error will not be caught by a
surrounding try block (i.e., when MyApplyWorker->use_try_block is false), so
this counts only the first-occurrence / uncaught error path before the
unconditional elog(ERROR) is invoked; reference MyApplyWorker->use_try_block,
spock_stat_report_subscription_conflict(..., SPOCK_CT_UPDATE_MISSING) and the
subsequent elog(ERROR) in the revised wording.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/spock_conflict_stat.c`:
- Around line 128-147: The new call to
pgstat_get_entry_ref(SPOCK_PGSTAT_KIND_LRCONFLICTS, MyDatabaseId, subid, true,
NULL) increments the shared entry refcount but the function never calls
pgstat_release_entry_ref(ref, ...) before returning, leaking the ref and
preventing reclamation; fix by calling pgstat_release_entry_ref(ref,
/*appropriate flags*/ ) after the diagnostic check and before calling
pgstat_reset_entry (i.e., release the ref obtained in ref prior to returning
from this function), ensuring the refcount is decremented so
pgstat_drop_transactional can eventually free the shared entry.
- Around line 27-33: Replace the hard-coded unregistered kind ID by using the
PostgreSQL experimental constant: change the macro definition
SPOCK_PGSTAT_KIND_LRCONFLICTS from the literal 28 to PGSTAT_KIND_EXPERIMENTAL
(i.e., use PGSTAT_KIND_EXPERIMENTAL in spock_conflict_stat.c where
SPOCK_PGSTAT_KIND_LRCONFLICTS is defined) so the extension uses the reserved
experimental slot until you reserve a permanent ID on the CustomCumulativeStats
wiki.

---

Nitpick comments:
In @.github/workflows/installcheck.yml:
- Around line 79-82: Replace the non-portable "docker compose cp
pgedge-$node:/home/pgedge/logfile.log /dev/stdout" pattern with a portable
exec+cat approach: run "docker compose exec -T pgedge-$node cat
/home/pgedge/logfile.log | tail -80" (or use "docker cp
pgedge-$node:/home/pgedge/logfile.log - | tail -80" if you prefer docker cp) so
the logfile is streamed to stdout without relying on /dev/stdout; update the
loop where "docker compose cp pgedge-$node:/home/pgedge/logfile.log /dev/stdout
2>/dev/null | tail -80 || echo \"(not available)\"" appears to use the exec/cat
(or docker cp -) form and preserve the error fallback.
- Around line 72-74: The docker compose status command in the workflow step
("docker compose ps -a") can return non‑zero and abort the job due to bash -eo
pipefail; change the command to ignore failures (append a tolerant operator such
as "|| true") so the step never fails and subsequent diagnostics (per-node log
collection) still run; update the line containing "docker compose ps -a" in the
installcheck workflow to append the failure-tolerant suffix.

In `@src/spock_apply_heap.c`:
- Around line 1073-1082: The comment "successful pathway" is misleading because
elog(ERROR) is unconditional; update the comment above the
spock_stat_report_subscription_conflict call to clarify that we only increment
the conflict counter when the error will not be caught by a surrounding try
block (i.e., when MyApplyWorker->use_try_block is false), so this counts only
the first-occurrence / uncaught error path before the unconditional elog(ERROR)
is invoked; reference MyApplyWorker->use_try_block,
spock_stat_report_subscription_conflict(..., SPOCK_CT_UPDATE_MISSING) and the
subsequent elog(ERROR) in the revised wording.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/spock_conflict_stat.c (1)

29-32: Use PGSTAT_KIND_EXPERIMENTAL as a development placeholder; register a production ID on the PostgreSQL wiki before release.

The current value (28) is an unregistered custom stats kind ID that hasn't been reserved on the PostgreSQL wiki's CustomCumulativeStats registry. The previous value (27) was changed due to a confirmed conflict with pg_stat_plans. Per PostgreSQL best practices, development code should use PGSTAT_KIND_EXPERIMENTAL (24 in PostgreSQL 18+) as a temporary placeholder. Only when approaching a production release should you select an unused ID from the available range, document it in the code, and register it on the wiki to prevent collisions with other extensions in the ecosystem.

🔧 Suggested interim change
-#define SPOCK_PGSTAT_KIND_LRCONFLICTS	28
+/* Development placeholder — switch to a registered ID before production release */
+#define SPOCK_PGSTAT_KIND_LRCONFLICTS	PGSTAT_KIND_EXPERIMENTAL
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_conflict_stat.c` around lines 29 - 32, Replace the hard-coded
custom stats ID 28 used in the SPOCK_PGSTAT_KIND_LRCONFLICTS macro with the
development placeholder constant PGSTAT_KIND_EXPERIMENTAL; locate the definition
of SPOCK_PGSTAT_KIND_LRCONFLICTS in src/spock_conflict_stat.c and change its
value to PGSTAT_KIND_EXPERIMENTAL, and add a brief comment reminding to pick and
register a permanent unused ID on the PostgreSQL CustomCumulativeStats wiki
before production release.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_conflict_stat.c`:
- Around line 291-295: spock_get_subscription_stats currently returns
PG_RETURN_VOID() for PG < 18, but the SQL signature expects a composite/record
so returning void is incorrect and can crash callers; replace the
PG_RETURN_VOID() with an error that clearly states the function is unsupported
on PostgreSQL versions < 18 by calling PostgreSQL's error reporting API (e.g.,
ereport/errmsg) inside spock_get_subscription_stats to raise an ERROR with a
message like "spock_get_subscription_stats is not supported on PostgreSQL
versions before 18"; ensure you reference the function name
spock_get_subscription_stats in the message so the failure is clear.
- Around line 99-107: The guard that reports unexpected conflict types uses
ereport(FATAL, ...) which will crash the apply worker and cause an infinite
restart loop; change that to ereport(ERROR, ...) so the error aborts only the
current transaction and keeps the worker alive. Locate the check comparing type
against SPOCK_CT_UPDATE_MISSING in spock_conflict_stat.c (the branch that
currently calls ereport(FATAL, (errcode(...), errmsg("unexpected conflict type
%d reported for subscription %u", type, subid)))); replace the FATAL error level
with ERROR and keep the same errcode/errmsg payload so behavior and diagnostics
remain unchanged.

---

Duplicate comments:
In `@src/spock_conflict_stat.c`:
- Around line 128-146: The code calls
pgstat_get_entry_ref(SPOCK_PGSTAT_KIND_LRCONFLICTS, MyDatabaseId, subid, true,
NULL) which increments the shared entry refcount but never releases it, leaking
a reference; after doing the diagnostic pg_atomic_read_u32 check and before
calling pgstat_reset_entry (or before returning from the function), call
pgstat_release_entry_ref(ref, false) to drop the temporary ref acquired by
pgstat_get_entry_ref so the refcount returns to the expected value and the
pgstat infrastructure can reclaim the entry later; ensure this release happens
on all control paths that use the ref variable (i.e., immediately after the
warning/diagnostic and before any early exits).

---

Nitpick comments:
In `@src/spock_conflict_stat.c`:
- Around line 29-32: Replace the hard-coded custom stats ID 28 used in the
SPOCK_PGSTAT_KIND_LRCONFLICTS macro with the development placeholder constant
PGSTAT_KIND_EXPERIMENTAL; locate the definition of SPOCK_PGSTAT_KIND_LRCONFLICTS
in src/spock_conflict_stat.c and change its value to PGSTAT_KIND_EXPERIMENTAL,
and add a brief comment reminding to pick and register a permanent unused ID on
the PostgreSQL CustomCumulativeStats wiki before production release.

@danolivo danolivo force-pushed the spock-382 branch 4 times, most recently from c13381a to aed824d Compare February 20, 2026 12:26
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_conflict_stat.c`:
- Around line 30-35: The code currently uses a hard-coded custom stats ID
(SPOCK_PGSTAT_KIND_LRCONFLICTS defined as 28) without wiki registration; change
the temporary value to PGSTAT_KIND_EXPERIMENTAL (24) in the definition of
SPOCK_PGSTAT_KIND_LRCONFLICTS, and update the TODO comment near that symbol to
state "use PGSTAT_KIND_EXPERIMENTAL (24) until ID 28 is registered on the
PostgreSQL CustomCumulativeStats wiki, then switch SPOCK_PGSTAT_KIND_LRCONFLICTS
to 28 before production and remove the TODO."

---

Duplicate comments:
In `@src/spock_conflict_stat.c`:
- Around line 128-147: spock_stat_create_subscription currently calls
pgstat_get_entry_ref(...) and never calls pgstat_release_entry_ref, leaking the
shared ref; after you finish using the ref (you already check
ref->shared_entry->refcount and call pgstat_reset_entry), add a call to
pgstat_release_entry_ref(ref) to drop the extra reference so the entry can be
reclaimed; locate the ref variable returned by pgstat_get_entry_ref and invoke
pgstat_release_entry_ref(ref) (or equivalent release API) before the function
returns.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/spock_conflict_stat.c (2)

118-147: pgstat_get_entry_ref return value is never released — refcount leak persists.

The ref obtained on line 128 increments the shared entry's refcount, but the function returns without calling pgstat_release_entry_ref(ref). This was flagged in a prior review. While ref is now captured (previously it was discarded), the release call is still missing — each spock_stat_create_subscription call permanently inflates the refcount by 1.

Proposed fix
 	if (pg_atomic_read_u32(&ref->shared_entry->refcount) != 2)
 		/*
 		 * Should never happen: a new subscription stats entry should have
 		 * exactly two references (the hashtable entry and our own).  A higher
 		 * count means a stale entry from a previous subscription with the same
 		 * OID was not properly cleaned up.
 		 */
 		ereport(WARNING,
 				(errmsg("conflict statistics entry for subscription %u "
 						"already has %u references",
 						subid,
 						pg_atomic_read_u32(&ref->shared_entry->refcount)),
 				 errhint("This may indicate that a previous subscription with "
 						 "the same OID was not fully dropped.")));

 	pgstat_reset_entry(SPOCK_PGSTAT_KIND_LRCONFLICTS, MyDatabaseId, subid, 0);
+	pgstat_release_entry_ref(SPOCK_PGSTAT_KIND_LRCONFLICTS, ref, false);
 }
#!/bin/bash
# Verify pgstat_release_entry_ref signature and whether it takes kind as first arg or just ref
# Also check if there's any auto-release mechanism we might be missing
rg -n --type=c -B2 -A8 'pgstat_release_entry_ref' | head -60
PostgreSQL 18 pgstat_release_entry_ref function signature parameters
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_conflict_stat.c` around lines 118 - 147,
spock_stat_create_subscription is leaking a reference obtained from
pgstat_get_entry_ref (ref) because it never calls pgstat_release_entry_ref;
after performing the refcount check and before exiting the function (after
pgstat_reset_entry call), call pgstat_release_entry_ref(ref) to release the
reference and avoid inflating shared_entry->refcount—ensure you use the proper
pgstat_release_entry_ref signature that accepts the PgStat_EntryRef* returned by
pgstat_get_entry_ref.

30-35: Wiki registration TODO is still open — track it before production release.

ID 28 is currently unassigned and safe for development, but the TODO to register it on the PostgreSQL CustomCumulativeStats wiki should be completed before shipping. This was noted in a prior review and remains outstanding.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_conflict_stat.c` around lines 30 - 35, The TODO about registering
the custom stats kind must be resolved before production: update the comment
around SPOCK_PGSTAT_KIND_LRCONFLICTS to indicate that ID 28 is temporary for
development and add a tracking task/reference (e.g., a Bug/Issue ID or a TODO
with owner and deadline) to complete registration on the PostgreSQL
CustomCumulativeStats wiki; ensure the comment references the symbol
SPOCK_PGSTAT_KIND_LRCONFLICTS and the wiki URL so the release checklist can
verify the registration is completed prior to shipping.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/spock_conflict_stat.c`:
- Around line 118-147: spock_stat_create_subscription is leaking a reference
obtained from pgstat_get_entry_ref (ref) because it never calls
pgstat_release_entry_ref; after performing the refcount check and before exiting
the function (after pgstat_reset_entry call), call pgstat_release_entry_ref(ref)
to release the reference and avoid inflating shared_entry->refcount—ensure you
use the proper pgstat_release_entry_ref signature that accepts the
PgStat_EntryRef* returned by pgstat_get_entry_ref.
- Around line 30-35: The TODO about registering the custom stats kind must be
resolved before production: update the comment around
SPOCK_PGSTAT_KIND_LRCONFLICTS to indicate that ID 28 is temporary for
development and add a tracking task/reference (e.g., a Bug/Issue ID or a TODO
with owner and deadline) to complete registration on the PostgreSQL
CustomCumulativeStats wiki; ensure the comment references the symbol
SPOCK_PGSTAT_KIND_LRCONFLICTS and the wiki URL so the release checklist can
verify the registration is completed prior to shipping.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aed824d and 9a37393.

⛔ Files ignored due to path filters (2)
  • tests/regress/expected/conflict_stat.out is excluded by !**/*.out
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (13)
  • .github/workflows/installcheck.yml
  • Makefile
  • include/spock_conflict.h
  • include/spock_conflict_stat.h
  • sql/spock--5.0.4--6.0.0-devel.sql
  • sql/spock--6.0.0-devel.sql
  • src/spock.c
  • src/spock_apply_heap.c
  • src/spock_conflict.c
  • src/spock_conflict_stat.c
  • src/spock_functions.c
  • tests/regress/sql/conflict_stat.sql
  • tests/regress/sql/replication_set.sql
🚧 Files skipped from review as they are similar to previous changes (7)
  • tests/regress/sql/conflict_stat.sql
  • .github/workflows/installcheck.yml
  • src/spock_functions.c
  • sql/spock--6.0.0-devel.sql
  • src/spock.c
  • include/spock_conflict.h
  • tests/regress/sql/replication_set.sql

Register a custom pgstat kind (SPOCK_PGSTAT_KIND_LRCONFLICTS) via the
PG18 pgstat_register_kind() infrastructure to track replication conflict
counts per subscription.  Unlike vanilla PostgreSQL subscription stats,
spock statistics use MyDatabaseId as the key because spock node, origin,
and subscription IDs are unique only within a database.

The implementation introduces spock-specific stat types
(Spock_Stat_StatSubEntry, Spock_Stat_PendingSubEntry) sized to
SPOCK_CONFLICT_NUM_TYPES (6 types, excluding SPOCK_CT_DELETE_LATE which
is not yet tracked).  Column names in the SQL-callable function are kept
in sync with the SpockConflictType enum via designated initializers in
SpockConflictStatColNames[].

Currently only SPOCK_CT_UPDATE_MISSING conflicts are actively reported
(from spock_apply_heap.c).  Reporting from spock_report_conflict() is
left as a TODO until SPOCK_CT_DELETE_LATE is either included in the
stats array or filtered out to prevent overflow.

SQL functions: spock.get_subscription_stats(oid),
               spock.reset_subscription_stats(oid).

Includes a regression test exercising UPDATE_MISSING counting, counter
increment, and stats reset.
Truncate the log on the first subscriber visit so subsequent checks are
not polluted.
Also drop the non-deterministic command_counter column from exception_log
queries and order by (table_schema, table_name, remote_commit_ts) with
COLLATE "C" instead, making the output stable across runs.
Guard spock_conflict_stat.h include and improve conflict type validation
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/spock_conflict_stat.c (1)

30-35: TODO: Register kind ID 28 on PostgreSQL wiki before production release.

The TODO comment correctly notes the need to register the custom statistics kind ID. Verification confirms ID 28 is currently unassigned, but formal registration on the PostgreSQL CustomCumulativeStats wiki should be completed before shipping to production to prevent future conflicts with other extensions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_conflict_stat.c` around lines 30 - 35, The TODO notes that the
custom stats kind ID 28 (symbol SPOCK_PGSTAT_KIND_LRCONFLICTS) must be formally
registered on the PostgreSQL CustomCumulativeStats wiki before production;
update the comment to record that registration is required and then register ID
28 (or choose and document a different unused ID) on
https://wiki.postgresql.org/wiki/CustomCumulativeStats, and after registration
remove or update the TODO and add the chosen/assigned ID and registration
reference next to the `#define` for SPOCK_PGSTAT_KIND_LRCONFLICTS so the code
documents the approved ID.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/regress/sql/replication_set.sql`:
- Line 219: The ORDER BY clause on spock.exception_log uses remote_commit_ts
which is not unique and can yield nondeterministic ordering; update the ORDER BY
to add the tie-breaker columns (e.g., command_counter and, if present,
remote_seqno) after remote_commit_ts so rows with equal remote_commit_ts sort
deterministically — adjust both occurrences of the ORDER BY that currently
reference table_schema, table_name, remote_commit_ts to instead order by
table_schema COLLATE "C", table_name COLLATE "C", remote_commit_ts,
command_counter (and remote_seqno if applicable).

---

Nitpick comments:
In `@src/spock_conflict_stat.c`:
- Around line 30-35: The TODO notes that the custom stats kind ID 28 (symbol
SPOCK_PGSTAT_KIND_LRCONFLICTS) must be formally registered on the PostgreSQL
CustomCumulativeStats wiki before production; update the comment to record that
registration is required and then register ID 28 (or choose and document a
different unused ID) on https://wiki.postgresql.org/wiki/CustomCumulativeStats,
and after registration remove or update the TODO and add the chosen/assigned ID
and registration reference next to the `#define` for SPOCK_PGSTAT_KIND_LRCONFLICTS
so the code documents the approved ID.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ff079981-6906-4a94-911b-c6ebf3e7855a

📥 Commits

Reviewing files that changed from the base of the PR and between 9a37393 and 267119a.

⛔ Files ignored due to path filters (2)
  • tests/regress/expected/conflict_stat.out is excluded by !**/*.out
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (13)
  • .github/workflows/installcheck.yml
  • Makefile
  • include/spock_conflict.h
  • include/spock_conflict_stat.h
  • sql/spock--5.0.6--6.0.0-devel.sql
  • sql/spock--6.0.0-devel.sql
  • src/spock.c
  • src/spock_apply_heap.c
  • src/spock_conflict.c
  • src/spock_conflict_stat.c
  • src/spock_functions.c
  • tests/regress/sql/conflict_stat.sql
  • tests/regress/sql/replication_set.sql
✅ Files skipped from review due to trivial changes (3)
  • src/spock_conflict.c
  • src/spock_functions.c
  • src/spock_apply_heap.c
🚧 Files skipped from review as they are similar to previous changes (4)
  • Makefile
  • include/spock_conflict.h
  • .github/workflows/installcheck.yml
  • tests/regress/sql/conflict_stat.sql

Previously spock_stat_report_subscription_conflict() only accepted
SPOCK_CT_UPDATE_MISSING and raised ERROR on any other type.  The
Spock-specific SPOCK_CT_DELETE_EXISTS (enum value 101) could not be
used as a direct array index, which blocked enabling stats reporting
from spock_report_conflict().

Introduce spock_conflict_stat_index() to map the non-contiguous enum
values to contiguous array indices, bump SPOCK_CONFLICT_NUM_TYPES
from 6 to 7, and enable stat reporting for all conflict types
including INSERT_EXISTS, UPDATE_ORIGIN_DIFFERS, UPDATE_EXISTS,
DELETE_ORIGIN_DIFFERS, DELETE_MISSING, and DELETE_EXISTS.

Add confl_delete_exists column to get_subscription_stats() output.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/regress/sql/conflict_stat.sql (2)

80-86: Consider verifying stats_reset timestamp after reset.

The test verifies counters return to zero after reset_subscription_stats, but doesn't check that stats_reset is non-NULL and updated. This would provide additional coverage for the reset timestamp callback.

💡 Optional enhancement
 SELECT spock.reset_subscription_stats(:test_sub_id);
 
-SELECT confl_update_missing,
+SELECT confl_update_missing, stats_reset IS NOT NULL AS reset_recorded,
   confl_insert_exists,confl_update_origin_differs,confl_update_exists,
   confl_delete_origin_differs,confl_delete_missing,confl_delete_exists
 FROM spock.get_subscription_stats(:test_sub_id);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/regress/sql/conflict_stat.sql` around lines 80 - 86, Add an assertion
after calling spock.reset_subscription_stats(:test_sub_id) to fetch
spock.get_subscription_stats(:test_sub_id) and verify the stats_reset column is
non-NULL and updated (different from its prior value); use the existing
test_sub_id and the confl_* columns fetch to locate where to add the check and
ensure you compare the new stats_reset timestamp against the previous value or
assert it's not NULL to confirm the reset timestamp callback ran.

88-91: Consider adding replication sync after cleanup DDL.

The cleanup DROP TABLE is replicated via spock.replicate_ddl but there's no spock.wait_slot_confirm_lsn(NULL, NULL) afterward. If subsequent tests run immediately on the subscriber, they might encounter timing issues.

💡 Optional enhancement
 -- Cleanup
 TRUNCATE spock.exception_log;
 \c :provider_dsn
 SELECT spock.replicate_ddl($$ DROP TABLE public.conflict_stat_test CASCADE; $$);
+SELECT spock.wait_slot_confirm_lsn(NULL, NULL);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/regress/sql/conflict_stat.sql` around lines 88 - 91, After calling
spock.replicate_ddl($$ DROP TABLE public.conflict_stat_test CASCADE; $$) ensure
replication completion by invoking spock.wait_slot_confirm_lsn(NULL, NULL) on
the publisher side before proceeding; add a call to
spock.wait_slot_confirm_lsn(NULL, NULL) immediately after spock.replicate_ddl to
block until the DDL has been confirmed on subscribers so subsequent subscriber
tests won't race with the cleanup DDL.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@include/spock_conflict.h`:
- Around line 101-103: The header contains duplicate extern declarations for
spock_conflict_resolver, spock_conflict_log_level, and spock_save_resolutions;
remove the redundant second set so each of these externs is declared only once
(keep the original declarations and delete the copy-pasted block) to avoid
duplicate symbol declarations in include/spock_conflict.h.

---

Nitpick comments:
In `@tests/regress/sql/conflict_stat.sql`:
- Around line 80-86: Add an assertion after calling
spock.reset_subscription_stats(:test_sub_id) to fetch
spock.get_subscription_stats(:test_sub_id) and verify the stats_reset column is
non-NULL and updated (different from its prior value); use the existing
test_sub_id and the confl_* columns fetch to locate where to add the check and
ensure you compare the new stats_reset timestamp against the previous value or
assert it's not NULL to confirm the reset timestamp callback ran.
- Around line 88-91: After calling spock.replicate_ddl($$ DROP TABLE
public.conflict_stat_test CASCADE; $$) ensure replication completion by invoking
spock.wait_slot_confirm_lsn(NULL, NULL) on the publisher side before proceeding;
add a call to spock.wait_slot_confirm_lsn(NULL, NULL) immediately after
spock.replicate_ddl to block until the DDL has been confirmed on subscribers so
subsequent subscriber tests won't race with the cleanup DDL.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: db8fff27-74b0-4323-a989-82f374d8f445

📥 Commits

Reviewing files that changed from the base of the PR and between 267119a and 4035682.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/conflict_stat.out is excluded by !**/*.out
📒 Files selected for processing (6)
  • include/spock_conflict.h
  • sql/spock--5.0.6--6.0.0-devel.sql
  • sql/spock--6.0.0-devel.sql
  • src/spock_conflict.c
  • src/spock_conflict_stat.c
  • tests/regress/sql/conflict_stat.sql
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/spock_conflict.c
  • sql/spock--5.0.6--6.0.0-devel.sql

…ISTS

Extend conflict_stat.sql with INSERT_EXISTS (insert duplicate key from
provider when subscriber already has the row) and DELETE_MISSING (delete
on provider when row already gone from subscriber) tests, plus a final
verification that all accumulated counters match expectations.

Add stat counter checks to tuple_origin.sql which already provokes
UPDATE_MISSING, DELETE_MISSING, DELETE_EXISTS, and INSERT_EXISTS
conflicts.  The checks are gated behind a \if version >= 180000 guard
so that PG < 18 runs skip the stat queries gracefully.  Provide
tuple_origin_1.out as the alternative expected output for PG < 18.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/regress/sql/tuple_origin.sql (1)

118-122: Assert the full cumulative stats row for DELETE_EXISTS and INSERT_EXISTS.

These checks only prove the expected bucket incremented. If the reporting path also bumps another counter for the same event, this file still passes. Since the stats are cumulative from the initial reset, selecting the full row here would validate the exact post-scenario totals.

Also applies to: 231-235

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/regress/sql/tuple_origin.sql` around lines 118 - 122, Replace the
single-column checks that only confirm a bucket increment with an assertion that
selects the full cumulative stats row from
spock.get_subscription_stats(:origin_test_sub_id) and compares it to the exact
expected totals; specifically use the returned columns (including
confl_delete_exists and confl_insert_exists) to assert the precise post-scenario
cumulative values so no additional counters can be silently bumped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/regress/sql/conflict_stat.sql`:
- Around line 80-81: The test never verifies that confl_update_missing was reset
to zero; update the test around the reset_subscription_stats(:test_sub_id) call
to first assert confl_update_missing had reached the expected non-zero value
(e.g., = 2) and then, after calling reset_subscription_stats(:test_sub_id),
query and assert confl_update_missing = 0 for the same test_sub_id; reference
the confl_update_missing column and the reset_subscription_stats(:test_sub_id)
call so the test both proves the counter became non-zero and that the reset
cleared it.

---

Nitpick comments:
In `@tests/regress/sql/tuple_origin.sql`:
- Around line 118-122: Replace the single-column checks that only confirm a
bucket increment with an assertion that selects the full cumulative stats row
from spock.get_subscription_stats(:origin_test_sub_id) and compares it to the
exact expected totals; specifically use the returned columns (including
confl_delete_exists and confl_insert_exists) to assert the precise post-scenario
cumulative values so no additional counters can be silently bumped.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fc677ee1-ff7a-4b43-a9a4-4f1db6c02e33

📥 Commits

Reviewing files that changed from the base of the PR and between 4035682 and 7a7c5f0.

⛔ Files ignored due to path filters (3)
  • tests/regress/expected/conflict_stat.out is excluded by !**/*.out
  • tests/regress/expected/tuple_origin.out is excluded by !**/*.out
  • tests/regress/expected/tuple_origin_1.out is excluded by !**/*.out
📒 Files selected for processing (2)
  • tests/regress/sql/conflict_stat.sql
  • tests/regress/sql/tuple_origin.sql

Copy link
Member

@mason-sharp mason-sharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one very minor comment for the Makefile, feel free to ignore.

Other than that, Coderabbit has a remaining comment that looks valid.

Simplify PG18-only test gating using filter-out in Makefile.
Replace the REGRESS_PG18 variable and ifeq block with a filter-out on
PG < 18, matching the existing pattern used for add_table.

Also remove duplicate extern declarations in spock_conflict.h (already
declared earlier in the same file).
@mason-sharp mason-sharp dismissed ibrarahmad’s stale review March 24, 2026 16:01

They seem resolved

@mason-sharp mason-sharp merged commit 286ad19 into main Mar 24, 2026
10 checks passed
@mason-sharp mason-sharp deleted the spock-382 branch March 24, 2026 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature New feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants