fix: restore WAL from replica source during designated primary promotion #966

Merged
mnencia merged 2 commits into cloudnative-pg:main from leonardoce:restore-wal-replica-promotion
Jun 18, 2026

@leonardoce

Contributor

During a replica cluster failover, the designated primary could incorrectly attempt to restore WALs from its own object store instead of the replica source, causing recovery to fail. This happened because the previous logic relied on IsReplica() returning true, but that flag can already be false while PostgreSQL is still in recovery and needs WALs from the source cluster.
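The failure mode described above can be illustrated with a minimal Go sketch. It is not the code from this PR: the `InstanceState` fields, the `ObjectStore` values, and both functions are hypothetical names chosen for illustration, and the "still in recovery" signal is an assumption about how the routing could be made recovery-aware.

```go
package main

// ObjectStore names the store that restore_command reads WALs from.
// Hypothetical type, for illustration only.
type ObjectStore string

const (
	OwnStore           ObjectStore = "own"
	ReplicaSourceStore ObjectStore = "replica-source"
)

// InstanceState is a hypothetical snapshot of what the routing decision sees.
type InstanceState struct {
	IsReplica        bool // what an IsReplica()-style check reports
	InRecovery       bool // PostgreSQL has not finished recovery yet
	HasReplicaSource bool // a replica source object store is configured
}

// flagOnly mirrors the failure mode: it trusts the replica flag alone, so it
// falls back to the instance's own store as soon as the flag is cleared.
func flagOnly(s InstanceState) ObjectStore {
	if s.HasReplicaSource && s.IsReplica {
		return ReplicaSourceStore
	}
	return OwnStore
}

// recoveryAware keeps routing to the replica source while PostgreSQL is
// still in recovery, even if the replica flag is already false.
// Assumption: recovery state is the extra signal; the PR may use another one.
func recoveryAware(s InstanceState) ObjectStore {
	if s.HasReplicaSource && (s.IsReplica || s.InRecovery) {
		return ReplicaSourceStore
	}
	return OwnStore
}
```

For a designated primary in the middle of promotion (`IsReplica: false`, `InRecovery: true`, `HasReplicaSource: true`), `flagOnly` picks the instance's own store while `recoveryAware` still picks the replica source.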

@leonardoce leonardoce requested a review from a team as a code owner June 16, 2026 15:35
@dosubot dosubot Bot added the size:S, bug, go, and size:M labels and removed the size:S label Jun 16, 2026
@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Jun 18, 2026
@mnencia mnencia force-pushed the restore-wal-replica-promotion branch from fd6b393 to ae60cb1 Compare June 18, 2026 15:15
leonardoce and others added 2 commits June 18, 2026 17:19
During a replica cluster failover, the designated primary could incorrectly
attempt to restore WALs from its own object store instead of the replica
source, causing recovery to fail. This happened because the previous logic
relied on IsReplica() returning true, but that flag can already be false
while PostgreSQL is still in recovery and needs WALs from the source cluster.

Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>

Extract the restore_command object-store selection out of Restore into a
pure resolveRestoreObjectStore helper and add table-driven coverage for
the routing decision, including the designated-primary promotion case
(both switchover and failover) that previously had no regression test.

The refactor is behavior-preserving; it also makes the gocognit
suppression on Restore unnecessary, so it is dropped.

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
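The table-driven coverage mentioned in the second commit could look roughly like the sketch below. This is an assumption about its shape, not the test added by the PR: `pickStore`, `routingCase`, and the case names are hypothetical, and the real `resolveRestoreObjectStore` helper may take different inputs. In this simplified model switchover and failover promotions present the same inputs, so they share one row.

```go
package main

import "fmt"

// pickStore is a hypothetical stand-in for a pure routing helper: it depends
// only on its arguments, which is what makes a table of cases easy to write.
func pickStore(isReplica, inRecovery, hasSource bool) string {
	if hasSource && (isReplica || inRecovery) {
		return "replica-source"
	}
	return "own"
}

// routingCase is one row of the table: inputs plus the expected store.
type routingCase struct {
	name       string
	isReplica  bool
	inRecovery bool
	hasSource  bool
	want       string
}

var routingCases = []routingCase{
	{"plain primary, no replica source", false, false, false, "own"},
	{"designated primary, replica flag set", true, true, true, "replica-source"},
	{"promotion (switchover or failover), flag already cleared", false, true, true, "replica-source"},
	{"promotion finished", false, false, true, "own"},
}

// checkRouting runs every row and returns one message per mismatch.
func checkRouting() []string {
	var failures []string
	for _, tc := range routingCases {
		if got := pickStore(tc.isReplica, tc.inRecovery, tc.hasSource); got != tc.want {
			failures = append(failures, fmt.Sprintf("%s: got %q, want %q", tc.name, got, tc.want))
		}
	}
	return failures
}
```

In a real test file the loop body would typically live inside `t.Run(tc.name, ...)` from the standard `testing` package; returning a slice of failures here just keeps the sketch free of test scaffolding.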
@mnencia mnencia force-pushed the restore-wal-replica-promotion branch from ae60cb1 to c62a20f Compare June 18, 2026 15:31
@mnencia mnencia merged commit c34b232 into cloudnative-pg:main Jun 18, 2026
4 checks passed