@heysky heysky commented Dec 29, 2025

This commit adds a new incremental backup mode, SUMMARIZE, that uses the native WAL summarization feature of PostgreSQL 17+ (the summarize_wal GUC) to track modified data blocks without requiring external extensions such as ptrack.

Feature Overview

The SUMMARIZE backup mode:

  • Requires PostgreSQL 17+ with summarize_wal=on
  • Uses pg_wal_summary_contents() and pg_available_wal_summaries() functions
  • Builds pagemap bitmaps from WAL summary information for incremental backups
  • Validates WAL summary availability before backup starts

Files Modified

  • src/pg_probackup.h: Added BACKUP_MODE_DIFF_SUMMARIZE enum and function declarations
  • src/catalog.c: Updated backupModes[] array and mode parsing functions
  • src/validate.c: Added SUMMARIZE mode to validation checks
  • src/backup.c: Added WAL summary LSN validation and wait logic
  • src/data.c: Added SUMMARIZE mode to incremental mode checks
  • src/catchup.c: Added WAL summarize support for catchup command
  • src/pg_probackup.c: Added SUMMARIZE to supported backup modes
  • src/help.c: Updated help text to include SUMMARIZE mode
  • Makefile: Added src/walsummary.o to object files list

New File

  • src/walsummary.c: Core implementation of WAL summary integration
    • pg_is_walsummary_enabled(): Check if PG 17+ has summarize_wal enabled
    • get_walsummary_summarized_lsn(): Get current summarized LSN
    • wait_wal_summarization(): Wait for summarizer to catch up to target LSN (60s timeout)
    • make_pagemap_from_walsummary(): Build pagemap from WAL summary data
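
A minimal sketch of the LSN handling these helpers rely on, assuming the summarized LSN reaches the client in PostgreSQL's textual pg_lsn form ("hi/lo" in hexadecimal). The names Lsn and parse_lsn are hypothetical and used only for illustration; they are not taken from src/walsummary.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a 64-bit WAL position. */
typedef uint64_t Lsn;

/*
 * Parse the textual pg_lsn form "XXXXXXXX/XXXXXXXX" (two hex halves)
 * into a single 64-bit value. Returns false on malformed input.
 */
static bool
parse_lsn(const char *text, Lsn *out)
{
    unsigned int hi;
    unsigned int lo;
    char         trailing;

    assert(text != NULL && out != NULL);

    /* Exactly two conversions must succeed; a third means trailing junk. */
    if (sscanf(text, "%X/%X%c", &hi, &lo, &trailing) != 2)
        return false;

    *out = ((Lsn) hi << 32) | lo;
    return true;
}
```

Keeping the LSN as a plain 64-bit integer makes the "has the summarizer reached the target" check a simple comparison.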

Key Implementation Details

Function Integration

The make_pagemap_from_walsummary() function:

  1. Queries pg_available_wal_summaries() to find overlapping summary files
  2. For each summary, calls pg_wal_summary_contents() with the intersection range
  3. Builds pagemap bitmaps for changed blocks
  4. Matches WAL summary data to pgFile list using (dbOid, tblspcOid, relOid, forkName)
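
The range and bitmap steps above can be sketched as follows. This is an illustration under assumptions, not the code in this PR: PageMap, lsn_intersect, pagemap_set and pagemap_test are hypothetical names, LSN ranges are treated as half-open [start, end), and the bitmap is a plain growable byte array rather than the project's actual pagemap type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t Lsn;               /* hypothetical 64-bit WAL position */

typedef struct
{
    uint8_t *bits;                  /* one bit per block, LSB first */
    size_t   nbytes;                /* allocated size of bits */
} PageMap;                          /* hypothetical pagemap type */

/*
 * Intersect a summary range with the backup window. Both ranges are
 * half-open. Returns false when the ranges do not overlap.
 */
static bool
lsn_intersect(Lsn sum_start, Lsn sum_end, Lsn win_start, Lsn win_end,
              Lsn *out_start, Lsn *out_end)
{
    Lsn lo = sum_start > win_start ? sum_start : win_start;
    Lsn hi = sum_end < win_end ? sum_end : win_end;

    assert(out_start != NULL && out_end != NULL);
    if (lo >= hi)
        return false;
    *out_start = lo;
    *out_end = hi;
    return true;
}

/* Mark a block as changed, growing the bitmap on demand. False on OOM. */
static bool
pagemap_set(PageMap *map, uint32_t blkno)
{
    size_t byte = blkno / 8;

    assert(map != NULL);
    if (byte >= map->nbytes)
    {
        size_t   newsize = byte + 1;
        uint8_t *p = realloc(map->bits, newsize);

        if (p == NULL)
            return false;
        /* Zero only the newly added tail so earlier bits survive. */
        memset(p + map->nbytes, 0, newsize - map->nbytes);
        map->bits = p;
        map->nbytes = newsize;
    }
    map->bits[byte] |= (uint8_t) (1u << (blkno % 8));
    return true;
}

/* Report whether a block was marked; blocks beyond the map are unchanged. */
static bool
pagemap_test(const PageMap *map, uint32_t blkno)
{
    size_t byte = blkno / 8;

    assert(map != NULL);
    return byte < map->nbytes && ((map->bits[byte] >> (blkno % 8)) & 1u) != 0;
}
```

In this sketch one such bitmap would be kept per file, keyed by the (dbOid, tblspcOid, relOid, forkName) tuple from step 4, so that only marked blocks are copied.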

Usage

Create an incremental backup with SUMMARIZE mode:
pg_probackup backup -B /backup/dir -b SUMMARIZE --instance=name -D /data/dir

Prerequisites:

  • PostgreSQL 17 or higher
  • summarize_wal=on in postgresql.conf
  • Previous FULL backup exists
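
The first two prerequisites could be checked along these lines. This is a sketch under assumptions: walsummary_prereqs_met is a hypothetical helper, the version number is assumed to come from something like libpq's PQserverVersion() (which reports 170000 for 17.0), and the setting is assumed to be the text returned by SHOW summarize_wal. It does not reproduce pg_is_walsummary_enabled().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* PostgreSQL 17.0 expressed in server_version_num form. */
#define MIN_WALSUMMARY_VERSION_NUM 170000

/*
 * Hypothetical helper: decide whether SUMMARIZE mode can be used, given
 * the numeric server version and the textual value of summarize_wal.
 */
static bool
walsummary_prereqs_met(int server_version_num, const char *summarize_wal)
{
    assert(summarize_wal != NULL);

    /* WAL summarization does not exist before PostgreSQL 17. */
    if (server_version_num < MIN_WALSUMMARY_VERSION_NUM)
        return false;

    /* Boolean GUCs are reported as "on" or "off". */
    return strcmp(summarize_wal, "on") == 0;
}
```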

Error Handling

If WAL summarizer is disabled:
ERROR: WAL summarize backup mode requires summarize_wal to be enabled

If summarizer doesn't catch up within 60 seconds:
ERROR: WAL summarizer did not catch up to <LSN> within timeout period. Incremental backup cannot proceed without complete WAL summaries.
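
The timeout behaviour can be sketched as a simple polling loop. Everything here is an assumption for illustration: wait_for_summarizer, SummarizedLsnFn, SleepFn and FakeSummarizer are hypothetical names, the LSN source is injected as a callback so the loop can run without a server, and a one-second poll interval is assumed. It is not the body of wait_wal_summarization().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

typedef uint64_t Lsn;                           /* hypothetical WAL position */
typedef Lsn  (*SummarizedLsnFn)(void *ctx);     /* returns summarized LSN */
typedef void (*SleepFn)(unsigned int seconds);  /* pauses between polls */

/*
 * Poll once per second until the summarizer reaches target or
 * timeout_sec seconds have been waited. Returns true on success and
 * false on timeout; the caller would then raise the error above.
 */
static bool
wait_for_summarizer(Lsn target, unsigned int timeout_sec,
                    SummarizedLsnFn get_lsn, SleepFn do_sleep, void *ctx)
{
    unsigned int waited;

    assert(get_lsn != NULL);

    for (waited = 0;; waited++)
    {
        if (get_lsn(ctx) >= target)
            return true;
        if (waited >= timeout_sec)
            return false;
        if (do_sleep != NULL)
            do_sleep(1);
    }
}

/* Test double: reports an LSN that advances by step on every poll. */
typedef struct
{
    Lsn current;
    Lsn step;
} FakeSummarizer;

static Lsn
fake_summarized_lsn(void *ctx)
{
    FakeSummarizer *fake = ctx;
    Lsn             value = fake->current;

    fake->current += fake->step;
    return value;
}
```

Passing the LSN source and the sleep function as parameters keeps the loop testable; a real caller would supply a function that queries the server and a real sleep, with timeout_sec set to 60.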
