feat: pigz #254

Open
rrsettgast wants to merge 6 commits into main from feature/pigz

@rrsettgast rrsettgast commented May 27, 2026

  • Enables the use of parallel gzip (pigz) for creating archives.
  • Fixes the label checks so that re-runs check the current label status instead of the status at the time of the last modification.

wrtobin and others added 3 commits September 29, 2025 12:38
Prefer a tar-to-pigz pipeline when packing integrated test baselines so
gzip compression can use the available CPU cores. Keep the existing
Python gztar path as a fallback when tar or pigz is unavailable, and
preserve the existing .tar.gz archive format.
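
The fallback described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in baseline_io.py: the function name `pack_baseline` and its parameters are hypothetical, and it assumes the tools are located with `shutil.which`. The fast path is tried only when both tools are found; otherwise (or if either process fails) the Python gztar archiver produces the same `.tar.gz` file.

```python
import os
import shutil
import subprocess
import tarfile
import tempfile


def pack_baseline( baseline_path, archive_base, tar_name='tar', pigz_name='pigz' ):
    """Create <archive_base>.tar.gz and return which path produced it.

    Hypothetical helper for illustration; not the function in baseline_io.py.
    """
    archive_path = archive_base + '.tar.gz'
    tar_bin = shutil.which( tar_name )
    pigz_bin = shutil.which( pigz_name )

    if tar_bin and pigz_bin:
        try:
            with open( archive_path, 'wb' ) as output:
                # tar writes an uncompressed stream to stdout; pigz compresses it.
                tar = subprocess.Popen( [ tar_bin, '-C', baseline_path, '-cf', '-', '.' ],
                                        stdout=subprocess.PIPE )
                pigz = subprocess.Popen( [ pigz_bin ], stdin=tar.stdout, stdout=output )
                tar.stdout.close()  # only pigz should hold the read end of the pipe
                pigz_status = pigz.wait()
                tar_status = tar.wait()
            if tar_status == 0 and pigz_status == 0:
                return 'pigz'
        except OSError:
            pass
        # Remove any partial output before falling back.
        if os.path.exists( archive_path ):
            os.remove( archive_path )

    # Fallback: the standard-library archiver writes the same .tar.gz format.
    shutil.make_archive( archive_base, format='gztar', root_dir=baseline_path )
    return 'gztar'
```

Either way the result should be readable with `tarfile.open`, since pigz output is ordinary gzip data.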
Copilot AI review requested due to automatic review settings May 27, 2026 01:27
@rrsettgast rrsettgast requested a review from castelletto1 May 27, 2026 01:28

Copilot AI left a comment

Pull request overview

This PR introduces an optional fast-path for creating baseline tarballs using external tar + parallel gzip (pigz), while keeping the existing Python shutil.make_archive(..., format="gztar") behavior as a fallback. It also expands the default restart-check exclusion patterns to ignore additional HDF5 paths that are likely unstable across runs.

Changes:

  • Add a tar | pigz pipeline for baseline archive creation, falling back to Python gztar if tools are unavailable or the pipeline fails.
  • Add a helper to determine an appropriate CPU thread count for pigz.
  • Update restart_check.py default exclusion regex list to include dNdX and detJ.
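
For the thread-count helper mentioned in the list above, one plausible shape is sketched here. The name `pigz_thread_count` and the `max_threads` parameter are assumptions for illustration; the PR's actual helper is not shown on this page. The sketch prefers the CPU set the process is allowed to run on (available on Linux) and returns a string, because every element of a `subprocess.Popen` argument list must be a string.

```python
import os


def pigz_thread_count( max_threads=None ):
    """Return a thread count for `pigz -p` as a string (hypothetical helper)."""
    try:
        # Respects CPU affinity limits, e.g. inside containers or batch jobs.
        available = len( os.sched_getaffinity( 0 ) )
    except AttributeError:
        # os.sched_getaffinity is not available on every platform.
        available = os.cpu_count() or 1
    if max_threads is not None:
        available = min( available, max_threads )
    return str( max( 1, available ) )
```

The returned value could then be passed directly, for example as `[ pigz_bin, '-9', '-p', pigz_thread_count() ]`.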

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| geos-ats/src/geos/ats/helpers/restart_check.py | Extends default HDF5 exclusion patterns used during restart comparisons. |
| geos-ats/src/geos/ats/baseline_io.py | Adds an external tar + pigz archiving path to speed up baseline archive creation with a fallback to Python’s gztar. |
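
As a rough illustration of how regex exclusion patterns like those in restart_check.py might filter HDF5 paths: the pattern strings, the `filter_paths` name, and the use of `re.match` below are all assumptions; only the names dNdX and detJ come from this PR.

```python
import re

# Hypothetical patterns; the real defaults in restart_check.py may differ.
EXCLUDE_PATTERNS = [ r'.*/dNdX$', r'.*/detJ$' ]


def filter_paths( paths, patterns=EXCLUDE_PATTERNS ):
    """Return the paths that match none of the exclusion patterns."""
    compiled = [ re.compile( p ) for p in patterns ]
    return [ p for p in paths if not any( c.match( p ) for c in compiled ) ]
```

With these assumed patterns, `filter_paths( [ '/mesh/dNdX', '/mesh/pressure', '/mesh/detJ' ] )` keeps only `/mesh/pressure`.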

Comment on lines +181 to +205
    try:
        with open( archive_path, 'wb' ) as output:
            tar_process = subprocess.Popen( [ tar_bin, '-C', baseline_path, '-cf', '-', '.' ],
                                            stdout=subprocess.PIPE )
            if tar_process.stdout is None:
                raise RuntimeError( 'failed to capture tar output' )
            pigz_process = subprocess.Popen( [ pigz_bin, '-9', '-p', threads ],
                                             stdin=tar_process.stdout,
                                             stdout=output )
            tar_process.stdout.close()

            pigz_status = pigz_process.wait()
            tar_status = tar_process.wait()

        if tar_status != 0 or pigz_status != 0:
            try:
                os.remove( archive_path )
            except FileNotFoundError:
                pass
            raise RuntimeError( f'tar exited with {tar_status}; pigz exited with {pigz_status}' )

    except Exception as e:
        logger.warning( 'Parallel baseline archive creation failed; using Python gztar archiver' )
        logger.warning( repr( e ) )
        return False
@jafranc changed the title from "Feature/pigz" to "feat: pigz" on May 28, 2026
@rrsettgast added the test-geos-integration (Triggers the testing of geosPythonPackages import and integration in GEOS CI) and force-geos-integration (Force triggering of GEOS integration CI even if non req) labels on Jun 6, 2026
@rrsettgast rrsettgast requested review from bd713 and jafranc June 6, 2026 05:13
Labels

force-geos-integration: Force triggering of GEOS integration CI even if non req
test-geos-integration: Triggers the testing of geosPythonPackages import and integration in GEOS CI