Conversation

Contributor

@gerrod3 gerrod3 commented Jan 21, 2026

Ok, the last fix was correct for duplicate artifacts across domains, but it didn't solve duplicate metadata artifacts within a domain. At first this seems impossible, but there is a common scenario where it can occur. A user uploads package a.1.whl with metadata xyz. They realize the package is missing some files and rebuild it with the new files, keeping the exact same name and, crucially, the exact same metadata. They reupload, and Pulp creates a new package because the package as a whole has a new sha256, even though the metadata is the same as the old one. Then in our migration we encounter two "different" packages with the same metadata artifact inside them.

My changes try to fix this by keeping track of the metadata artifacts' sha256s and avoiding making duplicates. Since we do the saves in batches, I first have to check within the batch to make sure there are no dups, and then do a second check to make sure there are no dups from previous batches. Also, I'm grouping the packages by domain, so each batch should stay inside a single domain.
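
For illustration, here is a minimal, standalone sketch of the two-step check described above. It is not the migration code from this PR: the `(domain, name, metadata_sha256)` tuple shape, the `plan_metadata_artifacts` and `batched` helpers, and the in-memory `seen` set are all assumptions made for the example. In a real migration the second check would more likely be a lookup against already-saved rows.

```python
from collections import defaultdict
from itertools import islice


def batched(items, size):
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def plan_metadata_artifacts(packages, batch_size=2):
    """Decide which metadata sha256s to create, per domain and per batch.

    `packages` is an iterable of (domain, name, metadata_sha256) tuples.
    This shape is an assumption for the sketch, not the real model layout.
    """
    # Group by domain first so that no batch ever spans two domains.
    by_domain = defaultdict(list)
    for domain, name, sha in packages:
        by_domain[domain].append((name, sha))

    plan = {}
    for domain, pkgs in by_domain.items():
        seen = set()  # sha256s created by earlier batches in this domain
        batches = []
        for batch in batched(pkgs, batch_size):
            in_batch = set()
            to_create = []
            for _name, sha in batch:
                if sha in in_batch:  # check 1: duplicate within this batch
                    continue
                if sha in seen:  # check 2: duplicate from a previous batch
                    continue
                in_batch.add(sha)
                to_create.append(sha)
            seen.update(in_batch)
            batches.append(to_create)
        plan[domain] = batches
    return plan
```

With the reupload scenario above (two `a.1.whl` packages in one domain sharing metadata `xyz`), only one `xyz` artifact is planned for that domain, while a different domain still gets its own copy.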

Hopefully I didn't screw up the logic anywhere.

fixes: #1071

Development

Successfully merging this pull request may close these issues.

Migration 0019 fails with uniqueness violation
