Enable persistent cache for incremental dataset caching #1348
BitcrushedHeart wants to merge 1 commit into Nerogar:master
Pass persistent_key_in_name='image_path' to DiskCache constructors so that mgds can build stable file-to-cache mappings. This enables incremental caching: only new or modified files are re-cached instead of the entire dataset.

Depends on: Nerogar/mgds#44

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This will conflict with #1134.
I recommend closing. Let's not open drafts of draft PRs, especially not ones that conflict. Speak with Maed to understand what you need and whether it already exists; if it doesn't, try to come to a compromise that suits both, and we can commit to the original. TL;DR: discuss before submitting PRs, as our contribution guide asks.
If person A submits a PR and person B wants to make changes to that PR, they normally can submit a PR to the branch in person A's repository and discuss it there. That's not possible in this case, because this is an addition in OneTrainer for a PR that's in mgds.
Once/if #1134 is merged, I'll see whether this PR is still needed (or whether this implementation works better). If it isn't needed, I'll close it; if it's still valid, I'll update it and reference #1134 specifically.
Passes persistent_key_in_name='image_path' to all DiskCache constructors so that mgds can track which cache files belong to which images. Without this, changing even a single image in your dataset causes the entire cache to be rebuilt from scratch. With this change, only new or modified files get re-cached.
Based on Nerogar/mgds#44 by @maedtb.
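
To illustrate the idea of a stable file-to-cache mapping, here is a minimal standalone sketch. It is not the mgds implementation: the names `fingerprint`, `plan_recache`, `update_manifest`, and the `manifest` dict are hypothetical, and detecting changes with a content hash is an assumption (mgds#44 may detect modifications differently).

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Hash a file's bytes so that any edit to the file changes the result.

    Assumption: a content hash is used here only for illustration.
    """
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_recache(image_paths, manifest):
    """Split image_paths into (stale, fresh) lists of path strings.

    manifest is a hypothetical dict mapping str(image_path) to the
    fingerprint recorded the last time that image was cached.
    """
    stale, fresh = [], []
    for path in image_paths:
        key = str(path)
        if manifest.get(key) == fingerprint(Path(path)):
            fresh.append(key)   # unchanged: the existing cache entry can be reused
        else:
            stale.append(key)   # new or modified: must be cached again
    return stale, fresh


def update_manifest(manifest, stale_paths):
    """Record current fingerprints once the stale entries have been re-cached."""
    for key in stale_paths:
        manifest[key] = fingerprint(Path(key))
```

Because entries are keyed by image path rather than by position in the dataset, adding or editing one file only makes that file's entry stale; everything else stays in the fresh list.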
Files changed:
modules/dataLoader/mixin/DataLoaderText2ImageMixin.py - both image and text DiskCache calls
modules/dataLoader/StableDiffusionFineTuneVaeDataLoader.py - VAE fine-tune DiskCache call