Make artifact path matching match the official runner#126
Conversation
|
Is it broken or do you mean it didn't match gitlab's runner behaviour? If the former can you point to the commit that introduced the brokeness. In the latter case apart from the bits you documented is this change the behaviour of the runner in other ways that might impact users? |
1bfd08f to
7013664
Compare
|
I updated the commit message / PR body. I guess strictly speaking, it's not "broken" in that before we weren't doing any filtering at all, but I'd argue it's still "broken" in that we have a bunch of consumers of the runner (e.g. ci-package-builder) that actually give "correct" values for |
There was a problem hiding this comment.
Pull request overview
This PR fixes artifact path filtering to better match GitLab Runner’s semantics by normalizing paths and matching patterns against ancestor directories with path-separator-aware globbing.
Changes:
- Add
artifact_path_matches()to normalize artifact patterns and match them against successive ancestors of each candidate file path usinggloboptions that prevent wildcards from crossing path components. - Normalize
UploadableFilepaths before both filtering and uploading, aligning behavior withfilepath.Clean-style normalization. - Update artifact-related integration tests to cover normalization (
./test2), directory patterns (dir), and trailing slash handling (test/), and add a small unit test suite for the matcher.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| gitlab-runner/src/run.rs | Implements normalized, ancestor-aware artifact path matching and applies normalization to uploaded artifact file paths. |
| gitlab-runner/tests/artifacts.rs | Expands tests to validate directory matching and normalization behavior in upload/download flows. |
| gitlab-runner/Cargo.toml | Adds the normalize-path dependency used for lexical path normalization. |
| Cargo.lock | Locks the new dependency. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
For reference the referred to commit is since the gitlab runner v0.0.8 crate release . So this has been in the lava gitlab runner since release v0.3.0, which is widely deployed and used. I didn't see complaints there but the artifact that is uses is somewaht minimal as well The OBS gitlab runner never had a release so any kind of artifact filtering will be new for it. So wherever you say we weren't doing this or that it's specific to that runner ;) |
|
@copilot Please consider the changes this PR introduces and add a comment with an example table where there would be functional differences from a users point of view (iotw. where the wildcards now have a different meaning) |
Relative to the filtering added in v0.0.8, the user-visible differences are mainly the cases where
There are also two non-wildcard behavior changes worth calling out:
|
@copilot please further extend this table with items that are unchanged and work in your non-wildcard behaviours as well |
|
|
So specifically lava-gitlab-runner only generates file named |
|
For obs-gitlab-runner; it seems at least in the Collabora usage we primarily use the artifact path matching as part of the job generated by the |
|
Now that i understand what's going on can you update your commit message a bit to say e.g. "Match gitlab-runner behaviour" or somesuch. As nothing was really broken it just didn't match. Would also be nice to explain the impact a la the comparison table copilot create as that's a lot more clear (though your explanation as complete). Also if you have a listing, don't add extra newlines, that makes it render in odd ways :) |
7013664 to
292af12
Compare
292af12 to
93ddead
Compare
(I'm not really sure what this means?) |
Older versions of this crate did not filter uploaded artifacts at all, regardless of the value of `artifacts` the user gave. In commit eccfdbc, artifact filtering was added, by treating `artifacts` entries as glob patterns and matching them against the artifact paths the runner instances proposed. However, gitlab-runner itself instead uses doublestar to traverse the current tree for artifacts, which has a few notable implications: - Paths are normalized first with filepath.Clean & filepath.ToSlash: https://github.com/bmatcuk/doublestar/blob/a9ad9e0ef4d6b7e4443090e9a7201d847a881711/utils.go#L74-L80 So `file` and `./file//` are both the same pattern and will match `file`. (We don't care too much about ToSlash for now, since that should only have an effect for the Windows path separator). - If a pattern matches a directory, the full subtree is added to the archive. - Wildcards `*` and `?` retain the usual filesystem-level semantics of not crossing path components. These behaviors do not line up with the current glob matches, which will only do an exact match on the full artifact path with the default glob options (`require_literal_separator: false`). The result is: - All globs are effectively recursive globs, so `a/b/c` will be matched by any of `a/*`, `*b*`, `a?b?c`. - You need to add an explicit trailing wildcard `/*` to directories, otherwise none of their contents get included. - e.g. `dir` or `dir/` will not match `dir/file`, as it will be looking for full paths named `dir` or `dir/` (including the trailing slash, due to the lack of normalization). To fix this, both the declared artifact path and UploadableFile's path (for consistency) are normalized, and then the patterns are matched on successive ancestors of the current path, while telling `glob` that it needs to give the `/` special treatment. Technically this leaves some other behavior on the table, such as support for brace expansion, or the aforementioned Windows separator support, or handling absolute paths: https://gitlab.com/gitlab-org/gitlab-runner/-/blob/fc6a8943d99928a9d293186be03674c333fd09fa/commands/helpers/file_archiver.go#L191 But that's all leaning more towards fringe functionality that we probably don't care about here (at least not for now).
93ddead to
e629737
Compare
This contains fixes for logs being blank in some GitLab versions: collabora/gitlab-runner-rs#125 https://docs.gitlab.com/releases/patches/patch-release-gitlab-19-0-2-released/ as well as improved support for artifact paths that now matches GitLab's usual semantics: collabora/gitlab-runner-rs#126 Fixes #119.
Older versions of gitlab-runner did not filter uploaded artifacts at all, regardless of the value of
artifactsthe user gave. In commit eccfdbc, artifact filtering was added, by treatingartifactsentries as glob patterns and matching them against the artifact paths the runner instances proposed. However, gitlab-runner itself instead uses doublestar to traverse the current tree for artifacts, which has a few notable implications:Paths are normalized first with filepath.Clean & filepath.ToSlash:
https://github.com/bmatcuk/doublestar/blob/a9ad9e0ef4d6b7e4443090e9a7201d847a881711/utils.go#L74-L80
So
fileand./file//are both the same pattern and will matchfile. (We don't care too much about ToSlash for now, since that should only have an effect for the Windows path separator).If a pattern matches a directory, the full subtree is added to the archive.
Wildcards
*and?retain the usual filesystem-level semantics of not crossing path components.These behaviors do not line up with the current glob matches, which will only do an exact match on the full artifact path with the default glob options (
require_literal_separator: false). The result is:a/b/cwill be matched by any ofa/*,*b*,a?b?c./*to directories, otherwise none of their contents get included.dirordir/will not matchdir/file, as it will be looking for full paths nameddirordir/(including the trailing slash, due to the lack of normalization).To fix this, both the declared artifact path and UploadableFile's path (for consistency) are normalized, and then the patterns are matched on successive ancestors of the current path, while telling
globthat it needs to give the/special treatment.Technically this leaves some other gitlab-runner behavior on the table, such as support for brace expansion, or the aforementioned Windows separator support, or handling absolute paths:
https://gitlab.com/gitlab-org/gitlab-runner/-/blob/fc6a8943d99928a9d293186be03674c333fd09fa/commands/helpers/file_archiver.go#L191
But that's all leaning more towards fringe functionality that we probably don't care about here (at least not for now).