fix: implement backend resolution for Docker Hub and Hugging Face searches#754
fix: implement backend resolution for Docker Hub and Hugging Face searches#754ilopezluna wants to merge 2 commits intomainfrom
Conversation
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In resolveSearchResultBackends, the goroutine closes over the loop variable i (
for i := range resolved { group.Go(func() error { ... resolved[i] ... }) }), which can lead to races and incorrect indexing; capture the index or the SearchResult into a local variable before starting the goroutine. - The table-driven tests that use subtests with t.Parallel (e.g., in backend_resolution_test.go) capture the loop variable directly in the closure; assign
tt := ttinside the loop before t.Run to avoid data races and flaky tests on older Go versions.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In resolveSearchResultBackends, the goroutine closes over the loop variable i (`for i := range resolved { group.Go(func() error { ... resolved[i] ... }) }`), which can lead to races and incorrect indexing; capture the index or the SearchResult into a local variable before starting the goroutine.
- The table-driven tests that use subtests with t.Parallel (e.g., in backend_resolution_test.go) capture the loop variable directly in the closure; assign `tt := tt` inside the loop before t.Run to avoid data races and flaky tests on older Go versions.
## Individual Comments
### Comment 1
<location path="cmd/cli/search/backend_resolution.go" line_range="169-170" />
<code_context>
+ group, workerCtx := errgroup.WithContext(ctx)
+ group.SetLimit(resolveConcurrency)
+
+ for i := range resolved {
+ group.Go(func() error {
+ backend, err := resolve(workerCtx, resolved[i])
+ if err != nil || backend == "" {
</code_context>
<issue_to_address>
**issue (bug_risk):** Loop variable `i` is captured by the goroutine closure, which can lead to races and incorrect backend assignments.
Because Go reuses the loop variable across iterations, the closure may see a different `i` than the one intended when the goroutine runs, so it can read/write the wrong `resolved` entry. Capture `i` in a new variable inside the loop, e.g.
```go
for i := range resolved {
i := i // capture
group.Go(func() error {
backend, err := resolve(workerCtx, resolved[i])
if err != nil || backend == "" {
resolved[i].Backend = backendUnknown
return nil
}
resolved[i].Backend = backend
return nil
})
}
```
Alternatively, take the pointer once and close over that: `for i := range resolved { res := &resolved[i]; group.Go(func() error { /* use res */ }) }`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| for i := range resolved { | ||
| group.Go(func() error { |
There was a problem hiding this comment.
issue (bug_risk): Loop variable i is captured by the goroutine closure, which can lead to races and incorrect backend assignments.
Because Go reuses the loop variable across iterations, the closure may see a different i than the one intended when the goroutine runs, so it can read/write the wrong resolved entry. Capture i in a new variable inside the loop, e.g.
for i := range resolved {
i := i // capture
group.Go(func() error {
backend, err := resolve(workerCtx, resolved[i])
if err != nil || backend == "" {
resolved[i].Backend = backendUnknown
return nil
}
resolved[i].Backend = backend
return nil
})
}Alternatively, take the pointer once and close over that: for i := range resolved { res := &resolved[i]; group.Go(func() error { /* use res */ }) }.
There was a problem hiding this comment.
Code Review
This pull request introduces a more robust backend resolution mechanism for Docker Hub and Hugging Face searches, which is a great improvement. The code is well-structured, using interfaces and concurrent processing for efficiency. However, I found a critical race condition in the concurrent backend resolution logic that could lead to incorrect results. My review includes a specific comment with a suggested fix for this issue.
| for i := range resolved { | ||
| group.Go(func() error { | ||
| backend, err := resolve(workerCtx, resolved[i]) | ||
| if err != nil || backend == "" { | ||
| resolved[i].Backend = backendUnknown | ||
| return nil | ||
| } | ||
| resolved[i].Backend = backend | ||
| return nil | ||
| }) | ||
| } |
There was a problem hiding this comment.
There is a race condition in this loop due to the capture of the loop variable i in the goroutine's closure. When resolveConcurrency is greater than 1, multiple goroutines will likely operate on the same, incorrect index of the resolved slice, leading to incorrect results and some search results not having their backends resolved.
This is a classic Go pitfall for for loops with goroutines. To fix this, you should create a new variable within the loop scope to capture the correct value of i for each iteration.
The associated tests in dockerhub_test.go and huggingface_test.go use resolveConcurrency: 1, which prevents this race condition from being triggered. It would be beneficial to add a test case with a higher concurrency level to catch such issues in the future.
| for i := range resolved { | |
| group.Go(func() error { | |
| backend, err := resolve(workerCtx, resolved[i]) | |
| if err != nil || backend == "" { | |
| resolved[i].Backend = backendUnknown | |
| return nil | |
| } | |
| resolved[i].Backend = backend | |
| return nil | |
| }) | |
| } | |
| for i := range resolved { | |
| i := i // Capture loop variable | |
| group.Go(func() error { | |
| backend, err := resolve(workerCtx, resolved[i]) | |
| if err != nil || backend == "" { | |
| resolved[i].Backend = backendUnknown | |
| return nil | |
| } | |
| resolved[i].Backend = backend | |
| return nil | |
| }) | |
| } |
There was a problem hiding this comment.
Since this project uses Go 1.22+ (go 1.26.0 in go.mod), loop variables are scoped per-iteration, so the for i := range resolved closure is safe and there's no race condition here.
In fact, adding i := i would trigger the copyloopvar linter rule, which flags unnecessary copies of loop variables in Go 1.22+.
That said, I've added a TestResolveSearchResultBackendsConcurrent test that exercises resolveSearchResultBackends with high concurrency (20 results, full parallelism) to explicitly verify correct concurrent behavior and provide better test coverage.
Fixes #746
This pull request introduces a more robust and accurate mechanism for determining the backend type (e.g.,
llama.cpp,vllm,diffusers, orunknown) for models returned by Docker Hub and HuggingFace search. Instead of inferring backend types from tags, names, or descriptions, the backend is now verified using actual model artifacts or files, and this verification is performed concurrently for search results. The changes also include comprehensive unit tests to ensure correctness of backend resolution.