feat(BassetlawDistrictCouncil): add new scraper using ReCollect API by InertiaUK · Pull Request #2049 · robbrad/UKBinCollectionData

InertiaUK · 2026-05-12T10:19:35Z

Summary

New scraper for Bassetlaw District Council using the ReCollect API (area BassetlawUK, service 50015)
Multi-strategy address resolution: full address text search, postcode qualifier expansion, and combined paon+postcode fallback
Test address: 2 Albert Road, DN22 6JD (verified returning Green Waste, Blue Recycling, and Glass bins)

Summary by CodeRabbit

New Features
- Added support for Bassetlaw District Council bin collection schedules.
Tests
- Updated test configuration to include Bassetlaw District Council and refined North Hertfordshire District Council test data.

Uses ReCollect API (area BassetlawUK, service 50015) with multi-strategy address resolution: full address text, postcode qualifier expansion, and combined paon+postcode search. Test address: 2 Albert Road, DN22 6JD Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

coderabbitai · 2026-05-12T10:19:49Z

Warning

Review limit reached

@InertiaUK, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 2 reviews/hour. Refill in 23 minutes and 59 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0d340679-9df5-4939-bc20-6483ff3d0b4b

📥 Commits

Reviewing files that changed from the base of the PR and between cf6b757 and a8aba99.

📒 Files selected for processing (1)

uk_bin_collection/uk_bin_collection/councils/BassetlawDistrictCouncil.py

📝 Walkthrough

Walkthrough

This PR adds a new Bassetlaw District Council integration module that fetches bin collection data via the Recollect API, resolving addresses to place IDs through multiple fallback strategies and returning scheduled pickups over a 60-day window. Test fixtures are added to support the integration.

Changes

Bassetlaw District Council Integration

Layer / File(s)	Summary
Bassetlaw council integration module `uk_bin_collection/uk_bin_collection/councils/BassetlawDistrictCouncil.py`	HTTP headers and configuration constants (AREA, SERVICE) establish the module foundation. Address-to-place_id resolution uses three staged Recollect API strategies: direct PAON text search, postcode-based lookup with parcel/qualifier matching, and combined PAON+postcode fallback. If no place_id is found, `ValueError` is raised. Event fetching queries the Recollect events endpoint over a 60-day window, filters to `pickup` events, extracts bin types from event flags, deduplicates by (bin_type, collection_date), sorts chronologically, and returns the structured bin data object.
Test fixture for Bassetlaw `uk_bin_collection/tests/input.json`	BassetlawDistrictCouncil test entry defines LAD24CD, house number, postcode, skip_get_url flag, empty UPRN, base URL, and wiki name for integration testing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A new warren joins the fold,
Bassetlaw's bins in Recollect's hold,
Three address paths and sixty days wide,
Events are sorted, duplicates denied!
Test fixtures rest, all ready to play. 🗑️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat(BassetlawDistrictCouncil): add new scraper using ReCollect API' directly and clearly describes the main change: adding a new scraper for Bassetlaw District Council using the ReCollect API.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-05-12T10:22:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (8ecf878) to head (a8aba99).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #2049   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/BassetlawDistrictCouncil.py`:
- Around line 122-128: The parsing silently skips malformed payloads: change the
logic around events_data, the events loop and the day check to fail fast by
validating the upstream contract and raising exceptions when it's violated;
specifically, in the method that assigns events_data (variable events_data)
ensure the "events" key exists and is a list (raise ValueError/RuntimeError if
missing or wrong type instead of using events_data.get("events", []")), and
inside the for loop that iterates events, validate each event contains a
non-empty "day" field (raise an exception mentioning the offending event when
day is missing) so the scraper surfaces ReCollect contract changes rather than
producing incomplete bins.
- Around line 21-22: parse_data currently only reads paon from kwargs, so
user_paon becomes None when the fixture supplies house_number; update the
assignment to read house_number as a fallback (e.g., set user_paon =
kwargs.get("paon") or kwargs.get("house_number")) so the subsequent matching
logic in parse_data (uses of user_paon around the postcode matching and
selection at the branches that pick results) will prefer house_number when paon
is absent; ensure the same fallback is used wherever user_paon is referenced in
this class.
- Around line 45-88: The code currently defaults to the first candidate when no
PAON match is found (assigning place_id = parcels[0]["place_id"] and place_id =
addresses[0]["place_id"]) and also ignores multiple qualifiers; instead, detect
ambiguity and fail loudly: when more than one parcel/qualifier/address exists
and no explicit PAON match was found, raise a clear exception (e.g., ValueError
or a custom AddressAmbiguousError) rather than selecting the first entry; update
the logic in BassetlawDistrictCouncil.py around the parcels, qualifiers and
addresses handling (the blocks that set place_id from parcels, qualifiers[0],
and addresses) to return/raise an error indicating an ambiguous postcode so
callers can handle it explicitly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 55d53489-61ca-49c1-85f0-3d89d5a3250d

📥 Commits

Reviewing files that changed from the base of the PR and between 8ecf878 and cf6b757.

📒 Files selected for processing (2)

uk_bin_collection/tests/input.json
uk_bin_collection/uk_bin_collection/councils/BassetlawDistrictCouncil.py

coderabbitai · 2026-05-12T10:24:02Z

+            # If postcode returns parcels directly, match by house number
+            parcels = [r for r in results if r.get("type") == "parcel" and r.get("place_id")]
+            if parcels:
+                if user_paon:
+                    for p in parcels:
+                        name = p.get("name", "")
+                        if name.lower().startswith(user_paon.lower()):
+                            place_id = p["place_id"]
+                            break
+                if not place_id and parcels:
+                    place_id = parcels[0]["place_id"]
+
+            # If postcode returns qualifiers, get all addresses for that postcode
+            if not place_id:
+                qualifiers = [r for r in results if r.get("type") == "place_qualifier"]
+                if qualifiers:
+                    qual = qualifiers[0]
+                    # Fetch all addresses in this postcode area
+                    params2 = {"q": "", "locale": "en-GB", "place_qualifier_id": qual["qualifier_id"]}
+                    response2 = requests.get(suggest_url, headers=HEADERS, params=params2, timeout=30)
+                    response2.raise_for_status()
+                    addresses = [r for r in response2.json() if r.get("type") == "parcel" and r.get("place_id")]
+
+                    if not addresses:
+                        # Qualifier empty search fails — try with postcode text inside qualifier
+                        params3 = {"q": postcode, "locale": "en-GB", "place_qualifier_id": qual["qualifier_id"]}
+                        response3 = requests.get(suggest_url, headers=HEADERS, params=params3, timeout=30)
+                        response3.raise_for_status()
+                        addresses = [r for r in response3.json() if r.get("type") == "parcel" and r.get("place_id")]
+
+                    if addresses and user_paon:
+                        for addr in addresses:
+                            name = addr.get("name", "")
+                            if name.lower().startswith(user_paon.lower()):
+                                place_id = addr["place_id"]
+                                break
+                            # Match house number at start of address
+                            num_match = re.match(r"^(\d+[a-zA-Z]?)\b", user_paon.strip())
+                            if num_match and name.startswith(num_match.group(1) + " "):
+                                place_id = addr["place_id"]
+                                break
+
+                    if not place_id and addresses:
+                        place_id = addresses[0]["place_id"]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't default ambiguous postcode matches to the first candidate.

Line 55 and Line 88 silently pick the first returned parcel/address when no exact PAON match is found. On a multi-property postcode, that can return a valid-looking schedule for the wrong household. The same ambiguity exists at Line 61 when multiple qualifiers are returned.

Suggested change

- if not place_id and parcels: - place_id = parcels[0]["place_id"] + if not place_id and len(parcels) == 1: + place_id = parcels[0]["place_id"] + elif not place_id and parcels: + raise ValueError( + f"Ambiguous postcode match for {postcode}; please provide a more specific address" + ) - if qualifiers: + if len(qualifiers) == 1: qual = qualifiers[0] + elif qualifiers: + raise ValueError( + f"Ambiguous postcode qualifier match for {postcode}" + ) + else: + qual = None - if not place_id and addresses: - place_id = addresses[0]["place_id"] + if not place_id and len(addresses) == 1: + place_id = addresses[0]["place_id"] + elif not place_id and addresses: + raise ValueError( + f"Ambiguous address match for {postcode}; please provide a more specific address" + )

Based on learnings: In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/BassetlawDistrictCouncil.py` around lines 45 - 88, The code currently defaults to the first candidate when no PAON match is found (assigning place_id = parcels[0]["place_id"] and place_id = addresses[0]["place_id"]) and also ignores multiple qualifiers; instead, detect ambiguity and fail loudly: when more than one parcel/qualifier/address exists and no explicit PAON match was found, raise a clear exception (e.g., ValueError or a custom AddressAmbiguousError) rather than selecting the first entry; update the logic in BassetlawDistrictCouncil.py around the parcels, qualifiers and addresses handling (the blocks that set place_id from parcels, qualifiers[0], and addresses) to return/raise an error indicating an ambiguous postcode so callers can handle it explicitly.

coderabbitai · 2026-05-12T10:24:02Z

+        events_data = response.json()
+
+        seen = set()
+        for event in events_data.get("events", []):
+            day = event.get("day")
+            if not day:
+                continue


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Raise on malformed event payloads instead of silently dropping them.

events_data.get("events", []) and if not day: continue turn upstream contract drift into an empty or incomplete bins list. If ReCollect changes the payload shape, this scraper should fail fast rather than under-report collections.

Suggested change

- events_data = response.json() + events_data = response.json() + events = events_data.get("events") + if not isinstance(events, list): + raise ValueError( + f"Unexpected events payload for place_id {place_id}: missing 'events' list" + ) seen = set() - for event in events_data.get("events", []): + for event in events: day = event.get("day") if not day: - continue + raise ValueError( + f"Unexpected event payload for place_id {place_id}: missing 'day'" + )

Based on learnings: In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/BassetlawDistrictCouncil.py` around lines 122 - 128, The parsing silently skips malformed payloads: change the logic around events_data, the events loop and the day check to fail fast by validating the upstream contract and raising exceptions when it's violated; specifically, in the method that assigns events_data (variable events_data) ensure the "events" key exists and is a list (raise ValueError/RuntimeError if missing or wrong type instead of using events_data.get("events", []")), and inside the for loop that iterates events, validate each event contains a non-empty "day" field (raise an exception mentioning the offending event when day is missing) so the scraper surfaces ReCollect contract changes rather than producing incomplete bins.

coderabbitai Bot reviewed May 12, 2026

View reviewed changes

fix: address CodeRabbit review feedback

a8aba99

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(BassetlawDistrictCouncil): add new scraper using ReCollect API#2049

feat(BassetlawDistrictCouncil): add new scraper using ReCollect API#2049
InertiaUK wants to merge 2 commits into
robbrad:masterfrom
InertiaUK:fix/bassetlaw-test-postcode

InertiaUK commented May 12, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 12, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented May 12, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot May 12, 2026

Uh oh!

coderabbitai Bot May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

InertiaUK commented May 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

InertiaUK commented May 12, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 12, 2026 •

edited

Loading

codecov Bot commented May 12, 2026 •

edited

Loading