feat: add postcode lookup for south derbyshire district council#2065
feat: add postcode lookup for south derbyshire district council#2065InertiaUK wants to merge 3 commits into
Conversation
Users can provide either UPRN or postcode + house number. UPRN takes priority when provided (backward compatible). Uses iShareLIVE LocationSearch endpoint for postcode-to-address resolution, then matches by house number/name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughSouthDerbyshireDistrictCouncil scraper now resolves UPRN via LocationSearch (postcode → match), then fetches bin collections by UPRN from iShareLIVE, unwrapping JSONP and extracting dates/types from embedded HTML. Test config updated to prefer postcode and house_number inputs. ChangesSouth Derbyshire UPRN Resolution & Bin Lookup
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2065 +/- ##
=======================================
Coverage 86.67% 86.67%
=======================================
Files 9 9
Lines 1141 1141
=======================================
Hits 989 989
Misses 152 152 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py`:
- Around line 37-39: The code is reading kwargs.get("paon") but the incoming
fixtures/docs use "house_number", so update the parameter extraction in the
SouthDerbyshireDistrictCouncil class (where user_postcode and user_paon are
defined) to accept both keys: read kwargs.get("paon") and fallback to
kwargs.get("house_number") (or vice‑versa) so user_paon contains the house
number if provided; also ensure the postcode matching logic that currently falls
back to results[0] when matching (the block that uses user_paon to disambiguate
results) uses this normalized user_paon value to compare before defaulting to
the first result.
- Around line 71-91: The code silently returns an empty data["bins"] when the
JSONP regex or expected payload shape is missing; update the parsing block
around json_match, parsed_data and html_content to validate each step and raise
a clear exception if the regex fails, if parsed_data lacks the "Results" ->
"Next_Bin_Collections" -> "_" key, or if re.findall returns no matches — include
contextual details in the exception so callers can detect upstream format
changes instead of getting an empty bins list.
- Around line 40-41: Remove the global suppression of TLS warnings and re-enable
certificate verification in the SouthDerbyshireDistrictCouncil scraper: delete
the call to requests.packages.urllib3.disable_warnings() and change any requests
calls in this class (e.g., the GET/POST calls inside
SouthDerbyshireDistrictCouncil methods that currently pass verify=False) to use
verify=True or omit the verify parameter entirely so default verification is
used; ensure no other warning-suppression calls remain in this class.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 89680c95-9243-482a-b5fa-4ab14b1f8bd8
📒 Files selected for processing (2)
uk_bin_collection/tests/input.jsonuk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py (1)
78-91:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRaise when no collection rows are extracted.
This still returns
{"bins": []}when the HTML shape changes enough forre.findall(...)to miss, which hides a scraper break as a valid empty result.Based on learnings: prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.Suggested fix
matches = re.findall( r"<span.*?>(\d{2} \w+ \d{4})</span>.*?<span.*?>(.*?)</span>", html_content, re.S, ) + + if not matches: + raise ValueError( + "Unexpected South Derbyshire bin HTML format; no collection entries found" + ) for match in matches:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py` around lines 78 - 91, The code currently silently returns an empty data["bins"] when re.findall(...) (stored in matches) finds no rows; update the parsing block in SouthDerbyshireDistrictCouncil (the method that uses html_content and matches) to explicitly fail when matches is empty by raising a descriptive exception (e.g., ValueError or a ScraperError) that notes "no collection rows extracted" and includes context such as the length of html_content or a small snippet for debugging; keep the rest of the loop unchanged so valid matches still populate data["bins"].
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In
`@uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py`:
- Around line 78-91: The code currently silently returns an empty data["bins"]
when re.findall(...) (stored in matches) finds no rows; update the parsing block
in SouthDerbyshireDistrictCouncil (the method that uses html_content and
matches) to explicitly fail when matches is empty by raising a descriptive
exception (e.g., ValueError or a ScraperError) that notes "no collection rows
extracted" and includes context such as the length of html_content or a small
snippet for debugging; keep the rest of the loop unchanged so valid matches
still populate data["bins"].
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e98b81a3-37e2-4234-8b16-279edc0600a4
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py
Summary
Testing
DE11 7AB+ UPRN10000820668✅DE11 7AB+ paon2✅Summary by CodeRabbit