Skip to content

feat: add postcode lookup for south derbyshire district council#2065

Open
InertiaUK wants to merge 3 commits into
robbrad:masterfrom
InertiaUK:feat/south-derbyshire-postcode-lookup
Open

feat: add postcode lookup for south derbyshire district council#2065
InertiaUK wants to merge 3 commits into
robbrad:masterfrom
InertiaUK:feat/south-derbyshire-postcode-lookup

Conversation

@InertiaUK
Copy link
Copy Markdown
Contributor

@InertiaUK InertiaUK commented May 12, 2026

Summary

  • Users can provide either UPRN or postcode + house number
  • UPRN takes priority when provided (backward compatible)
  • Uses iShareLIVE LocationSearch endpoint for postcode-to-address resolution
  • Matches by house number/name with fallback to first result
  • Falls back to first result if no match found

Testing

  • UPRN path (backward compat): DE11 7AB + UPRN 10000820668
  • Postcode + house number: DE11 7AB + paon 2
  • Tested via API end-to-end ✅

Summary by CodeRabbit

  • Improvements
    • South Derbyshire integration now accepts postcode + house number (UPRN optional) for address lookup, making bin schedule lookups more flexible.
  • Bug Fixes
    • More reliable retrieval and parsing of collection data with clearer error feedback when an address cannot be resolved.

Review Change Stack

Users can provide either UPRN or postcode + house number.
UPRN takes priority when provided (backward compatible).
Uses iShareLIVE LocationSearch endpoint for postcode-to-address
resolution, then matches by house number/name.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

Walkthrough

SouthDerbyshireDistrictCouncil scraper now resolves UPRN via LocationSearch (postcode → match), then fetches bin collections by UPRN from iShareLIVE, unwrapping JSONP and extracting dates/types from embedded HTML. Test config updated to prefer postcode and house_number inputs.

Changes

South Derbyshire UPRN Resolution & Bin Lookup

Layer / File(s) Summary
Endpoint Constants & Address Matcher
uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py
Module-level SEARCH_URL, BIN_URL, and HEADERS added. _match_address() selects a UPRN from LocationSearch results by UPRN or normalized PAON/house-number match, otherwise returns first result.
UPRN Resolution from Postcode
uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py
parse_data() resolves user_uprn from postcode when uprn is missing by querying SEARCH_URL and applying _match_address; raises ValueError if resolution fails.
Bins Fetch & JSONP/HTML Parsing
uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py
Builds bins URL using resolved UPRN, requests with shared headers (verify=False, 30s timeout), unwraps JSONP via import((...)) regex, parses JSON and embedded HTML, and extracts collection type and collectionDate entries.
Test Configuration
uk_bin_collection/tests/input.json
SouthDerbyshireDistrictCouncil entry updated: url set to base endpoint, wiki_command_url_override removed, wiki_note updated to request postcode + house_number (with uprn optional), and postcode/house_number fields added.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 A postcode hops into the search,
UPRN pops up with a cheerful twitch,
JSONP peels off its coat by regex light,
HTML whispers dates into the night,
South Derbyshire bins bound into sight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add postcode lookup for south derbyshire district council' clearly describes the main feature addition: enabling postcode lookup functionality for the South Derbyshire District Council council module.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (8ecf878) to head (5e37778).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2065   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py`:
- Around line 37-39: The code is reading kwargs.get("paon") but the incoming
fixtures/docs use "house_number", so update the parameter extraction in the
SouthDerbyshireDistrictCouncil class (where user_postcode and user_paon are
defined) to accept both keys: read kwargs.get("paon") and fallback to
kwargs.get("house_number") (or vice‑versa) so user_paon contains the house
number if provided; also ensure the postcode matching logic that currently falls
back to results[0] when matching (the block that uses user_paon to disambiguate
results) uses this normalized user_paon value to compare before defaulting to
the first result.
- Around line 71-91: The code silently returns an empty data["bins"] when the
JSONP regex or expected payload shape is missing; update the parsing block
around json_match, parsed_data and html_content to validate each step and raise
a clear exception if the regex fails, if parsed_data lacks the "Results" ->
"Next_Bin_Collections" -> "_" key, or if re.findall returns no matches — include
contextual details in the exception so callers can detect upstream format
changes instead of getting an empty bins list.
- Around line 40-41: Remove the global suppression of TLS warnings and re-enable
certificate verification in the SouthDerbyshireDistrictCouncil scraper: delete
the call to requests.packages.urllib3.disable_warnings() and change any requests
calls in this class (e.g., the GET/POST calls inside
SouthDerbyshireDistrictCouncil methods that currently pass verify=False) to use
verify=True or omit the verify parameter entirely so default verification is
used; ensure no other warning-suppression calls remain in this class.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 89680c95-9243-482a-b5fa-4ab14b1f8bd8

📥 Commits

Reviewing files that changed from the base of the PR and between 8ecf878 and 3964584.

📒 Files selected for processing (2)
  • uk_bin_collection/tests/input.json
  • uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py

Comment thread uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py (1)

78-91: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Raise when no collection rows are extracted.

This still returns {"bins": []} when the HTML shape changes enough for re.findall(...) to miss, which hides a scraper break as a valid empty result.

Suggested fix
         matches = re.findall(
             r"<span.*?>(\d{2} \w+ \d{4})</span>.*?<span.*?>(.*?)</span>",
             html_content,
             re.S,
         )
+
+        if not matches:
+            raise ValueError(
+                "Unexpected South Derbyshire bin HTML format; no collection entries found"
+            )
 
         for match in matches:
Based on learnings: prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py`
around lines 78 - 91, The code currently silently returns an empty data["bins"]
when re.findall(...) (stored in matches) finds no rows; update the parsing block
in SouthDerbyshireDistrictCouncil (the method that uses html_content and
matches) to explicitly fail when matches is empty by raising a descriptive
exception (e.g., ValueError or a ScraperError) that notes "no collection rows
extracted" and includes context such as the length of html_content or a small
snippet for debugging; keep the rest of the loop unchanged so valid matches
still populate data["bins"].
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In
`@uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py`:
- Around line 78-91: The code currently silently returns an empty data["bins"]
when re.findall(...) (stored in matches) finds no rows; update the parsing block
in SouthDerbyshireDistrictCouncil (the method that uses html_content and
matches) to explicitly fail when matches is empty by raising a descriptive
exception (e.g., ValueError or a ScraperError) that notes "no collection rows
extracted" and includes context such as the length of html_content or a small
snippet for debugging; keep the rest of the loop unchanged so valid matches
still populate data["bins"].

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e98b81a3-37e2-4234-8b16-279edc0600a4

📥 Commits

Reviewing files that changed from the base of the PR and between 3964584 and 5e37778.

📒 Files selected for processing (1)
  • uk_bin_collection/uk_bin_collection/councils/SouthDerbyshireDistrictCouncil.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant