Skip to content

feat: DumfriesandGallowayCouncil - add postcode + house number lookup#2063

Closed
InertiaUK wants to merge 2 commits into
robbrad:masterfrom
InertiaUK:feat/dumfries-postcode-lookup
Closed

feat: DumfriesandGallowayCouncil - add postcode + house number lookup#2063
InertiaUK wants to merge 2 commits into
robbrad:masterfrom
InertiaUK:feat/dumfries-postcode-lookup

Conversation

@InertiaUK

@InertiaUK InertiaUK commented May 12, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Users can now provide either UPRN or postcode + house number/name — whichever they have.
  • UPRN takes priority when provided (backward compatible with existing users).
  • When no UPRN is given, searches the council`s LocalGov Drupal page by postcode. The address dropdown contains UPRNs as option values — matched by house number/name.
  • Falls back to first address if no match found.

How it works

  1. If UPRN provided → use directly (existing behavior)
  2. If no UPRN → GET /waste-collection-schedule/find?postcode=X → parse <select name="uprn"> → match by house number in option text
  3. Use resolved UPRN to download ICS calendar via existing endpoint

Testing

  • UPRN path (backward compat): UPRN 137005550
  • Postcode + house name: DG7 1AA + paon Glenafton
  • Tested directly on VPS ✅

Summary by CodeRabbit

  • Bug Fixes
    • Dumfries & Galloway lookup now derives addresses from postcode + house name/number when UPRN isn’t provided, fetches calendar data directly, and skips invalid/non-calendar responses for more reliable collection schedules.
  • Documentation
    • Updated guidance: postcode + house number accepted (UPRN optional) and input instructions clarified for Dumfries & Galloway.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 12, 2026

Copy link
Copy Markdown
Contributor

Warning

Review limit reached

@InertiaUK, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 2 reviews/hour. Refill in 19 minutes and 21 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 06766f81-706f-4f74-94db-3dab99f06c7a

📥 Commits

Reviewing files that changed from the base of the PR and between 78948ae and d21cb0c.

📒 Files selected for processing (2)
  • uk_bin_collection/tests/input.json
  • uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py
📝 Walkthrough

Walkthrough

Refactors Dumfries and Galloway council scraper to derive UPRNs from postcode address dropdowns and fetch ICS calendars directly via HTTP. Adds _resolve_uprn() and council SEARCH_URL/ICS_BASE constants; parse_data() now resolves UPRN, downloads and validates ICS content, parses events within 60 days, and returns the data dict.

Changes

Dumfries and Galloway UPRN Resolution and ICS Fetch

Layer / File(s) Summary
UPRN Resolution Infrastructure
uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py
Adds SEARCH_URL and ICS_BASE constants and _resolve_uprn(postcode, uprn=None, paon=None) which posts a postcode to the council endpoint, parses returned <option> entries to extract address/UPRN pairs, and returns the provided UPRN or a derived UPRN (first postcode match or PAON-filtered match).
Parse Data Refactor
uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py
CouncilClass.parse_data() now calls _resolve_uprn() with postcode/paon/uprn, constructs the ICS URL from ICS_BASE + UPRN, fetches ICS via requests, rejects HTML/invalid responses or missing VCALENDAR, parses events in the next 60 days into bins, and returns the data dict.
Tests / Fixtures Update
uk_bin_collection/tests/input.json
Updates the DumfriesandGallowayCouncil test input block to provide postcode and house_number, updates the uprn value, and changes wiki_note to indicate postcode+house number are sufficient (UPRN optional).

Sequence Diagram(s)

sequenceDiagram
    participant parse_data
    participant _resolve_uprn
    participant SEARCH_URL
    participant requests
    participant vcal_parser
    parse_data->>_resolve_uprn: resolve_uprn(postcode, uprn, paon)
    _resolve_uprn->>SEARCH_URL: POST postcode -> receive address options
    SEARCH_URL-->>_resolve_uprn: HTML option list with UPRNs
    _resolve_uprn-->>parse_data: resolved UPRN
    parse_data->>requests: GET ICS_BASE + UPRN
    requests-->>parse_data: ICS or HTML response
    parse_data->>parse_data: validate (reject HTML / require VCALENDAR)
    parse_data->>vcal_parser: parse ICS events within 60 days
    vcal_parser-->>parse_data: events -> build bins
    parse_data-->>parse_data: return data dict
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 From dropdowns and postcodes I hop and I peep,
I fetch the UPRN without Selenium leap,
ICS files parsed, events tidy and neat,
Sixty days of bins all lined up to meet,
A carrot for tests and a hop of delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: DumfriesandGallowayCouncil - add postcode + house number lookup' directly and clearly summarizes the main change: adding postcode and house number lookup capability to the Dumfries and Galloway scraper.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented May 12, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (8ecf878) to head (d21cb0c).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2063   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py (1)

51-51: 💤 Low value

Dead code from Selenium removal.

driver is always None and the finally block that calls driver.quit() is unreachable. This is leftover from the Selenium-based implementation.

Proposed cleanup
     def parse_data(self, page: str, **kwargs) -> dict:
-        driver = None
         try:
             data = {"bins": []}
             ...
         except Exception as e:
             print(f"An error occurred: {e}")
             raise
-        finally:
-            if driver:
-                driver.quit()

         return data

Also applies to: 89-91

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py`
at line 51, Remove dead Selenium leftovers: delete the unused driver variable
initialization and any finally blocks or calls to driver.quit() in the
DumfriesandGallowayCouncil scraping flow (search for the variable driver and any
finally blocks referencing driver.quit()). Ensure no other code expects driver
to exist; if cleanup logic is still needed, replace driver.quit() calls with
appropriate teardown for the current HTTP/requests-based implementation or
remove them entirely in the methods where driver was previously used (e.g.,
inside the main scraping function and the blocks around lines showing driver
initialization and the finally block).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py`:
- Around line 36-46: The code currently falls back to options[0][0] when a
provided paon doesn't match any dropdown entry; change this to raise an explicit
exception instead: if paon is truthy and no matching val is found after both
loops, raise a ValueError (or a CouncilLookupError if you have one) with a clear
message including the paon and available option texts; keep the existing
behavior of returning the matched val when found and only raise when paon was
provided but not matched (do not change behavior when paon is falsy).
- Around line 63-72: The code silently returns an empty data when the fetched
ICS appears invalid and redundantly re-fetches the same URL; remove the local
"import requests as req" and use the module-level requests client (ics_resp),
call ics_resp.raise_for_status() then validate ics_text and if invalid raise a
descriptive exception (include URL and status) instead of returning data, and
pass the already-fetched calendar to events by calling
events(string_content=ics_text, start=now, end=future) rather than
events(ics_url,...), keeping references to ics_resp, ics_text, events(),
string_content, and data to locate changes.

---

Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py`:
- Line 51: Remove dead Selenium leftovers: delete the unused driver variable
initialization and any finally blocks or calls to driver.quit() in the
DumfriesandGallowayCouncil scraping flow (search for the variable driver and any
finally blocks referencing driver.quit()). Ensure no other code expects driver
to exist; if cleanup logic is still needed, replace driver.quit() calls with
appropriate teardown for the current HTTP/requests-based implementation or
remove them entirely in the methods where driver was previously used (e.g.,
inside the main scraping function and the blocks around lines showing driver
initialization and the finally block).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8cdb7591-6009-4518-b0c8-f60acaee4342

📥 Commits

Reviewing files that changed from the base of the PR and between 8ecf878 and c17b254.

📒 Files selected for processing (1)
  • uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py

Comment on lines 63 to 72
import requests as req
ics_resp = req.get(ics_url, timeout=30)
ics_text = ics_resp.text
if "<br" in ics_text or "<html" in ics_text.lower() or "VCALENDAR" not in ics_text:
return data

# Get events from ICS file within the next 60 days
now = datetime.now()
future = now + timedelta(days=60)

# Parse ICS calendar
upcoming_events = events(ics_url, start=now, end=future)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does icalevents Python library support string_content parameter for events function?

💡 Result:

Yes, the icalevents Python library supports a string_content parameter for the events function. The official documentation lists the signature as: icalevents.icalevents.events(url: str | None = None, file: str | Path | None = None, string_content: bytes | str | None = None, start: datetime | None = None, end: datetime | None = None, fix_apple: bool = False, http: PoolManager | None = None, tzinfo: tzinfo | None = None, sort: bool = False, strict: bool = False) → list[Event], where string_content is described as "iCal content as string" [1]. This is corroborated by the source code in icalparser.py, which includes a parse_events function taking content: str [2]. The library's latest release is 0.3.1 as of 2025-09-26 [3][4][5].

Citations:


🏁 Script executed:

head -n 20 uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py

Repository: robbrad/UKBinCollectionData

Length of output: 1969


🏁 Script executed:

sed -n '49,72p' uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py

Repository: robbrad/UKBinCollectionData

Length of output: 918


🏁 Script executed:

# Check if requests is imported at module level
grep -n "^import requests\|^from requests" uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py | head -20

Repository: robbrad/UKBinCollectionData

Length of output: 89


🏁 Script executed:

grep -n "^from\|^import" uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py | head -30

Repository: robbrad/UKBinCollectionData

Length of output: 380


Silent empty return masks errors; ICS content fetched twice.

Two issues in this segment:

  1. Lines 66-67: Returning empty data silently when the ICS response appears invalid hides failures. Prefer raising an explicit exception to surface problems.

  2. Line 72: events(ics_url, ...) re-fetches the URL already retrieved on line 64. Use the string_content parameter instead to avoid the redundant request.

Additionally, line 63 unnecessarily re-imports requests when it's already imported at the module level.

Proposed fix
-            import requests as req
-            ics_resp = req.get(ics_url, timeout=30)
+            ics_resp = requests.get(ics_url, timeout=30)
+            ics_resp.raise_for_status()
             ics_text = ics_resp.text
-            if "<br" in ics_text or "<html" in ics_text.lower() or "VCALENDAR" not in ics_text:
-                return data
+            if "VCALENDAR" not in ics_text:
+                raise ValueError(f"Invalid ICS response for UPRN {resolved_uprn}")

             now = datetime.now()
             future = now + timedelta(days=60)

-            upcoming_events = events(ics_url, start=now, end=future)
+            upcoming_events = events(string_content=ics_text, start=now, end=future)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py`
around lines 63 - 72, The code silently returns an empty data when the fetched
ICS appears invalid and redundantly re-fetches the same URL; remove the local
"import requests as req" and use the module-level requests client (ics_resp),
call ics_resp.raise_for_status() then validate ics_text and if invalid raise a
descriptive exception (include URL and status) instead of returning data, and
pass the already-fetched calendar to events by calling
events(string_content=ics_text, start=now, end=future) rather than
events(ics_url,...), keeping references to ics_resp, ics_text, events(),
string_content, and data to locate changes.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@uk_bin_collection/tests/input.json`:
- Line 703: The UPRN fixture value in uk_bin_collection/tests/input.json is
incorrect (currently "137005550") and should match the PR’s validated sample
"137034556"; update the "uprn" field in that JSON fixture to "137034556" so the
backward-compatible UPRN path validation uses the intended test value (locate
the "uprn" key in the test input JSON and replace its string value).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2bbd438e-8e37-4faa-9349-86a7bf1935bf

📥 Commits

Reviewing files that changed from the base of the PR and between c17b254 and 5eb21c8.

📒 Files selected for processing (1)
  • uk_bin_collection/tests/input.json

Comment thread uk_bin_collection/tests/input.json Outdated
UPRN is no longer required. Searches council LocalGov Drupal page by
postcode to resolve UPRN from address dropdown. Backward compatible.
@InertiaUK InertiaUK force-pushed the feat/dumfries-postcode-lookup branch from 5eb21c8 to 78948ae Compare May 12, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants