feat: DumfriesandGallowayCouncil - add postcode + house number lookup#2063
feat: DumfriesandGallowayCouncil - add postcode + house number lookup#2063InertiaUK wants to merge 2 commits into
Conversation
|
Warning Review limit reached
Your plan currently allows 2 reviews/hour. Refill in 19 minutes and 21 seconds. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more review capacity refills, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughRefactors Dumfries and Galloway council scraper to derive UPRNs from postcode address dropdowns and fetch ICS calendars directly via HTTP. Adds ChangesDumfries and Galloway UPRN Resolution and ICS Fetch
Sequence Diagram(s)sequenceDiagram
participant parse_data
participant _resolve_uprn
participant SEARCH_URL
participant requests
participant vcal_parser
parse_data->>_resolve_uprn: resolve_uprn(postcode, uprn, paon)
_resolve_uprn->>SEARCH_URL: POST postcode -> receive address options
SEARCH_URL-->>_resolve_uprn: HTML option list with UPRNs
_resolve_uprn-->>parse_data: resolved UPRN
parse_data->>requests: GET ICS_BASE + UPRN
requests-->>parse_data: ICS or HTML response
parse_data->>parse_data: validate (reject HTML / require VCALENDAR)
parse_data->>vcal_parser: parse ICS events within 60 days
vcal_parser-->>parse_data: events -> build bins
parse_data-->>parse_data: return data dict
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2063 +/- ##
=======================================
Coverage 86.67% 86.67%
=======================================
Files 9 9
Lines 1141 1141
=======================================
Hits 989 989
Misses 152 152 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py (1)
51-51: 💤 Low valueDead code from Selenium removal.
driveris alwaysNoneand thefinallyblock that callsdriver.quit()is unreachable. This is leftover from the Selenium-based implementation.Proposed cleanup
def parse_data(self, page: str, **kwargs) -> dict: - driver = None try: data = {"bins": []} ... except Exception as e: print(f"An error occurred: {e}") raise - finally: - if driver: - driver.quit() return dataAlso applies to: 89-91
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py` at line 51, Remove dead Selenium leftovers: delete the unused driver variable initialization and any finally blocks or calls to driver.quit() in the DumfriesandGallowayCouncil scraping flow (search for the variable driver and any finally blocks referencing driver.quit()). Ensure no other code expects driver to exist; if cleanup logic is still needed, replace driver.quit() calls with appropriate teardown for the current HTTP/requests-based implementation or remove them entirely in the methods where driver was previously used (e.g., inside the main scraping function and the blocks around lines showing driver initialization and the finally block).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py`:
- Around line 36-46: The code currently falls back to options[0][0] when a
provided paon doesn't match any dropdown entry; change this to raise an explicit
exception instead: if paon is truthy and no matching val is found after both
loops, raise a ValueError (or a CouncilLookupError if you have one) with a clear
message including the paon and available option texts; keep the existing
behavior of returning the matched val when found and only raise when paon was
provided but not matched (do not change behavior when paon is falsy).
- Around line 63-72: The code silently returns an empty data when the fetched
ICS appears invalid and redundantly re-fetches the same URL; remove the local
"import requests as req" and use the module-level requests client (ics_resp),
call ics_resp.raise_for_status() then validate ics_text and if invalid raise a
descriptive exception (include URL and status) instead of returning data, and
pass the already-fetched calendar to events by calling
events(string_content=ics_text, start=now, end=future) rather than
events(ics_url,...), keeping references to ics_resp, ics_text, events(),
string_content, and data to locate changes.
---
Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py`:
- Line 51: Remove dead Selenium leftovers: delete the unused driver variable
initialization and any finally blocks or calls to driver.quit() in the
DumfriesandGallowayCouncil scraping flow (search for the variable driver and any
finally blocks referencing driver.quit()). Ensure no other code expects driver
to exist; if cleanup logic is still needed, replace driver.quit() calls with
appropriate teardown for the current HTTP/requests-based implementation or
remove them entirely in the methods where driver was previously used (e.g.,
inside the main scraping function and the blocks around lines showing driver
initialization and the finally block).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8cdb7591-6009-4518-b0c8-f60acaee4342
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py
| import requests as req | ||
| ics_resp = req.get(ics_url, timeout=30) | ||
| ics_text = ics_resp.text | ||
| if "<br" in ics_text or "<html" in ics_text.lower() or "VCALENDAR" not in ics_text: | ||
| return data | ||
|
|
||
| # Get events from ICS file within the next 60 days | ||
| now = datetime.now() | ||
| future = now + timedelta(days=60) | ||
|
|
||
| # Parse ICS calendar | ||
| upcoming_events = events(ics_url, start=now, end=future) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does icalevents Python library support string_content parameter for events function?
💡 Result:
Yes, the icalevents Python library supports a string_content parameter for the events function. The official documentation lists the signature as: icalevents.icalevents.events(url: str | None = None, file: str | Path | None = None, string_content: bytes | str | None = None, start: datetime | None = None, end: datetime | None = None, fix_apple: bool = False, http: PoolManager | None = None, tzinfo: tzinfo | None = None, sort: bool = False, strict: bool = False) → list[Event], where string_content is described as "iCal content as string" [1]. This is corroborated by the source code in icalparser.py, which includes a parse_events function taking content: str [2]. The library's latest release is 0.3.1 as of 2025-09-26 [3][4][5].
Citations:
- 1: https://icalevents.readthedocs.io/
- 2: https://github.com/irgangla/icalevents/blob/master/icalevents/icalparser.py
- 3: https://github.com/irgangla/icalevents
- 4: https://github.com/jazzband/icalevents
- 5: https://pypi.org/project/icalevents/
🏁 Script executed:
head -n 20 uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.pyRepository: robbrad/UKBinCollectionData
Length of output: 1969
🏁 Script executed:
sed -n '49,72p' uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.pyRepository: robbrad/UKBinCollectionData
Length of output: 918
🏁 Script executed:
# Check if requests is imported at module level
grep -n "^import requests\|^from requests" uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py | head -20Repository: robbrad/UKBinCollectionData
Length of output: 89
🏁 Script executed:
grep -n "^from\|^import" uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py | head -30Repository: robbrad/UKBinCollectionData
Length of output: 380
Silent empty return masks errors; ICS content fetched twice.
Two issues in this segment:
-
Lines 66-67: Returning empty
datasilently when the ICS response appears invalid hides failures. Prefer raising an explicit exception to surface problems. -
Line 72:
events(ics_url, ...)re-fetches the URL already retrieved on line 64. Use thestring_contentparameter instead to avoid the redundant request.
Additionally, line 63 unnecessarily re-imports requests when it's already imported at the module level.
Proposed fix
- import requests as req
- ics_resp = req.get(ics_url, timeout=30)
+ ics_resp = requests.get(ics_url, timeout=30)
+ ics_resp.raise_for_status()
ics_text = ics_resp.text
- if "<br" in ics_text or "<html" in ics_text.lower() or "VCALENDAR" not in ics_text:
- return data
+ if "VCALENDAR" not in ics_text:
+ raise ValueError(f"Invalid ICS response for UPRN {resolved_uprn}")
now = datetime.now()
future = now + timedelta(days=60)
- upcoming_events = events(ics_url, start=now, end=future)
+ upcoming_events = events(string_content=ics_text, start=now, end=future)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@uk_bin_collection/uk_bin_collection/councils/DumfriesandGallowayCouncil.py`
around lines 63 - 72, The code silently returns an empty data when the fetched
ICS appears invalid and redundantly re-fetches the same URL; remove the local
"import requests as req" and use the module-level requests client (ics_resp),
call ics_resp.raise_for_status() then validate ics_text and if invalid raise a
descriptive exception (include URL and status) instead of returning data, and
pass the already-fetched calendar to events by calling
events(string_content=ics_text, start=now, end=future) rather than
events(ics_url,...), keeping references to ics_resp, ics_text, events(),
string_content, and data to locate changes.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@uk_bin_collection/tests/input.json`:
- Line 703: The UPRN fixture value in uk_bin_collection/tests/input.json is
incorrect (currently "137005550") and should match the PR’s validated sample
"137034556"; update the "uprn" field in that JSON fixture to "137034556" so the
backward-compatible UPRN path validation uses the intended test value (locate
the "uprn" key in the test input JSON and replace its string value).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2bbd438e-8e37-4faa-9349-86a7bf1935bf
📒 Files selected for processing (1)
uk_bin_collection/tests/input.json
UPRN is no longer required. Searches council LocalGov Drupal page by postcode to resolve UPRN from address dropdown. Backward compatible.
5eb21c8 to
78948ae
Compare
Summary
How it works
GET /waste-collection-schedule/find?postcode=X→ parse<select name="uprn">→ match by house number in option textTesting
137005550✅DG7 1AA+ paonGlenafton✅Summary by CodeRabbit