feat: add postcode lookup for vale of glamorgan council#2066
Conversation
Users can provide either UPRN or postcode + house number. UPRN takes priority when provided (backward compatible). Uses iShareLIVE LocationSearch endpoint for postcode-to-address resolution, then matches by house number/name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Warning Review limit reached
Your plan currently allows 2 reviews/hour. Refill in 27 minutes and 2 seconds. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more review capacity refills, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe Vale of Glamorgan Council scraper is updated to accept postcode and house_number as primary inputs. When UPRN is missing, a LocationSearch API call resolves the address, and a _match_address helper selects the correct UPRN. Collection dates are built from LocalInfo JSONP data and calendar entries with robust error handling for invalid dates. ChangesAddress resolution and collection data flow
🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2066 +/- ##
=======================================
Coverage 86.67% 86.67%
=======================================
Files 9 9
Lines 1141 1141
=======================================
Hits 989 989
Misses 152 152 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/ValeofGlamorganCouncil.py (1)
93-118: ⚡ Quick winHoist
table_headersout of the inner loop and narrow the exception.
table_headers = table.find("tr").find_all("th")is recomputed for every day cell of every row even though the headers are fixed for the table — move it out of the loop. While there, theexcept Exception: continueswallows every error silently (Ruff BLE001/S112); narrow it toValueError(the realistic failure forint(day)/datetime(...)) so genuine format changes still surface. Also note thattableis obtained via chained.find(...).find("tbody")and will raiseAttributeErrorif the council restructures the page — worth at least guarding with a clear error rather than letting it bubble as an opaque traceback.♻️ Proposed refactor
- table = soup.find("table", {"class": "TableStyle_Activities"}).find("tbody") + table_el = soup.find("table", {"class": "TableStyle_Activities"}) + if table_el is None: + raise ValueError("Could not find collections table on schedule page") + table = table_el.find("tbody") + header_row = table.find("tr") + table_headers = header_row.find_all("th") if header_row else [] + bin_type = ( + table_headers[1].text.strip().replace(" collection date", "") + if len(table_headers) > 1 else "" + ) for tr in soup.find_all("tr")[1:]: row = tr.find_all("td") month_and_year = row[0].text.split() ... for day in remove_alpha_characters(row[1].text.strip()).split(): try: bin_date = datetime(collection_year, collection_month, int(day)) - table_headers = table.find("tr").find_all("th") - collections.append( - ( - table_headers[1] - .text.strip() - .replace(" collection date", ""), - bin_date, - ) - ) - except Exception: + collections.append((bin_type, bin_date)) + except ValueError: continueBased on learnings: prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors when parsing council bin data.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/ValeofGlamorganCouncil.py` around lines 93 - 118, Hoist the table header lookup out of the inner day loop and tighten error handling: first guard the table retrieval (the result of soup.find("table", {"class": "TableStyle_Activities"}).find("tbody")) and raise a clear exception if it's missing instead of letting an AttributeError bubble; then compute table_headers = table.find("tr").find_all("th") once before iterating rows/days; replace the broad except Exception: continue with except ValueError: continue to only swallow parsing errors from int(day) / datetime(...) so other unexpected issues surface; keep references to table, table_headers, remove_alpha_characters, collections, and datetime to locate the affected code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/ValeofGlamorganCouncil.py`:
- Around line 76-87: The two outbound HTTP calls in ValeofGlamorganCouncil (the
requests.get call that fetches BASE_URL/LocalInfo and the requests.get that
fetches schedule_url) are missing timeouts; update both calls to include
timeout=30 (matching the earlier LocationSearch call) while preserving existing
params/headers and verify=False on the schedule_page fetch so the requests use a
bounded timeout and cannot hang indefinitely.
---
Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/ValeofGlamorganCouncil.py`:
- Around line 93-118: Hoist the table header lookup out of the inner day loop
and tighten error handling: first guard the table retrieval (the result of
soup.find("table", {"class": "TableStyle_Activities"}).find("tbody")) and raise
a clear exception if it's missing instead of letting an AttributeError bubble;
then compute table_headers = table.find("tr").find_all("th") once before
iterating rows/days; replace the broad except Exception: continue with except
ValueError: continue to only swallow parsing errors from int(day) /
datetime(...) so other unexpected issues surface; keep references to table,
table_headers, remove_alpha_characters, collections, and datetime to locate the
affected code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ee0b755e-96dd-47f1-a051-5a706ed65494
📒 Files selected for processing (2)
uk_bin_collection/tests/input.jsonuk_bin_collection/uk_bin_collection/councils/ValeofGlamorganCouncil.py
Summary
Testing
64029020✅CF63 4AE+ paon3✅Summary by CodeRabbit