fix: CalderdaleCouncil - rewrite as pure requests#2058
Conversation
|
Warning Review limit reached
Your plan currently allows 2 reviews/hour. Refill in 24 minutes and 46 seconds. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more review capacity refills, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughCalderdaleCouncil's collection parser was migrated from Selenium web driver automation to HTTP requests with BeautifulSoup parsing. The implementation validates address lookups via two sequential POST requests, extracts bin collection dates from HTML tables, and sorts results by collection date. Configuration entries were updated to reflect the removal of web driver requirements and related metadata changes. ChangesCalderdale Council Parser Migration: Selenium to HTTP Requests
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py (2)
1-1: ⚡ Quick winRemove unused import.
The
remodule is imported but never used in this file.♻️ Proposed cleanup
-import re from datetime import datetime🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py` at line 1, Remove the unused top-level import of the re module in CalderdaleCouncil.py: delete the line importing "re" (the unused symbol `re`) so the file no longer contains an unnecessary import.
98-99: ⚡ Quick winConsider explicit error handling for date parsing failures.
The
ValueErrorexception is caught and silently ignored when date parsing fails. This could hide format changes from the council website or data quality issues. Based on learnings, explicit failures help detect format changes early.Consider logging the error or raising an exception with context about which row/bin type failed to parse.
♻️ Proposed improvement
Option 1: Log and continue (if some rows may legitimately lack dates):
try: parsed = datetime.strptime(date_text, "%A %d %B %Y") data["bins"].append( { "type": bin_type, "collectionDate": parsed.strftime(date_format), } ) except ValueError: + # Log unexpected date format for monitoring + print(f"Warning: Could not parse date '{date_text}' for bin type '{bin_type}'") continueOption 2: Raise with context (if all rows should have valid dates):
try: parsed = datetime.strptime(date_text, "%A %d %B %Y") data["bins"].append( { "type": bin_type, "collectionDate": parsed.strftime(date_format), } ) except ValueError: - continue + raise ValueError( + f"Failed to parse date '{date_text}' for bin type '{bin_type}'. " + f"Expected format: 'Day DD Month YYYY' (e.g., 'Monday 12 May 2026')" + )Based on learnings: Functions handling parsing should validate formats explicitly to detect unexpected changes early.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py` around lines 98 - 99, The silent except ValueError: continue in CalderdaleCouncil.py hides date parsing failures; update that block (inside the CalderdaleCouncil class/method that parses row dates) to either log the error with context (include the offending date string, row data and bin type) and then continue, or raise a new ValueError with a descriptive message so failures surface; use the module logger or the class logger (e.g., self.logger or logging.getLogger(__name__)) and include the original exception details when logging/raising to aid debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@uk_bin_collection/tests/input.json`:
- Line 881: The "wiki_note" value contains a Unicode replacement character (�);
update the string for the "wiki_note" field to replace the U+FFFD with the
intended character (e.g., an em dash — or a regular hyphen -) so the text reads
correctly ("...Replace the XXXXXXXXXX with the UPRN of your property — you can
use..."); ensure the file is saved in UTF-8 encoding after making the
replacement so the character is preserved.
---
Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py`:
- Line 1: Remove the unused top-level import of the re module in
CalderdaleCouncil.py: delete the line importing "re" (the unused symbol `re`) so
the file no longer contains an unnecessary import.
- Around line 98-99: The silent except ValueError: continue in
CalderdaleCouncil.py hides date parsing failures; update that block (inside the
CalderdaleCouncil class/method that parses row dates) to either log the error
with context (include the offending date string, row data and bin type) and then
continue, or raise a new ValueError with a descriptive message so failures
surface; use the module logger or the class logger (e.g., self.logger or
logging.getLogger(__name__)) and include the original exception details when
logging/raising to aid debugging.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f3adcab2-453e-4077-8b0f-82ae464e3798
📒 Files selected for processing (2)
uk_bin_collection/tests/input.jsonuk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py
| "wiki_command_url_override": "https://environmentfirst.co.uk/house.php?uprn=XXXXXXXXXX", | ||
| "wiki_name": "Environment First", | ||
| "wiki_note": "For properties with collections managed by Environment First, such as Lewes and Eastbourne. Replace the XXXXXXXXXX with the UPRN of your property\u2014you can use [FindMyAddress](https://www.findmyaddress.co.uk/search) to find this." | ||
| "wiki_note": "For properties with collections managed by Environment First, such as Lewes and Eastbourne. Replace the XXXXXXXXXX with the UPRN of your property�you can use [FindMyAddress](https://www.findmyaddress.co.uk/search) to find this." |
There was a problem hiding this comment.
Fix the encoding corruption in wiki_note.
The wiki_note field contains a replacement character � (U+FFFD), indicating the original character was lost during encoding. This will display incorrectly to users.
🔧 Proposed fix
The character should likely be an em dash (—) or regular dash (-):
- "wiki_note": "For properties with collections managed by Environment First, such as Lewes and Eastbourne. Replace the XXXXXXXXXX with the UPRN of your property�you can use [FindMyAddress](https://www.findmyaddress.co.uk/search) to find this."
+ "wiki_note": "For properties with collections managed by Environment First, such as Lewes and Eastbourne. Replace the XXXXXXXXXX with the UPRN of your property—you can use [FindMyAddress](https://www.findmyaddress.co.uk/search) to find this."📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "wiki_note": "For properties with collections managed by Environment First, such as Lewes and Eastbourne. Replace the XXXXXXXXXX with the UPRN of your property�you can use [FindMyAddress](https://www.findmyaddress.co.uk/search) to find this." | |
| "wiki_note": "For properties with collections managed by Environment First, such as Lewes and Eastbourne. Replace the XXXXXXXXXX with the UPRN of your property—you can use [FindMyAddress](https://www.findmyaddress.co.uk/search) to find this." |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@uk_bin_collection/tests/input.json` at line 881, The "wiki_note" value
contains a Unicode replacement character (�); update the string for the
"wiki_note" field to replace the U+FFFD with the intended character (e.g., an em
dash — or a regular hyphen -) so the text reads correctly ("...Replace the
XXXXXXXXXX with the UPRN of your property — you can use..."); ensure the file is
saved in UTF-8 encoding after making the replacement so the character is
preserved.
27f3b85 to
54024c8
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py (2)
1-1: 💤 Low valueRemove unused
reimport.The
remodule is imported but never used in the code.🧹 Proposed fix
-import re from datetime import datetime🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py` at line 1, The module imports the unused re symbol which should be removed to avoid linter warnings; edit CalderdaleCouncil.py and delete the top-level "import re" statement (or remove re from the import list if combined with others) so only actually used modules remain, then run tests/linter to confirm no remaining references to re in the file.
90-99: ⚡ Quick winConsider raising on unexpected date format instead of silent continue.
If
strptimefails, the code silently continues, which could mask upstream format changes from the council. If the date format changes, this scraper would return partial or empty data without any indication of why.Based on learnings: "prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors. This ensures format changes are detected early."
🔍 Proposed fix to fail explicitly
try: parsed = datetime.strptime(date_text, "%A %d %B %Y") data["bins"].append( { "type": bin_type, "collectionDate": parsed.strftime(date_format), } ) except ValueError: - continue + raise ValueError( + f"Unable to parse date '{date_text}' for bin type '{bin_type}'" + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py` around lines 90 - 99, The current try/except in CalderdaleCouncil that parses date_text (where parsed = datetime.strptime(date_text, "%A %d %B %Y") and appends to data["bins"]) silently continues on ValueError; replace the silent continue with an explicit raise (or re-raise) including contextual info (date_text, bin_type and the expected format) so failures surface (e.g., raise ValueError with a message like "Unexpected date format for {date_text} when parsing bins of type {bin_type}; expected '%A %d %B %Y'") so upstream callers detect format changes early.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py`:
- Around line 28-36: The two HTTP POST calls using session.post (the one that
sets data with "postcode" and the other between lines 53-64) do not include a
timeout, so the scraper can hang; update both session.post invocations to pass a
timeout argument (e.g., timeout=10 or a module-level constant like
DEFAULT_TIMEOUT) so requests will fail fast, and keep the existing
resp.raise_for_status() logic unchanged to handle HTTP errors.
---
Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py`:
- Line 1: The module imports the unused re symbol which should be removed to
avoid linter warnings; edit CalderdaleCouncil.py and delete the top-level
"import re" statement (or remove re from the import list if combined with
others) so only actually used modules remain, then run tests/linter to confirm
no remaining references to re in the file.
- Around line 90-99: The current try/except in CalderdaleCouncil that parses
date_text (where parsed = datetime.strptime(date_text, "%A %d %B %Y") and
appends to data["bins"]) silently continues on ValueError; replace the silent
continue with an explicit raise (or re-raise) including contextual info
(date_text, bin_type and the expected format) so failures surface (e.g., raise
ValueError with a message like "Unexpected date format for {date_text} when
parsing bins of type {bin_type}; expected '%A %d %B %Y'") so upstream callers
detect format changes early.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8a98ec4d-2fdc-4102-8b6c-fec2ee398bad
📒 Files selected for processing (2)
uk_bin_collection/tests/input.jsonuk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py
🚧 Files skipped from review as they are similar to previous changes (1)
- uk_bin_collection/tests/input.json
Summary
.zfill(12)) since the council uses 12-digit padded UPRNsweb_driverfrom input.json as Selenium is no longer neededTesting
Summary by CodeRabbit
Refactor
Chores