feat: add inverclyde council scraper #2094
Parses council's street-sorted PDF (pdfplumber) and computes fortnightly collection dates. Handles house number ranges and split recycling/residual days. Matches street name from address or Nominatim geocode fallback. Pure HTTP - no Selenium needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
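The fortnightly date calculation mentioned above can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the function name `fortnightly_dates` and the idea of a known "anchor" collection date are assumptions made for the example.

```python
from datetime import date, timedelta
from typing import List


def fortnightly_dates(anchor: date, start: date, count: int = 3) -> List[date]:
    """Return the next `count` dates on a 14-day cycle from `anchor`, on or after `start`.

    `anchor` is any date known to be a collection day (hypothetical input).
    """
    # Position of `start` within the 14-day cycle that begins at `anchor`.
    offset = (start - anchor).days % 14
    # If `start` is itself a collection day, offset is 0 and we begin there.
    first = start + timedelta(days=(14 - offset) % 14)
    return [first + timedelta(days=14 * i) for i in range(count)]
```

Working from a single anchor date keeps the arithmetic independent of ISO week numbers, which can be awkward around year boundaries.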
📝 Walkthrough
Adds a complete bin collection scraper for Inverclyde Council. The scraper downloads a static PDF of street uplift days, parses street/town/detail rows, resolves postcodes to streets via Nominatim with fallbacks, matches rows by street-name similarity and optional house-number logic, computes upcoming collection dates using a fortnightly recycling calendar, and returns sorted bin entries with type variants and per-address overrides.
Changes
Inverclyde Council Scraper Implementation
Sequence Diagram
sequenceDiagram
participant ParseData as parse_data()
participant PDF as parse_pdf_data()
participant Nominatim as Nominatim API
participant RowMatch as find_best_matching_row()
participant DateGen as next_dates_for_bin_types()
participant Output as bin entries dict
ParseData->>PDF: download and parse PDF
PDF-->>ParseData: rows with street/town/detail
ParseData->>Nominatim: resolve postcode to street
alt Nominatim success
Nominatim-->>ParseData: street name
else Nominatim fallback
ParseData->>Nominatim: resolve paon+postcode
Nominatim-->>ParseData: street name or use paon as fallback
end
ParseData->>RowMatch: find_best_matching_row(street, paon, rows)
RowMatch-->>ParseData: matched row with detail/day/calendar
ParseData->>DateGen: compute bin dates for week pattern
DateGen-->>ParseData: initial bin entry list
ParseData->>ParseData: apply per-address recycling overrides from Detail
ParseData->>ParseData: filter to future dates and sort
ParseData-->>Output: return sorted bins dict
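The row-matching step in the diagram (street-name similarity plus optional house-number logic) might look something like the sketch below. It is an illustration only: `best_row`, `in_range`, `normalise`, the row keys (`street`, `detail`, `day`), and the 0.8 threshold are all hypothetical and are not taken from `find_best_matching_row()` in the PR.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def normalise(name: str) -> str:
    """Lower-case and collapse punctuation/whitespace so names compare cleanly."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def in_range(house_number: int, detail: str) -> bool:
    """True if `detail` has no numeric range, or one of its ranges contains the number."""
    ranges = re.findall(r"(\d+)\s*-\s*(\d+)", detail)
    if not ranges:
        return True
    return any(int(lo) <= house_number <= int(hi) for lo, hi in ranges)


def best_row(
    street: str,
    house_number: Optional[int],
    rows: List[Dict[str, str]],
    threshold: float = 0.8,  # assumed cut-off, chosen for illustration
) -> Optional[Dict[str, str]]:
    """Pick the row whose street name is most similar, honouring house-number ranges."""
    target = normalise(street)
    candidates = []
    for row in rows:
        # Skip rows whose number range excludes this house.
        if house_number is not None and not in_range(house_number, row["detail"]):
            continue
        score = SequenceMatcher(None, target, normalise(row["street"])).ratio()
        if score >= threshold:
            candidates.append((score, row))
    if not candidates:
        return None
    # max() returns the first best-scoring row when scores tie.
    return max(candidates, key=lambda pair: pair[0])[1]
```

Filtering on the house-number range before scoring means two rows for the same street with different ranges resolve to the one that actually covers the address.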
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2094   +/-   ##
=======================================
  Coverage   86.67%   86.67%
=======================================
  Files           9        9
  Lines        1141     1141
=======================================
  Hits          989      989
  Misses        152      152
☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/InverclydeCouncil.py (1)
91-96: 💤 Low value
Consider capturing `datetime.now()` once to avoid subtle midnight-boundary inconsistency.
Lines 91 and 94 call `datetime.now()` separately. If execution spans midnight, `today` could be the previous date while `hour` reads from the new day, potentially producing off-by-one behavior. The downstream `>= today` filter in `parse_data` mitigates this, but capturing the timestamp once is cleaner.
♻️ Suggested fix
    + now = datetime.now()
    + today = now.date()
    - today = datetime.now().date()
      # Find the next occurrence of this weekday (or today if it matches)
      days_ahead = (day_idx - today.weekday()) % 7
    - if days_ahead == 0 and datetime.now().hour >= 19:
    + if days_ahead == 0 and now.hour >= 19:
          days_ahead = 7
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@uk_bin_collection/uk_bin_collection/councils/InverclydeCouncil.py` around lines 91 - 96, The code computes today and then re-calls datetime.now() for the hour check, risking a midnight boundary bug; capture the current timestamp once (e.g., assign now = datetime.now()), use now.date() for today and now.hour for the 19-hour comparison, and then compute days_ahead and next_day using those single timestamp-derived values (affecting the variables today, days_ahead, and next_day).
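As a self-contained illustration of the single-timestamp pattern suggested in this comment, the logic could be wrapped in a function that accepts `now` as a parameter. The function name `next_weekday_date` and the injectable `now` argument are assumptions for the sketch; only `day_idx`, `today`, `days_ahead`, and the 19:00 cut-off come from the review text.

```python
from datetime import date, datetime, timedelta
from typing import Optional


def next_weekday_date(
    day_idx: int,
    now: Optional[datetime] = None,
    cutoff_hour: int = 19,
) -> date:
    """Return the next date falling on weekday `day_idx` (Monday = 0).

    The timestamp is read once, so `today` and the hour check always
    refer to the same instant even if the call straddles midnight.
    """
    now = now or datetime.now()
    today = now.date()
    days_ahead = (day_idx - today.weekday()) % 7
    # After the cut-off on collection day, roll forward a full week.
    if days_ahead == 0 and now.hour >= cutoff_hour:
        days_ahead = 7
    return today + timedelta(days=days_ahead)
```

Passing `now` explicitly also makes the boundary cases (18:59 versus 19:00 on collection day) easy to exercise in a unit test without patching the clock.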
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@uk_bin_collection/tests/input.json`:
- Line 882: The string value for the "wiki_note" key contains a mojibake
character in "property�you"; update the "wiki_note" value to replace the invalid
character with a normal separator (for example "property; you" or "property -
you") so the note reads correctly for users, ensuring you only change the
separator and keep the rest of the text (including the UPRN placeholder and URL)
intact.
---
Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/InverclydeCouncil.py`:
- Around line 91-96: The code computes today and then re-calls datetime.now()
for the hour check, risking a midnight boundary bug; capture the current
timestamp once (e.g., assign now = datetime.now()), use now.date() for today and
now.hour for the 19-hour comparison, and then compute days_ahead and next_day
using those single timestamp-derived values (affecting the variables today,
days_ahead, and next_day).
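For the inline comment about the mojibake character in `wiki_note`, a quick check along these lines could flag any string in a parsed JSON document that contains the Unicode replacement character (U+FFFD). This is a hedged sketch: `find_replacement_chars` is a hypothetical helper, not part of the repository, and the sample data in the usage note is invented.

```python
import json
from typing import Any, Iterator


def find_replacement_chars(value: Any, path: str = "$") -> Iterator[str]:
    """Yield JSON-style paths of string values that contain U+FFFD."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from find_replacement_chars(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from find_replacement_chars(item, f"{path}[{index}]")
    elif isinstance(value, str) and "\ufffd" in value:
        yield path
```

With an invented sample such as `json.loads('{"A": {"wiki_note": "property\\ufffdyou"}}')`, the generator yields `$.A.wiki_note`, which points straight at the value that needs a normal separator.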
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 70d899f1-98e4-49fc-a082-ec9e02e6578a
📒 Files selected for processing (2)
uk_bin_collection/tests/input.json
uk_bin_collection/uk_bin_collection/councils/InverclydeCouncil.py