Skip to content

fix(NewarkAndSherwoodDC): Add postcode search to resolve correct UPRN#2056

Open
InertiaUK wants to merge 2 commits into
robbrad:masterfrom
InertiaUK:fix-newark-postcode-lookup
Open

fix(NewarkAndSherwoodDC): Add postcode search to resolve correct UPRN#2056
InertiaUK wants to merge 2 commits into
robbrad:masterfrom
InertiaUK:fix-newark-postcode-lookup

Conversation

@InertiaUK
Copy link
Copy Markdown
Contributor

@InertiaUK InertiaUK commented May 12, 2026

Summary

  • API-resolved UPRNs don't match council's internal IDs — calendar returns empty page
  • Added ASP.NET WebForms postcode search via __EVENTTARGET POST to resolve correct pid from council's own address list
  • Then fetches calendar using the resolved pid. Pure requests, no Selenium

Test

  • Postcode: NG22 9PL
  • UPRN: 200004258529

Summary by CodeRabbit

  • New Features

    • Newark and Sherwood District Council: postcode-based address lookup added as an alternative to UPRN.
    • Improved address selection when multiple results are returned.
    • Collection calendars now restrict entries to the next eight weeks.
  • Bug Fixes

    • More robust parsing and error handling for invalid or malformed date entries.
    • Correctly detects and skips cancelled bin collections.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

Warning

Review limit reached

@InertiaUK, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 2 reviews/hour. Refill in 27 minutes and 1 second.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4e3ad776-0710-4a61-a90f-a158216cbea8

📥 Commits

Reviewing files that changed from the base of the PR and between f890a09 and 28df3f0.

📒 Files selected for processing (1)
  • uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py
📝 Walkthrough

Walkthrough

parse_data now requires postcode (optionally paon/uprn), performs an ASP.NET tokenized postcode search to find an address PID, then fetches the calendar for that PID and defensively parses upcoming collection dates within eight weeks, skipping cancelled or malformed entries.

Changes

Address search and calendar parsing refactor

Layer / File(s) Summary
Dependencies and imports
uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py
Module imports updated to add re, requests, and BeautifulSoup alongside existing datetime imports.
Postcode-based ASP.NET search and address selection
uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py
parse_data reads postcode, paon, and uprn; creates a requests.Session with browser-like headers; GETs the site root to extract ASP.NET tokens; POSTs a search form; extracts collection.aspx?pid=... links; selects the best PID by exact UPRN, then PAON prefix/substring, else first; raises ValueError if none found.
Calendar retrieval and defensive date parsing
uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py
Fetches calendar page for chosen pid; iterates month tables, removes header rows, parses comma-delimited bin rows with try/except date parsing, filters dates to next eight weeks, skips cancelled entries (case-insensitive), and raises ValueError when no collections parsed.

Sequence Diagram

sequenceDiagram
  participant Client
  participant NewarkSite as Newark & Sherwood Site
  participant AddressParser
  Client->>NewarkSite: GET root (extract __VIEWSTATE, __VIEWSTATEGENERATOR, __EVENTVALIDATION)
  NewarkSite-->>Client: HTML with tokens
  Client->>NewarkSite: POST search with postcode and tokens
  NewarkSite-->>Client: HTML with address list (collection.aspx?pid=...)
  AddressParser->>AddressParser: Extract PIDs and link list
  AddressParser->>AddressParser: Select best PID (exact UPRN -> PAON match -> first)
  AddressParser-->>Client: selected PID
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • dp247

Poem

🐰 With postcode in paw I tap the site gate,
Tokens and posts help the addresses mate.
I nibble through months for the eight-week parade,
Skipping cancelled crumbs where the bins stayed. 🗓️🧺

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(NewarkAndSherwoodDC): Add postcode search to resolve correct UPRN' directly and clearly summarizes the main change: adding postcode search functionality to correctly resolve the UPRN for the Newark and Sherwood council. It accurately reflects the primary purpose of the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (8ecf878) to head (28df3f0).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2056   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py (1)

29-29: 💤 Low value

Consider documenting verify=False or suppressing the urllib3 warning.

Disabling SSL verification (verify=False) is used throughout this file. While this may be necessary due to council site certificate issues, it could emit InsecureRequestWarning logs. Consider either:

  1. Adding a comment explaining why verification is disabled, or
  2. Suppressing the warning locally if it clutters logs
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py` at line
29, The request call s.get(..., verify=False) in NewArkAndSherwoodDC.py disables
SSL verification and can trigger urllib3 InsecureRequestWarning; either add a
brief inline comment by the verify=False usage explaining why verification is
disabled (e.g., council site cert issues) or, to suppress noisy warnings
globally for this module, import urllib3 and call
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning) near the top
of the file (ensure import urllib3 is added) so the warning is not emitted while
keeping the verify=False behavior for the s.get call.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py`:
- Around line 33-35: The three statements that extract hidden form tokens
(viewstate, viewstate_gen, event_val) use soup1.find(...)[ "value" ] and will
produce a cryptic TypeError if an input is missing; update the code around the
soup1.find calls (the extraction of __VIEWSTATE, __VIEWSTATEGENERATOR,
__EVENTVALIDATION) to check that each find returned a non-None element and that
the "value" attribute exists, and if any are missing raise a clear, explicit
exception (e.g., ValueError or RuntimeError) with a descriptive message
indicating which token is missing so failures are obvious and debuggable.

---

Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py`:
- Line 29: The request call s.get(..., verify=False) in NewArkAndSherwoodDC.py
disables SSL verification and can trigger urllib3 InsecureRequestWarning; either
add a brief inline comment by the verify=False usage explaining why verification
is disabled (e.g., council site cert issues) or, to suppress noisy warnings
globally for this module, import urllib3 and call
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning) near the top
of the file (ensure import urllib3 is added) so the warning is not emitted while
keeping the verify=False behavior for the s.get call.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9a57e4ee-babe-4218-ba96-59851d3eac63

📥 Commits

Reviewing files that changed from the base of the PR and between 8ecf878 and 5e575ab.

📒 Files selected for processing (1)
  • uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py

Comment thread uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py Outdated
@InertiaUK InertiaUK force-pushed the fix-newark-postcode-lookup branch from 5e575ab to f890a09 Compare May 12, 2026 15:24
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py (1)

29-29: 💤 Low value

SSL certificate verification is disabled.

verify=False disables certificate validation on lines 29, 51, and 100. While this may be necessary due to certificate issues on the council's server, it exposes the connection to potential MITM attacks.

If certificate issues are temporary or fixable, consider enabling verification. Otherwise, document why it's disabled.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py` at line
29, The requests in NewarkAndSherwoodDC.py call s.get(...) with verify=False
(e.g., the call setting r1 = s.get(f"{base}/", verify=False)), which disables
SSL certificate validation; change these calls to enable verification by
removing verify=False or by wiring a configurable flag (e.g., ssl_verify or
REQUESTS_VERIFY) into the session usage so verification can be toggled safely,
and if you must keep verification disabled add a clear comment and rationale
near the session/request creation explaining why and how long it should remain
disabled.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py`:
- Around line 75-96: The selection code currently falls back to address_links[0]
even when a user supplied user_uprn or user_paon but no match was found; change
the logic in the selection block (variables: selected, address_links, user_uprn,
user_paon) so that if user_uprn or user_paon was provided and no matching addr
is found you raise a clear exception (e.g., ValueError or a custom
AddressNotFound/LookupError) with a message that includes the supplied
uprn/paon, instead of silently assigning address_links[0]; only use the first
result as a silent fallback when neither user_uprn nor user_paon was provided.

---

Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py`:
- Line 29: The requests in NewarkAndSherwoodDC.py call s.get(...) with
verify=False (e.g., the call setting r1 = s.get(f"{base}/", verify=False)),
which disables SSL certificate validation; change these calls to enable
verification by removing verify=False or by wiring a configurable flag (e.g.,
ssl_verify or REQUESTS_VERIFY) into the session usage so verification can be
toggled safely, and if you must keep verification disabled add a clear comment
and rationale near the session/request creation explaining why and how long it
should remain disabled.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 80ed9fab-7ab0-464f-9caa-27f101800b63

📥 Commits

Reviewing files that changed from the base of the PR and between 5e575ab and f890a09.

📒 Files selected for processing (1)
  • uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py

Comment thread uk_bin_collection/uk_bin_collection/councils/NewarkAndSherwoodDC.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant