fix: improve database health check error reporting and correctness [Backport release/0.5.z] #2417

Merged
ctron merged 2 commits into release/0.5.z from backport-2416-to-release/0.5.z
Jun 29, 2026

trustify-ci-bot Bot commented Jun 26, 2026

Description

Backport of #2416 to release/0.5.z.

Summary by Sourcery

Improve reliability and observability of local health checks and keep health monitoring active when the infrastructure endpoint is disabled.

Bug Fixes:

  • Ensure the database health check reports failure if the check future terminates unexpectedly instead of silently succeeding.
  • Prevent health checks from being dropped when the infrastructure endpoint is disabled by keeping the health subsystem alive.
  • Report the database as unhealthy when the check times out or the ping itself fails, instead of treating any result that arrives before the timeout as healthy.

sourcery-ai Bot commented Jun 26, 2026

Contributor

Reviewer's Guide

Backport that improves health check error propagation, logging, and lifetime management so database health checks remain accurate and retained even when the infrastructure endpoint is disabled.

File-Level Changes

Change: Preserve and propagate health check error messages across threads and improve logging when a health check future terminates.
Details:
  • Convert the incoming error into a Cow<'static, str> once and reuse it instead of re-converting later
  • Clone the error into the spawned thread so it can be logged when the health check future returns
  • Change the log message when the check future ends to a warning that includes the error message
  • Store the pre-converted error directly in the Local struct instead of calling into() again
Files: common/infrastructure/src/health/checks/local.rs

Change: Keep health checks alive even when the infrastructure endpoint is disabled.
Details:
  • When infrastructure is disabled, keep a reference to self.health inside an infinite sleep loop to prevent registered health checks from being dropped while the server runs
Files: common/infrastructure/src/infra.rs

Change: Make the database health check report a failed health state, instead of a success, when the ping fails or times out.
Details:
  • Replace is_ok() on the check result with unwrap_or(false) so that a timeout or a failed ping yields a failing status instead of passing
Files: server/src/profile/mod.rs
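
For readers without the diff at hand, the first change above (convert once, clone into the thread, store the converted value) might look roughly like the sketch below. It uses only the standard library and is not the code in local.rs: `spawn_local`, the shape of `Local`, and the message text are assumptions, and a plain thread that returns a string stands in for the real warning log.

```rust
use std::borrow::Cow;
use std::thread;

/// Hypothetical, simplified `Local` check that only stores its error message.
pub struct Local {
    pub error: Cow<'static, str>,
}

/// Hypothetical helper: converts `error` once, clones it for the worker
/// thread, and stores the already converted value in `Local`.
pub fn spawn_local<E, F>(error: E, check: F) -> (Local, thread::JoinHandle<String>)
where
    E: Into<Cow<'static, str>>,
    F: FnOnce() + Send + 'static,
{
    // Convert exactly once; every later use works with this value.
    let error: Cow<'static, str> = error.into();
    // The clone moves into the thread so the message is available
    // when the check ends.
    let thread_error = error.clone();

    let handle = thread::spawn(move || {
        check();
        // Stand-in for a warn-level log line that includes the message.
        format!("health check ended: {}", thread_error)
    });

    // No second `into()` call: the stored value is the converted one.
    (Local { error }, handle)
}
```

Cloning a `Cow<'static, str>` that borrows a string literal copies only the reference, so the extra clone costs nothing for static messages.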

sourcery-ai Bot left a comment

Hey - I've left some high-level feedback:

  • In spawn_db_check, now that unwrap_or(false) maps any error to a failed check, consider logging the underlying error before returning false so that failures during check registration/setup are observable.
  • In Infrastructure::run when the endpoint is disabled, the infinite loop using sleep(Duration::from_secs(3600)) would be clearer and easier to tune if the sleep interval were extracted into a named constant.
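
Both suggestions are small enough to sketch. The names below (`KEEP_ALIVE_INTERVAL`, `check_or_log`) are hypothetical, and `eprintln!` stands in for whatever logging macro the project uses; this only illustrates the shape of the change, not the actual code.

```rust
use std::fmt::Display;
use std::time::Duration;

/// Hypothetical named constant replacing the inline
/// `Duration::from_secs(3600)` in the keep-alive loop.
pub const KEEP_ALIVE_INTERVAL: Duration = Duration::from_secs(3600);

/// Hypothetical replacement for a bare `unwrap_or(false)`: same result,
/// but the underlying error is reported before the check is marked failed.
pub fn check_or_log<E: Display>(result: Result<bool, E>) -> bool {
    match result {
        Ok(healthy) => healthy,
        Err(err) => {
            // Stand-in for a warn-level log call.
            eprintln!("database health check failed: {}", err);
            false
        }
    }
}
```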

ctron and others added 2 commits June 29, 2026 08:19
Log at warn level when the health check future exits so operators can
detect a degraded state. Also fix the timeout result handling: the outer
.is_ok() discarded the inner ping result, causing a fast database
failure to be reported as healthy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 4d4a6e7)
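
The timeout handling described in this commit message can be illustrated with a small standard-library sketch. It assumes the check has the shape `Result<bool, Elapsed>`, as returned by a timeout wrapped around a ping that yields a `bool`; `Elapsed` here is a local stand-in type, and both function names are hypothetical.

```rust
/// Local stand-in for a timeout error such as `tokio::time::error::Elapsed`.
#[derive(Debug)]
pub struct Elapsed;

/// Old handling: only asks whether the ping finished in time
/// and ignores what the ping actually reported.
pub fn healthy_before(ping: Result<bool, Elapsed>) -> bool {
    ping.is_ok()
}

/// New handling: uses the ping's own answer and treats a timeout as unhealthy.
pub fn healthy_after(ping: Result<bool, Elapsed>) -> bool {
    ping.unwrap_or(false)
}
```

With `Ok(false)` (the database answered quickly, but with a failure), `healthy_before` returns `true` while `healthy_after` returns `false`, which is the "fast database failure reported as healthy" case from the message.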

With Rust edition 2021+ disjoint capture, the async move block in the
disabled path did not capture self.health, causing the Arc<HealthChecks>
to be dropped immediately. This dropped all registered health checks
(including their Shutdown handles), killing the health check loops while
the server was still running.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit e8784e5)
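
The capture behaviour behind this fix can be reproduced without an async runtime. The sketch below uses `move` closures as a stand-in for the async move block and a drop flag as a stand-in for the real health checks; the struct layout and method names are assumptions made for illustration only.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the registered checks; sets a flag when dropped.
pub struct HealthChecks {
    dropped: Arc<AtomicBool>,
}

impl Drop for HealthChecks {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical, reduced `Infrastructure` with only the fields needed here.
pub struct Infrastructure {
    pub name: String,
    pub health: Arc<HealthChecks>,
}

impl Infrastructure {
    /// Returns the value plus a flag that reports whether `health` was dropped.
    pub fn new(name: &str) -> (Self, Arc<AtomicBool>) {
        let dropped = Arc::new(AtomicBool::new(false));
        let health = Arc::new(HealthChecks { dropped: Arc::clone(&dropped) });
        (Self { name: name.to_string(), health }, dropped)
    }

    /// Problematic shape: the body only mentions `self.name`. Under edition
    /// 2021 and later the closure captures just that field, so `self.health`
    /// is dropped as soon as this method returns.
    pub fn run_without_health(self) -> impl FnOnce() -> usize {
        move || self.name.len()
    }

    /// Fixed shape: mentioning `self.health` inside the body makes the
    /// closure capture it, so the checks live as long as the closure does.
    pub fn run_keeping_health(self) -> impl FnOnce() -> usize {
        move || {
            let _keep_alive = &self.health;
            self.name.len()
        }
    }
}
```

A named binding is used on purpose: under edition 2021 a wildcard pattern such as `let _ = self.health;` does not capture the field, so it would not keep the checks alive.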
ctron force-pushed the backport-2416-to-release/0.5.z branch from a4f9d87 to 196f8fc June 29, 2026 06:19
ctron enabled auto-merge June 29, 2026 06:19
ctron added this pull request to the merge queue Jun 29, 2026
Merged via the queue into release/0.5.z with commit f6858ca Jun 29, 2026
5 checks passed
ctron deleted the backport-2416-to-release/0.5.z branch June 29, 2026 11:08
github-project-automation Bot moved this to Done in Trustify Jun 29, 2026