
Conversation

@dik654 dik654 commented Dec 17, 2025

Closes #8698

Motivation

Infrastructure operators want to easily see which validator indices are connected to their beacon node at a glance.

Description

  • Add GET /eth/v1/debug/monitored_validators API endpoint that returns an array of validator indices currently being monitored
  • Add info-level logs when validators register/unregister from the monitor, including the full list of monitored indices:
    • Validator registered to monitor index=X, total=Y, indices=0,1,2,...
    • Validator removed from monitor index=X, total=Y, indices=0,1,3,...

Usage

API endpoint:

curl http://localhost:9596/eth/v1/debug/monitored_validators
# Response: {"data":[0,1,2,3,4,5,6,7]}

For dashboard integration (e.g., Grafana), you can use the JSON API datasource plugin to poll this endpoint and display the validator indices.
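As a sketch, a dashboard-side poller could fetch and validate the response like this. The endpoint path and the `{"data":[...]}` shape come from this PR; the helper name `parseMonitoredValidators` and the validation logic are illustrative, not part of the PR:

```typescript
// Response shape as shown in this PR's usage example.
interface MonitoredValidatorsResponse {
  data: number[];
}

// Hypothetical helper: parse and sanity-check the endpoint's JSON body.
function parseMonitoredValidators(body: string): number[] {
  const parsed = JSON.parse(body) as MonitoredValidatorsResponse;
  if (!Array.isArray(parsed.data) || !parsed.data.every((i) => Number.isInteger(i) && i >= 0)) {
    throw new Error("unexpected response shape");
  }
  return parsed.data;
}

// Usage against a running beacon node (default REST port 9596 assumed):
// const res = await fetch("http://localhost:9596/eth/v1/debug/monitored_validators");
// const indices = parseMonitoredValidators(await res.text());
```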

Design decisions

  • Used debug namespace instead of lodestar because it's enabled by default (no need for --rest.namespace all)
  • Used API + logs approach instead of metrics to avoid cardinality issues with validator index labels
  • Logs include full indices list so operators can see the complete state at each change without calling the API
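Conceptually, the endpoint only needs to read the keys of the monitor's validator map. A minimal sketch — not the PR's actual handler, and assuming a `Map` keyed by validator index as in the quoted snippets (sorting is our assumption for stable output):

```typescript
// Illustrative only: build the API response body from a validators Map.
function getMonitoredValidatorIndices(validators: Map<number, unknown>): {data: number[]} {
  return {data: Array.from(validators.keys()).sort((a, b) => a - b)};
}
```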


AI Assistance Disclosure

Used Claude Code to assist with implementation and code exploration.

@dik654 dik654 requested a review from a team as a code owner December 17, 2025 02:58
@gemini-code-assist (Contributor)

Summary of Changes

Hello @dik654, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new API endpoint and improved logging functionality to enhance the visibility for infrastructure operators regarding the validator indices actively monitored by their beacon node. The changes aim to provide an immediate overview of connected validators, utilizing a debug API namespace and logging to avoid potential cardinality issues associated with metrics for this type of data.

Highlights

  • New API Endpoint: A new GET /eth/v1/debug/monitored_validators API endpoint has been added. This endpoint allows infrastructure operators to retrieve an array of validator indices currently being monitored by their beacon node.
  • Enhanced Logging for Validator Monitoring: Info-level logs are now generated when validators register or unregister from the monitor. These logs include the specific validator index, the total count of monitored validators, and a comma-separated list of all currently monitored indices, providing a real-time snapshot of the monitoring state.

@gemini-code-assist gemini-code-assist bot left a comment
Code Review

This pull request introduces a new API endpoint to list monitored validator indices and adds info-level logs when validators are registered or removed from the monitor. The changes are well-implemented and the new API endpoint is a useful addition for operators. My main feedback concerns the new logging feature. While logging the full list of indices is a deliberate design choice to provide state visibility, it could lead to performance degradation and excessively large log entries when a large number of validators are being monitored. I've provided suggestions to truncate the list of indices in the log messages to prevent potential operational issues with log ingestion and storage, while still retaining the benefit of seeing a snapshot of the monitored validators.

Comment on lines 318 to 323
logger.info("Validator registered to monitor", {
  index,
  total: validators.size,
  indices: Array.from(validators.keys()).join(","),
});
Severity: medium

Logging all monitored validator indices on every registration can lead to performance issues and very large log entries, especially when monitoring a large number of validators. This can be problematic for log ingestion and analysis systems, and it might consume significant disk space if registrations are frequent.

Given that the total count is already logged and there's a new API endpoint to fetch the full list, consider truncating the list of indices in the log message if it exceeds a certain length. This would provide a balance between providing context in logs and avoiding performance/operational issues.

        const keys = Array.from(validators.keys());
        logger.info("Validator registered to monitor", {
          index,
          total: validators.size,
          indices: keys.length > 100 ? `${keys.slice(0, 100).join(",")},...` : keys.join(","),
        });
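The suggested truncation could also be factored into a small helper so both log sites share it. This is a hypothetical sketch: the `formatIndices` name and the 100-entry default cap are illustrative, not code from the PR:

```typescript
// Join validator indices into a comma-separated string, truncating
// to `max` entries so log lines stay bounded at large validator counts.
function formatIndices(keys: number[], max = 100): string {
  return keys.length > max ? `${keys.slice(0, max).join(",")},...` : keys.join(",");
}
```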

Comment on lines 688 to 694
logger.info("Validator removed from monitor", {
  index,
  total: validators.size,
  indices: validators.size > 0 ? Array.from(validators.keys()).join(",") : "",
});
Severity: medium

Similar to the registration logic, logging the full list of validator indices upon removal can cause performance and logging issues when many validators are monitored. It's better to truncate the list to keep log messages manageable. This still provides a snapshot of the monitored validators without overwhelming logging systems.

          const keys = Array.from(validators.keys());
          logger.info("Validator removed from monitor", {
            index,
            total: validators.size,
            indices: keys.length > 100 ? `${keys.slice(0, 100).join(",")},...` : keys.join(","),
          });

@dik654 dik654 force-pushed the feat/monitored-validator-indices branch from 3eee5ae to db3f26b on December 17, 2025 03:04
validators.getOrDefault(index).lastRegisteredTimeMs = Date.now();
if (isNewValidator) {
  const keys = Array.from(validators.keys());
  logger.info("Validator registered to monitor", {
@nflaig nflaig (Member) commented Dec 19, 2025

this will be way too verbose for nodes that have a lot of validators connected; if we want to print something like this in the logs, it should be done each epoch (or even less often)

@dik654 dik654 (Author)

To clarify the conflicting feedback:
@matthewkeil suggested logging on startup or when validator indices change
@nflaig raised a concern about verbosity for nodes with many validators

I’ll go with epoch-based logging as @nflaig suggested — this keeps logs clean while still providing visibility.

Let me know if there are any objections.

@nflaig nflaig (Member) commented Dec 20, 2025

I would defer to @chiemerieezechukwu on the preferred method to consume this. I think a metric would be nice too, but we'd need to think about how we can update it more frequently, as 1h still seems too long as per #8702 (comment)

@dik654 dik654 (Author) commented Dec 30, 2025

Hi, sorry for the late response. I needed some time to fully understand the context and think through the options.

After carefully reviewing the discussion again, it seems the core issue is that validator_monitor_validators metric doesn't update quickly when validators disconnect:

"Doesn't seem that it is updated regularly though. When I move validators away from beacons, this number doesn't decrease it seems" - @chiemerieezechukwu

If I understand correctly, this is due to RETAIN_REGISTERED_VALIDATORS_MS being set to 1 hour.

@nflaig I was wondering if either of these approaches would be acceptable?

  1. Reduce RETAIN_REGISTERED_VALIDATORS_MS from 1 hour to 2-3 epochs (~13-20 min) - I saw you reduced it from 12h to 1h in #7668 ("chore: reduce time to retain registered validators in monitor to 1 hour"), so I wanted to check if further reduction might cause any issues. This would make the existing metric update faster.

  2. Add epoch-based logging with active/inactive count based on lastRegisteredTimeMs - Keep the existing 1-hour prune logic as is, but log the count of validators registered within the last 2-3 epochs each epoch. This provides faster feedback via logs without changing the existing metric behavior. (Though this probably isn't what @chiemerieezechukwu wants since they mentioned Grafana dashboard)

Once we settle on that, I'd be happy to follow up with @chiemerieezechukwu about the "nice to have" feature for showing which validators (indices or pubkey) are connected.☺️
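Option 2 above could look roughly like the following. This is a hypothetical sketch, not code from the PR: the `countRecentlyActive` name is ours, the `Map` structure with `lastRegisteredTimeMs` is assumed from the quoted snippets, and the slot/epoch constants are the mainnet values:

```typescript
// Mainnet timing constants (assumed here; Lodestar derives these from chain config).
const SECONDS_PER_SLOT = 12;
const SLOTS_PER_EPOCH = 32;
const EPOCH_MS = SECONDS_PER_SLOT * SLOTS_PER_EPOCH * 1000;

// Count validators seen within the last `epochs` epochs, based on
// lastRegisteredTimeMs, without touching the 1-hour prune logic.
function countRecentlyActive(
  validators: Map<number, {lastRegisteredTimeMs: number}>,
  nowMs: number,
  epochs = 3
): {active: number; inactive: number} {
  let active = 0;
  for (const v of validators.values()) {
    if (nowMs - v.lastRegisteredTimeMs <= epochs * EPOCH_MS) active++;
  }
  return {active, inactive: validators.size - active};
}
```

An epoch-cadence log line could then report both counts without emitting the full index list.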

  },
},
getMonitoredValidatorIndices: {
  url: "/eth/v1/debug/monitored_validators",
Member

@chiemerieezechukwu how do you want to consume this, I would think a metric is easier to work with than exposing an api

Contributor

@chiemerieezechukwu how do you want to consume this, I would think a metric is easier to work with than exposing an api

Mainly on a Grafana dashboard. Would be nice to see at any point the number of validators connected to a beacon. Getting which validators (the indices or pubkey) would be a "nice to have"

Member

Would be nice to see at any point the number of validators connected to a beacon.

you can already see that on the summary dashboard


or validator monitor dashboard


Contributor

Would be nice to see at any point the number of validators connected to a beacon.

you can already see that on the summary dashboard or validator monitor dashboard

Doesn't seem that it is updated regularly though. When I move validators away from beacons, this number doesn't decrease it seems

@chiemerieezechukwu chiemerieezechukwu (Contributor) commented Dec 19, 2025

I think I might have been dealing with a scenario where the response is cached and not updated on the dashboard

@nflaig nflaig (Member) commented Dec 19, 2025

it's because we cache this internally for 1h, and only after that, if we haven't seen a validator, do we stop tracking them

fun fact, this was previously 12h until I reduced it here #7668

if we need faster feedback it might make sense to have a metric tracked somewhere else

* Returns the validator indices that are currently being monitored by the validator monitor.
* These are validators that have registered with this beacon node via the validator API.
*/
getMonitoredValidatorIndices: Endpoint<
Member

this would be a custom route, non-standardized in the spec, and has to be put into the /lodestar namespace

@matthewkeil matthewkeil (Member) commented Jan 6, 2026

Pinging @0xmrree from this PR for visibility



Development

Successfully merging this pull request may close these issues.

Dashboard Update - Display which validator indices are connected to a running node
