Skip to content

Conversation

@priyeshkaratha
Copy link
Contributor

@priyeshkaratha priyeshkaratha commented Jan 20, 2026

What changes were proposed in this pull request?

Refactored HealthyPipelineSafeModeRule to make it simpler and more reliable.
Earlier, it depended on events and internal state to track pipelines. Now, it directly checks the current state from the PipelineManager whenever validation is needed.

HealthyPipelineSafeModeRule.java

  • Removed event-based tracking: Eliminated processedPipelineIDs and unProcessedPipelineSet HashSets that tracked pipeline events incrementally
  • Direct validation approach: Modified validate() method to directly query PipelineManager for current open pipelines instead of relying on accumulated event state
  • New health check logic: Added isPipelineHealthy() helper method that validates:
    Pipeline has exactly 3 nodes
    All nodes are in healthy state (NodeStatus.inServiceHealthy())
    All nodes are registered with the node manager
  • Simplified event processing: The process() method now only logs events instead of maintaining state, as validation happens directly during validate() calls
  • Cleaner lifecycle management: Updated cleanup() and initializeRule() methods to remove tracking state management

SafeModeMetrics.java

  • Changed currentHealthyPipelinesCount from counter (MutableCounterLong) to gauge (MutableGaugeLong)
    Replaced incCurrentHealthyPipelinesCount() with setNumCurrentHealthyPipelines(long)
  • This aligns with the new approach where the value is set directly based on current state rather than incremented on events

Test Updates

  • TestHealthyPipelineSafeModeRule.java: Removed assertions that relied on event processing sequence; updated expected log messages
  • TestSCMSafeModeManager.java: Significant updates to handle immediate availability of healthy pipeline counts and added logic to handle edge cases where pipeline thresholds exceed available pipelines

What is the link to the Apache JIRA

HDDS-13590

How was this patch tested?

Tested using existing testcases written for safemode validation rules and Safemode Manager.
Validated using forked CI.

@priyeshkaratha priyeshkaratha marked this pull request as ready for review January 20, 2026 10:17
@ChenSammi ChenSammi self-requested a review January 21, 2026 03:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant