@kaisheng-hua kaisheng-hua commented Dec 23, 2025

Pre-submission checklist

[✔ ] - I've run the linters locally and fixed lint errors related to the files I modified in this PR. You can install the linters by running pip install -r requirements-dev.txt && pre-commit install

[✔ ] - pre-commit run

[INFO]clang-format.........................................(no files to check)Skipped
black................................................(no files to check)Skipped
shellcheck...........................................(no files to check)Skipped
shfmt................................................(no files to check)Skipped
trim trailing whitespace.............................(no files to check)Skipped
fix end of files.....................................(no files to check)Skipped
check yaml...........................................(no files to check)Skipped
check json...........................................(no files to check)Skipped
check for merge conflicts............................(no files to check)Skipped
ruff check...........................................(no files to check)Skipped

Summary

When running the AgentNeighborTest/1.AddPendingEntry case on the Ladakh800bcls, the test consistently failed with the following error:

F1005 08:05:37.150961 254013 AgentHwTest.cpp:64] Failed to create initial config: HwSwitchMatcher::switchId api must be called only when there is a single switchId

Root Cause Analysis

The issue was due to the Ladakh800bcls platform supporting multiple NPUs (Network Processing Units).

The following code resolves a scope from a logical port, resulting in a HwSwitchMatcher object that contains the switch IDs associated with that port:

    auto scopeMatcher = ensemble.getSw()
                            ->getScopeResolver()
                            ->scope(ensemble.masterLogicalPortIds()[0]);

On this multi-NPU platform, scopeMatcher contained two switch IDs.

However, the subsequent call to .switchId() attempts to get a single ID from this object. The implementation of the HwSwitchMatcher::switchId() method has a strict check that requires exactly one ID:

    SwitchID HwSwitchMatcher::switchId() const {
      // This check fails if the number of IDs is not equal to 1
      if (switchIds_.size() != 1) {
        throw FbossError(
            "HwSwitchMatcher::switchId api must be called only when there is a single switchId");
      }
      return *switchIds_.begin();
    }

Because the scopeMatcher object contained two IDs, the condition switchIds_.size() != 1 was true, causing the function to throw an FbossError.
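
The failure mode can be reproduced outside FBOSS with a small stand-in. This is only a sketch: MatcherSketch and switchIdThrows are hypothetical names, int64_t stands in for SwitchID, and std::runtime_error stands in for FbossError. It is not the real HwSwitchMatcher.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for HwSwitchMatcher; not the FBOSS class.
class MatcherSketch {
 public:
  explicit MatcherSketch(std::set<int64_t> ids) : switchIds_(std::move(ids)) {}

  // Same precondition as described above: exactly one ID, otherwise throw.
  int64_t switchId() const {
    if (switchIds_.size() != 1) {
      throw std::runtime_error(
          "switchId must be called only when there is a single switchId");
    }
    return *switchIds_.begin();
  }

  std::size_t size() const { return switchIds_.size(); }

 private:
  std::set<int64_t> switchIds_;
};

// Returns true when switchId() rejects the matcher by throwing.
inline bool switchIdThrows(const MatcherSketch& matcher) {
  try {
    (void)matcher.switchId();
    return false;
  } catch (const std::runtime_error&) {
    return true;
  }
}
```

With a matcher built from a single ID, switchId() returns that ID. With a matcher built from two IDs, as on a two-NPU platform, switchIdThrows reports true.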

Solution

The solution was to modify the code to resolve the scope for a single port (by indexing the first port, masterLogicalPortIds()[0]) instead of the entire list of ports. This ensures that the HwSwitchMatcher contains just one switch ID, satisfying the precondition of the .switchId() method and allowing the tests to proceed correctly on multi-NPU platforms.
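
The difference between the two scopes can be illustrated with a toy resolver. This sketch assumes, as the paragraph above describes, that the failing call passed the whole port list. PortToSwitch, scopeForPorts, and scopeForPort are hypothetical names, and the port and switch numbers are made up; none of this is the FBOSS scope resolver API.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical port-to-switch mapping for a multi-NPU platform.
using PortToSwitch = std::map<int32_t, int64_t>;

// Scope for a list of ports: the union of the switch IDs that own them.
// On a multi-NPU platform this can contain more than one ID.
inline std::set<int64_t> scopeForPorts(
    const PortToSwitch& mapping,
    const std::vector<int32_t>& ports) {
  std::set<int64_t> ids;
  for (int32_t port : ports) {
    ids.insert(mapping.at(port));
  }
  return ids;
}

// Scope for a single port: always exactly one switch ID.
inline std::set<int64_t> scopeForPort(
    const PortToSwitch& mapping,
    int32_t port) {
  return {mapping.at(port)};
}
```

For a mapping in which ports 1 and 2 belong to switch 0 and port 9 belongs to switch 1, scopeForPorts over all three ports yields two IDs, while scopeForPort on the first port yields one, which is the shape .switchId() requires.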

Test Plan

[ OK ] AgentNeighborTest/1.AddPendingEntry (32677 ms)
[----------] 1 test from AgentNeighborTest/1 (32677 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (32677 ms total)
[ PASSED ] 1 test.

@meta-cla meta-cla bot added the CLA Signed label Dec 23, 2025

meta-codesync bot commented Dec 23, 2025

@mikechoifb has imported this pull request. If you are a Meta employee, you can view this in D89735288.

@kaisheng-hua kaisheng-hua changed the title from "Ladahk800bcls: fix bug for AgentMacLearningStaticConfigTest.VerifyStaticMacEntriesFromConfig" to "Ladahk800bcls: fix bug for AgentNeighborTest" Dec 24, 2025