Add N9300 switch memory check #370
Harinadh-Saladi
left a comment
Please address the comments.
@Priyanka-Patil14 please attach test results from APIC for the possible scenarios.
Uploaded the logs.
@Priyanka-Patil14 can you please provide updated script output from APIC, including the error handling cases?
Uploaded the logs (MemoryCheck_Logs.txt). Please review them.
Attached pytest and full script run logs as requested.
bac2092 to 8197e54
…n required (datacenter#355) * add `--max-threads` arg * fix bad descriptor errs/race conditions * update pytests
… cscwh68103 invalid fabricpathep targets (datacenter#357) * specific testing for known failure conditions of cscwh68103 as to not catch valid scenarios
8197e54 to d7d4289
Attaching the full run logs. NewValidation_APIC_FullRun_Logs.txt
lovkeshsharma702
left a comment
Integrated test completed with no faults.
Impact: Running an N9K-C93180YC-FX3 switch with less than 32GB memory can lead to memory pressure and increase the risk of service instability.

If any N9K-C93180YC-FX3 switch is flagged by this check, upgrade the switch memory to at least 32GB.
Dig into the TME doc on this one. We should align the script's messaging with the external docs rather than introduce contradictions or obfuscation.
If the doc needs to be made clearer to address the proactive failure notice, then do it.
The goal should be to link that doc or bug back to the check.
Reference back to CSCwm42741.
    result = NA
    msg = 'No N9K-C93180YC-FX3 switches found. Skipping.'
else:
    proc_mem_mos = icurl('class', 'procMemUsage.json')
Use the API's ?query-target-filter=lt(procMemUsage.total,"<32gigs>") logic to reduce the churn.
Alternatively, the icurl function could be updated to take a node specifier in the URI:
/api/node-xxx/class/procMemUsage...
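As a rough illustration of the first suggestion, the filtered class query could be built as a plain string and passed to icurl in place of 'procMemUsage.json'. This is only a sketch: the helper name build_low_memory_query and the threshold value are hypothetical, and the attribute name procMemUsage.total is taken from the comment above rather than verified against the object model.

```python
def build_low_memory_query(min_memory_kb, attr="procMemUsage.total"):
    """Build a class query that only returns objects below the threshold.

    `attr` defaults to the attribute named in the review comment; it is an
    assumption and may need adjusting to the real attribute name.
    """
    return 'procMemUsage.json?query-target-filter=lt({},"{}")'.format(
        attr, int(min_memory_kb)
    )


# Hypothetical threshold: 32 GiB expressed in KiB. The value the check
# should really use has to come from the TME doc / bug, not this sketch.
MIN_MEMORY_KB = 32 * 1024 * 1024

query = build_low_memory_query(MIN_MEMORY_KB)
print(query)
```

With a filter like this, the APIC would return only the nodes that need flagging, so the script no longer has to walk every procMemUsage object in the fabric.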
missing_nodes = []

for node in affected_nodes:
    node_id = node['fabricNode']['attributes']['id']
    total_kb = node_total_kb.get(node_id)
    if total_kb is None:
        missing_nodes.append([
            node_id,
            node['fabricNode']['attributes']['name'],
            node['fabricNode']['attributes']['model'],
        ])
        continue

    if total_kb < min_memory_kb:
        memory_in_gb = round(total_kb / 1000000, 2)
        result = MANUAL
        data.append([
            node_id,
            node['fabricNode']['attributes']['name'],
            node['fabricNode']['attributes']['model'],
            memory_in_gb,
        ])

if missing_nodes and data:
    result = MANUAL
    msg = (
        'Some N9K-C93180YC-FX3 nodes have insufficient memory and others are missing '
        'procMemUsage data. Please manually verify the memory on all affected nodes.\n'
        'Nodes with insufficient memory: {}\n'
        'Nodes with missing data: {}'.format(
            ', '.join(str(row[0]) for row in data),
            ', '.join(str(row[0]) for row in missing_nodes),
        )
    )
    headers = ['NodeId', 'Name', 'Model', 'Memory Detected (GB)']
    data = data + [row + ['N/A'] for row in missing_nodes]
elif missing_nodes:
    result = ERROR
    msg = 'Missing procMemUsage data for one or more affected N9K-C93180YC-FX3 nodes.'
    headers = ['NodeId', 'Name', 'Model']
    data = missing_nodes
    recommended_action = ''
In what scenario do we see missing nodes? The implication is that either fabricNode or procMemUsage is not being returned for a switch actively in the fabric.
If this was seen in testing, file the behavior as a bug and keep the logic with that explanation. If not, I don't know of any scenario where we would expect missing nodes for the sake of flagging them.
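To make the question concrete, here is a minimal sketch of how a node ends up in the "missing" bucket: the fabricNode list is joined against procMemUsage by node ID, and any fabricNode without a matching procMemUsage object is reported as missing. The function names, the dn layout, the attribute name total, and the sample data are all assumptions for illustration, not the PR's actual code.

```python
import re


def node_id_from_dn(dn):
    """Pull the node ID out of a dn like 'topology/pod-1/node-101/sys/...'."""
    match = re.search(r"/node-(\d+)/", dn)
    return match.group(1) if match else None


def split_nodes(fabric_nodes, proc_mem_mos, min_memory_kb):
    """Return (low_memory_ids, missing_ids) for the given fabricNode objects."""
    totals = {}
    for mo in proc_mem_mos:
        attrs = mo["procMemUsage"]["attributes"]
        node_id = node_id_from_dn(attrs["dn"])
        if node_id is not None:
            # 'total' is assumed to be reported in KB as a string.
            totals[node_id] = int(attrs["total"])

    low, missing = [], []
    for node in fabric_nodes:
        node_id = node["fabricNode"]["attributes"]["id"]
        total_kb = totals.get(node_id)
        if total_kb is None:
            missing.append(node_id)  # no procMemUsage object for this node
        elif total_kb < min_memory_kb:
            low.append(node_id)
    return low, missing


# Hypothetical sample data: node 103 has no procMemUsage object.
fabric_nodes = [
    {"fabricNode": {"attributes": {"id": "101"}}},
    {"fabricNode": {"attributes": {"id": "102"}}},
    {"fabricNode": {"attributes": {"id": "103"}}},
]
proc_mem_mos = [
    {"procMemUsage": {"attributes": {
        "dn": "topology/pod-1/node-101/sys/procmemusage", "total": "16000000"}}},
    {"procMemUsage": {"attributes": {
        "dn": "topology/pod-1/node-102/sys/procmemusage", "total": "33000000"}}},
]

low, missing = split_nodes(fabric_nodes, proc_mem_mos, min_memory_kb=32000000)
print(low, missing)
```

In this sketch a node is "missing" only when the second query returns nothing for it, which is the situation the comment asks to either reproduce and file as a bug or drop from the logic.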
Summary
This PR adds a new validation check: N9300 Switch Memory.
The check runs only when the target upgrade version is 6.1 and validates that N9300-series switches have at least 24 GB memory before upgrade.
What Changed
- n9300_switch_memory_check in aci-preupgrade-validation-script.py
- docs/docs/validations.md
- tests/checks/n9300_switch_memory_24g_check/
Check Behavior
- MANUAL if target version is missing
- N/A if target version is not 6.1
- N/A if no N9300 switches are present
- FAIL_O for N9300 nodes with memory < 24 GB
- PASS when all applicable N9300 nodes have >= 24 GB
Test Validation
Executed:
pytest tests/checks/n9300_switch_memory_24g_check/test_n9300_switch_memory_24g_check.py -q
Result:
7 passed in 0.11s
No failures observed.
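The check behavior listed above can be sketched as a small decision function. This is an illustrative outline only: the function name, the plain-string result values, the dictionary of per-node memory in GB, and the way the 6.1 train is matched against a version string are all assumptions, not the code in this PR.

```python
def n9300_memory_result(tversion, node_memory_gb, min_gb=24):
    """Map the inputs to one of the result strings from the list above.

    tversion: target version string such as '6.1(2f)', or None if unknown
              (assumed format; the real script uses its own version type).
    node_memory_gb: dict of N9300 node ID -> detected memory in GB.
    """
    if not tversion:
        return "MANUAL"  # target version is missing
    if tversion.split("(")[0] != "6.1":
        return "N/A"     # check only applies to 6.1 targets
    if not node_memory_gb:
        return "N/A"     # no N9300 switches present
    if any(mem < min_gb for mem in node_memory_gb.values()):
        return "FAIL_O"  # at least one node is below the minimum
    return "PASS"


print(n9300_memory_result("6.1(2f)", {"101": 16, "102": 32}))
```

Ordering the version checks before the node checks mirrors the list: the check can bail out early without needing any node data when the target version does not apply.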