[bugfix] fix tie_word_embeddings #74
Conversation
Code Review
This pull request modifies the configuration parser to preserve the untie_embeddings_and_output_weights setting in megatron_config when merging sub-configurations for Qwen3-VL models. A review comment identifies a potential bug where the setting is unconditionally overwritten with None if it was not previously defined, and suggests restoring the value only if it was explicitly present.
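The conditional-restore pattern the reviewer describes can be sketched as follows. This is a hypothetical illustration of the suggestion, not the PR's actual code: the function name `merge_sub_config` and the sentinel are assumptions, and only the key `untie_embeddings_and_output_weights` comes from the review.

```python
# Sketch of the reviewer's suggestion: restore the parent's value only when
# it was explicitly present, instead of unconditionally writing back a
# possibly-None result of dict.get().

_MISSING = object()  # sentinel distinguishing "key absent" from an explicit None


def merge_sub_config(megatron_config: dict, sub_config: dict) -> None:
    """Merge sub_config into megatron_config, keeping the parent's
    'untie_embeddings_and_output_weights' if the parent defined it."""
    saved = megatron_config.get('untie_embeddings_and_output_weights', _MISSING)
    megatron_config.update(sub_config)
    if saved is not _MISSING:
        # Only restore when the parent defined the key; otherwise keep
        # whatever the sub-config contributed (or leave it unset).
        megatron_config['untie_embeddings_and_output_weights'] = saved
```

With the sentinel, a parent that never set the key is not polluted with a spurious `None` entry after the merge.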
/gemini review
Code Review
This pull request modifies the configuration parser to preserve the untie_embeddings_and_output_weights setting when merging sub-configurations, specifically to support Qwen3-VL models. Feedback suggests moving this preservation logic outside the configuration loop to improve efficiency and prevent unintended side effects where the first sub-configuration's value might override others if the parent configuration is missing the key.
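The loop-hoisting suggested here can be sketched as below. The surrounding structure (a `merge_all` helper iterating over sub-configs) is an assumption about the parser's shape; the point is that the parent's value is captured once before the loop, so the first sub-config's default cannot leak through when the parent lacks the key.

```python
# Sketch (assumed structure): capture the parent's setting once, outside the
# merge loop, and restore it once afterwards. Saving inside the loop would
# re-read the key after the first merge, letting the first sub-config's
# default masquerade as a parent-provided value.

def merge_all(megatron_config: dict, sub_configs: list) -> None:
    key = 'untie_embeddings_and_output_weights'
    had_key = key in megatron_config   # remember whether the parent set it
    saved = megatron_config.get(key)
    for sub in sub_configs:
        megatron_config.update(sub)
    if had_key:
        megatron_config[key] = saved   # parent's explicit value wins
```

If the parent never defined the key, the last sub-config's value is kept unchanged, avoiding the side effect the review describes.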
/gemini review
Code Review
This pull request updates the configuration parser in src/mcore_bridge/config/parser.py to ensure that the untie_embeddings_and_output_weights setting is preserved when merging sub-configurations. The reviewer suggested generalizing the inline comment to improve maintainability and recommended investigating whether other configuration flags, such as moe_router_pre_softmax, also require protection from being overridden by conflicting defaults in sub-configs.
```python
# fix Qwen/Qwen3-VL-30B-A3B-Thinking
untie_embeddings_and_output_weights = megatron_config.get('untie_embeddings_and_output_weights')
```
The comment is model-specific; it is better to describe the general logic of preserving parent configuration settings to ensure maintainability as more models are added. Additionally, consider if other boolean flags that are inverted during conversion (such as moe_router_pre_softmax at line 78) should also be protected from being overridden by sub-configurations with conflicting defaults.
```python
# Preserve parent config's settings as sub-configs might have conflicting defaults.
# This is particularly important for 'untie_embeddings_and_output_weights'.
untie_embeddings_and_output_weights = megatron_config.get('untie_embeddings_and_output_weights')
```
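If other flags such as moe_router_pre_softmax turn out to need the same protection, the pattern generalizes to a small table of protected keys. This is a speculative sketch prompted by the review comment, not code from the PR; `PROTECTED_KEYS` and `merge_with_protection` are hypothetical names, and `moe_router_pre_softmax` is included only because the reviewer flagged it as a candidate.

```python
# Hypothetical generalization: snapshot every explicitly-set protected key
# before merging, then restore the snapshot so sub-config defaults cannot
# override parent-provided values.

PROTECTED_KEYS = (
    'untie_embeddings_and_output_weights',
    'moe_router_pre_softmax',
)


def merge_with_protection(megatron_config: dict, sub_config: dict) -> None:
    # Snapshot only keys the parent actually defined.
    saved = {k: megatron_config[k] for k in PROTECTED_KEYS if k in megatron_config}
    megatron_config.update(sub_config)
    megatron_config.update(saved)  # restore explicitly-set parent values
```

Keys absent from the parent are left at whatever the sub-config supplied, so this stays backward compatible with configs that rely on sub-config defaults.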