src/mcore_bridge/config/parser.py (4 additions, 0 deletions)
@@ -93,9 +93,13 @@ def _convert_config(config, _internal_call=False) -> Dict[str, Any]:
                 k = 'llm_model_type'
                 megatron_config[k] = hf_v
                 break
+    # fix Qwen/Qwen3-VL-30B-A3B-Thinking
+    untie_embeddings_and_output_weights = megatron_config.get('untie_embeddings_and_output_weights')
Comment on lines +96 to +97
Severity: medium

The comment is model-specific; it would be better to describe the general logic of preserving parent configuration settings, so the intent stays clear as more models are added. Additionally, consider whether other boolean flags that are inverted during conversion (such as moe_router_pre_softmax at line 78) should also be protected from being overridden by sub-configurations with conflicting defaults.

    # Preserve parent config's settings as sub-configs might have conflicting defaults.
    # This is particularly important for 'untie_embeddings_and_output_weights'.
    untie_embeddings_and_output_weights = megatron_config.get('untie_embeddings_and_output_weights')
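
A minimal sketch of the generalization the comment asks about, covering several flags at once. The PROTECTED_FLAGS tuple and the snapshot/restore pattern are illustrative assumptions, not code from this PR:

    # Hypothetical generalization: snapshot flags the parent config has already
    # set, merge the sub-configs, then restore the snapshot so sub-config
    # defaults cannot silently invert them.
    PROTECTED_FLAGS = ('untie_embeddings_and_output_weights', 'moe_router_pre_softmax')
    preserved = {k: megatron_config[k] for k in PROTECTED_FLAGS if k in megatron_config}
    for key in ['text_config', 'llm_config', 'thinker_config']:
        if hasattr(config, key):
            megatron_config.update(_convert_config(getattr(config, key), _internal_call=True))
    megatron_config.update(preserved)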

     for key in ['text_config', 'llm_config', 'thinker_config']:
         if hasattr(config, key):
             megatron_config.update(_convert_config(getattr(config, key), _internal_call=True))
+    if untie_embeddings_and_output_weights is not None:
+        megatron_config['untie_embeddings_and_output_weights'] = untie_embeddings_and_output_weights
     # compat llama3
     if getattr(config, 'rope_scaling', None) is not None:
         if isinstance(config.rope_scaling, int):