Added Eagle training support for Kimi-K2 #108
Open
xuhaojie-2025 wants to merge 17 commits into sgl-project:main from
Conversation
…hidden states generation (sgl-project#57)

* add local data path support and more assistant
* small refactor
* separate out the data-preprocess logic
* add support for qwen3 eagle train
* fix
* Update README.md
* fix
* fix and add test
* fix code style
* feat: add training scripts for qwen3-8B
* fix
* add 235B config
* fix chat template
* fix chat template
* updated badges
* Update README.md
* add wandb args check
* fix
* opt error log
* remove local

Co-authored-by: sleepcoo <sleepcoo@gmail.com>
Co-authored-by: Yubo Wang <yubowang2019@gmail.com>
Co-authored-by: lukec <118525388+sleepcoo@users.noreply.github.com>
Contributor
Warning: Gemini is unable to generate a summary due to a potential policy violation.
Collaborator
Can you fix the conflict?
Author

I have resolved the conflict based on upstream/main and re-submitted the code.
jvmncs reviewed Aug 27, 2025
```python
self.head_dim = getattr(
    config, "head_dim", config.hidden_size // config.num_attention_heads
)
<<<<<<< HEAD
```
looks like this conflict snuck into the last commit
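For context, the line under review uses a standard `head_dim` fallback pattern; once the stray `<<<<<<< HEAD` marker is removed, it behaves as in this minimal sketch (the config objects here are hypothetical stand-ins for a `transformers` `PretrainedConfig`):

```python
from types import SimpleNamespace

# Hypothetical configs: one with an explicit head_dim, one without.
cfg_with = SimpleNamespace(hidden_size=4096, num_attention_heads=32, head_dim=192)
cfg_without = SimpleNamespace(hidden_size=4096, num_attention_heads=32)

def resolve_head_dim(config):
    # Prefer an explicit head_dim if the config defines one (some models set a
    # head_dim that differs from hidden_size // num_heads); otherwise fall back
    # to the usual ratio.
    return getattr(config, "head_dim",
                   config.hidden_size // config.num_attention_heads)

print(resolve_head_dim(cfg_with))     # 192 (explicit attribute wins)
print(resolve_head_dim(cfg_without))  # 128 (4096 // 32 fallback)
```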
@xuhaojie-2025 Trying to use this for kimi-k2-0905 but having a bit of a time getting it working. Library issues, some stray bad lines, not using trust_remote_code in various places, outdated kimi_k2.py with bad refs to qk_head_dim, etc. I can struggle through, but I'm wondering if you have an updated or functional branch/commit somewhere I can look at?
add support for Kimi-K2 eagle train

* add a target model for Kimi-K2 in specforge/modeling/target/kimi_k2.py
* add a Kimi-K2 config in configs/kimi-k2-eagle3.json
* fix the chat template in specforge/data/template.py
* adapt hidden-state generation to Kimi-K2's special dialogue template in specforge/data/preprocessing.py
* the Kimi-K2 tokenizer cannot be loaded as a fast tokenizer automatically; a script generates tokenizer.json so the fast-tokenizer interface can be used
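The chat-template item above is model-specific; as an illustration only, here is a minimal sketch of rendering a conversation with Kimi-K2-style role markers. The special token names below are assumptions for illustration, not taken from this PR — the authoritative template lives in the model's tokenizer config and in specforge/data/template.py:

```python
# Sketch only: the role/token names here are assumed, not from the PR.
ROLE_TOKENS = {
    "system": "<|im_system|>",
    "user": "<|im_user|>",
    "assistant": "<|im_assistant|>",
}

def render_conversation(messages):
    """Render messages as <role-token>role<|im_middle|>content<|im_end|> runs."""
    parts = []
    for msg in messages:
        role, content = msg["role"], msg["content"]
        parts.append(f"{ROLE_TOKENS[role]}{role}<|im_middle|>{content}<|im_end|>")
    return "".join(parts)

demo = render_conversation([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
])
print(demo)
```

The point of a per-model template like this is that hidden-state generation must tokenize training conversations exactly as the target model would at inference time; any mismatch in role markers shifts the loss mask.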