
Enable Indexer cache for DS v3.2 decoding #3195

Open
RissyRan wants to merge 2 commits into main from dsv32_decode

RissyRan (Collaborator) commented Feb 19, 2026

Description

Enable the Indexer cache for DS v3.2 decoding. This unblocks the eval benchmark for the DS v3.2 model during sparse attention bringup.

  • DS reference implementation for Indexer is here
  • Add init_indexer_cache & update_indexer_cache for indexer cache
  • Other small changes
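
For readers unfamiliar with decode-time caching, here is a minimal sketch of what an indexer key cache could look like. The function names match the ones listed above, but the signatures, tensor shapes, and cache layout are assumptions made for illustration and are not taken from this PR's diff.

```python
import jax
import jax.numpy as jnp


def init_indexer_cache(batch, max_target_length, index_dim, dtype=jnp.float32):
  """Allocate a zero-filled buffer for indexer keys (assumed layout)."""
  return jnp.zeros((batch, max_target_length, index_dim), dtype=dtype)


def update_indexer_cache(cache, new_keys, start):
  """Write new_keys [batch, t, index_dim] into cache at sequence offset start."""
  return jax.lax.dynamic_update_slice(
      cache, new_keys.astype(cache.dtype), (0, start, 0)
  )


# Illustrative sizes only; real values come from the model config.
batch, prompt_len, max_len, dim = 1, 3, 8, 4
keys = jax.random.normal(jax.random.PRNGKey(0), (batch, prompt_len + 2, dim))

cache = init_indexer_cache(batch, max_len, dim)
# Prefill writes all prompt positions in one call.
cache = update_indexer_cache(cache, keys[:, :prompt_len], 0)
# Each autoregressive decode step appends exactly one position.
for step in range(prompt_len, prompt_len + 2):
  cache = update_indexer_cache(cache, keys[:, step:step + 1], step)
```

After prefill plus two decode steps, the first five cache positions hold the same keys a single full-sequence pass would produce, and the rest remain zero. That equivalence is the property an autoregressive consistency test would check.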

Tests

  • All runners are green
  • Training end-to-end (no impact) with a smaller model version: link
  • Test against reference implementation still green: link
  • Decoding gives reasonable output
    • Small sequence length, which skips the indexer: link
    • Large sequence length, which exercises the indexer (max_prefill_predict_length=3072 max_target_length=4096): link
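
The two decoding runs above differ in whether the indexer's selection actually takes effect. The sketch below shows one plausible reading of that behavior: when the number of cached positions fits within a top-k budget, every valid key is kept and scoring is skipped; otherwise only the highest-scoring keys are kept. The function `select_indexer_positions`, the parameter `index_topk`, and the shapes are hypothetical and not taken from this PR.

```python
import jax
import jax.numpy as jnp


def select_indexer_positions(index_scores, valid_len, index_topk):
  """Return a boolean mask [batch, seq] of key positions attention may use.

  index_scores: [batch, seq] indexer score of each cached key for one query.
  valid_len: Python int, number of filled cache positions.
  index_topk: Python int, assumed budget of keys kept per query.
  """
  seq = index_scores.shape[-1]
  valid = jnp.arange(seq) < valid_len
  if valid_len <= index_topk:
    # Short sequence: every valid key fits in the budget, so skip the indexer.
    return jnp.broadcast_to(valid, index_scores.shape)
  # Long sequence: keep only the index_topk highest-scoring valid keys.
  masked = jnp.where(valid, index_scores, -jnp.inf)
  _, top_idx = jax.lax.top_k(masked, index_topk)
  mask = jnp.zeros(index_scores.shape, dtype=bool)
  batch_idx = jnp.arange(index_scores.shape[0])[:, None]
  return mask.at[batch_idx, top_idx].set(True)


scores = jnp.array([[0.1, 0.9, 0.5, 0.7, 0.0, 0.0]])
short_mask = select_indexer_positions(scores, valid_len=2, index_topk=3)
long_mask = select_indexer_positions(scores, valid_len=4, index_topk=2)
```

Here `short_mask` keeps both valid positions without consulting the scores, while `long_mask` keeps only positions 1 and 3, the two highest-scoring valid keys. Because `valid_len` is a Python int, the branch is resolved at trace time; a jitted decode loop would need a different formulation.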

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@github-actions

🤖 Hi @RissyRan, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

codecov bot commented Feb 19, 2026

Codecov Report

❌ Patch coverage is 86.95652% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/maxtext/layers/attention_mla.py | 84.21% | 2 Missing and 4 partials ⚠️ |

@github-actions

🤖 I'm sorry @RissyRan, but I was unable to process your request. Please see the logs for more details.

shuningjin (Collaborator) left a comment

Thank you! There were some attention autoregressive/decoding tests:

Shall we add a similar one to test indexer cache in mla? example from gemini

RissyRan force-pushed the dsv32_decode branch 3 times, most recently from a470971 to 9f2ba53, on February 20, 2026 08:29
@RissyRan (Collaborator, Author)

> Thank you! There were some attention autoregressive/decoding tests:
>
> Shall we add a similar one to test indexer cache in mla? example from gemini

Sounds good. I also enabled the MLA assertion and ran some sanity checks for decoding: long seq, short seq
