fix: use ungated llama tokenizer mirrors #90
Merged
Approvability verdict: Needs human review. This PR changes runtime tokenizer loading behavior by redirecting Meta Llama models to load from unsloth mirrors while preserving canonical names for renderer auto-resolution. The new source override logic and name preservation mechanism warrant human review to verify correctness.
hallerite
approved these changes
Jun 19, 2026
Note
Medium Risk
Central tokenizer loading now depends on third-party mirror repos for two production model IDs; a bad mirror change could affect templates/encoding until overrides are reviewed, though trust_remote_code stays off and overrides are narrowly scoped.
Overview
Gated Meta Llama-3.2 Instruct tokenizers can be loaded without Hugging Face license access by routing load_tokenizer (and offset-tokenizer reloads) through audited unsloth mirror repos while callers still pass canonical meta-llama/Llama-3.2-*-Instruct IDs.
Adds TOKENIZER_SOURCE_OVERRIDES plus helpers that pick the load repo, apply existing trust/revision policy on the mirror path, and rewrite tokenizer.name_or_path back to the requested Meta ID so MODEL_RENDERER_MAP auto-resolution still picks Llama3Renderer.
Shared test matrices now use the canonical Meta model name with "auto" instead of calling the mirror directly; new unit tests cover mirror selection, name preservation, and offset-tokenizer behavior.
Reviewed by Cursor Bugbot for commit 0dc19a0.
Note
Fix tokenizer loading for gated Meta Llama-3.2 models by routing to ungated unsloth mirrors
- Adds TOKENIZER_SOURCE_OVERRIDES in renderers/base.py, mapping the canonical meta-llama/Llama-3.2-1B-Instruct and 3B-Instruct IDs to their ungated unsloth mirrors, so tokenizer loading no longer fails for users without Hugging Face access to the gated repos.
- Adds _tokenizer_source_for and _tokenizer_load_kwargs helpers to apply overrides and compute trust/revision kwargs consistently across load_tokenizer and _get_offset_tokenizer.
- Adds _preserve_requested_tokenizer_name to ensure the returned tokenizer's name_or_path always reflects the originally requested canonical model ID, not the mirror path.
- Updates tests to use canonical meta-llama/ IDs and adds coverage for mirror routing, name preservation, and offset-tokenizer reload behavior.
Macroscope summarized 0dc19a0.