feat: add Magpie TTS CoreML conversion pipeline #24
3be1e15 to d161a71
os.makedirs(os.path.dirname(output_path), exist_ok=True)
mlmodel.save(output_path)
🟡 Duplicate model save due to copy-paste error
Lines 90-94 in convert_decoder_step.py contain a duplicated block that calls os.makedirs and mlmodel.save twice in succession. This is clearly a copy-paste artifact: the model is saved, then immediately saved again. The result is still correct (the second write overwrites the first), but the extra write wastes significant I/O time, since CoreML .mlpackage bundles can be hundreds of megabytes.
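A minimal sketch of the deduplicated save, assuming the block is meant to run exactly once. `save_model_once` is a hypothetical helper name, not code from the PR, and `mlmodel` stands for any object with a `save(path)` method, as a coremltools model has. The guard for an empty directory name is an addition of this sketch and is not mentioned in the review.

```python
import os


def save_model_once(mlmodel, output_path):
    """Create the parent directory if needed, then write the model a single time.

    Hypothetical helper for illustration; not part of the PR.
    """
    parent = os.path.dirname(output_path)
    # os.makedirs("") raises FileNotFoundError, so skip it for bare filenames.
    if parent:
        os.makedirs(parent, exist_ok=True)
    mlmodel.save(output_path)
    return output_path
```

Keeping the directory creation and the save in one place makes an accidental second write harder to reintroduce by copy-paste.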
NVIDIA Magpie TTS Multilingual (357M) conversion to CoreML.

Pipeline (4 models):
- text_encoder: text tokenization and encoding
- decoder_prefill: batch speaker context into KV cache
- decoder_step: single AR step with KV cache
- nanocodec_decoder: codec tokens to 22kHz audio

9 languages (en, es, de, fr, it, vi, zh, hi, ja), 5 speakers.

Includes conversion scripts, traceable wrappers, export scripts for embeddings/tokenizers/weights, and CoreML inference script.

Source: nvidia/magpie_tts_multilingual_357m
d161a71 to c04cbcb
d_head = d_model // sa_n_heads

# Read T_ctx from speaker_info if not specified
constants_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "constants")
🟡 Wrong parent directory calculation causes constants path to resolve to incorrect location
The constants_dir at line 41 uses os.path.dirname(os.path.dirname(os.path.abspath(__file__))) which traverses up two directory levels. This was written assuming the script lives in a convert/ subdirectory (as documented in the README's python convert/convert_decoder_prefill.py), but the script is actually placed directly in coreml/. As a result, constants_dir resolves to models/tts/magpie/constants/ instead of the correct models/tts/magpie/coreml/constants/ where export_constants.py writes speaker_info.json. The os.path.exists(si_path) check silently returns False, causing the script to always use the hardcoded default t_ctx=110 instead of reading the value from the exported speaker info. The same off-by-one dirname pattern appears in the sys.path.insert at line 18, though that doesn't cause a runtime failure because Python's default sys.path[0] (the script's own directory) already contains the traceable/ package.
- constants_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "constants")
+ constants_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "constants")
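The difference between the two lines can be checked on a plain path string. The sketch below is illustrative only: the `/repo` prefix, both function names, and the `"t_ctx"` JSON key are assumptions, since the review does not show the checkout location or the schema of speaker_info.json. It uses `posixpath` so the string results are the same on any platform, and it warns on the fallback instead of silently using the default of 110.

```python
import json
import os
import posixpath
import warnings


def constants_dir_for(script_path, levels_up):
    """Return <ancestor>/constants, where the ancestor is `levels_up` dirname() calls up."""
    base = script_path
    for _ in range(levels_up):
        base = posixpath.dirname(base)
    return posixpath.join(base, "constants")


# Hypothetical checkout root; only models/tts/magpie/coreml/ comes from the review.
script = "/repo/models/tts/magpie/coreml/convert_decoder_prefill.py"

buggy = constants_dir_for(script, 2)  # /repo/models/tts/magpie/constants
fixed = constants_dir_for(script, 1)  # /repo/models/tts/magpie/coreml/constants


def load_t_ctx(constants_dir, default=110):
    """Read t_ctx from speaker_info.json, warning rather than silently defaulting."""
    si_path = os.path.join(constants_dir, "speaker_info.json")
    if not os.path.exists(si_path):
        warnings.warn(f"{si_path} not found; falling back to t_ctx={default}")
        return default
    with open(si_path) as f:
        # "t_ctx" is an assumed key name; the real schema is not shown here.
        return json.load(f).get("t_ctx", default)
```

A warning on the fallback path would have surfaced this bug on the first run, because the missing file would no longer go unnoticed.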
[project]
name = "magpie-tts-coreml"
requires-python = ">= 3.10,<3.13"
description = "NVIDIA Magpie TTS 357M CoreML conversion"
version = "0.1.0"
dependencies = [
    "numpy>=1.24",
    "torch>=2.5.0",
    "coremltools>=8.0",
    "soundfile>=0.12.0",
    "scipy>=1.5.0",
    "huggingface_hub>=0.10",
]

[project.optional-dependencies]
nemo = [
    "nemo_toolkit[tts]",
    "hydra-core>=1.3",
    "omegaconf>=2.3",
    "lightning>=2.0",
]

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu" },
]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[tool.hatch.build.targets.wheel]
packages = ["."]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.uv]
python-preference = "only-managed"
🔴 pyproject.toml placed in model directory instead of target directory, violating AGENTS.md structure rules
AGENTS.md mandates: "Code lives under models/{class}/{model}/{target}; follow the existing vad/silero-vad/coreml pattern" and "Each target directory bundles its own pyproject.toml, uv.lock". The existing reference pattern places these files at models/vad/silero-vad/coreml/pyproject.toml. This PR places pyproject.toml at models/tts/magpie/pyproject.toml (one level above the coreml/ target directory) instead of the required models/tts/magpie/coreml/pyproject.toml. This breaks the self-contained target directory convention and means uv sync run from the coreml/ directory (as instructed by AGENTS.md) won't find the project file.
Prompt for agents
Move models/tts/magpie/pyproject.toml to models/tts/magpie/coreml/pyproject.toml and move models/tts/magpie/uv.lock to models/tts/magpie/coreml/uv.lock. This matches the established pattern in models/vad/silero-vad/coreml/ where each target directory is self-contained with its own pyproject.toml and uv.lock. After moving, verify that uv sync works correctly when run from the coreml/ directory.
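One way to verify the move is a small check that a target directory carries both files. This is a sketch under stated assumptions: `missing_project_files` is a hypothetical helper, not part of the repository, and the required file list is taken only from the AGENTS.md sentence quoted above.

```python
from pathlib import Path

# Files that each target directory should bundle, per the AGENTS.md quote.
REQUIRED_FILES = ("pyproject.toml", "uv.lock")


def missing_project_files(target_dir):
    """Return the required files that are absent from a target directory."""
    target = Path(target_dir)
    return [name for name in REQUIRED_FILES if not (target / name).is_file()]
```

Run from the repository root, `missing_project_files("models/tts/magpie/coreml")` should return an empty list once both files have been moved.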
Summary
generate_coreml.py) and PyTorch reference (generate_pytorch.py)

Pipeline
- text_encoder
- decoder_prefill
- decoder_step
- nanocodec_decoder

Source