Add triton runtime #88
base: master
Conversation
Walkthrough

Adds a Triton Inference Server example to the custom inference runtime docs, including a full ClusterServingRuntime YAML configured for NVIDIA GPUs, startup commands/env vars, resource settings, startupProbe, supportedModelFormats, usage steps, and an update to the runtime comparison table.
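The PR carries the full runtime definition; purely as an illustration of the shape such a resource takes (not the PR's actual YAML), a minimal KServe ClusterServingRuntime for Triton might look like the sketch below. The runtime name, image tag, HTTP port, and GPU count are assumptions, not values from the change.

```yaml
# Illustrative sketch only — name, image tag, port, and resource values are assumptions.
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: triton-runtime                # hypothetical name
spec:
  supportedModelFormats:
    - name: triton
      autoSelect: true
  protocolVersions:
    - v2
  containers:
    - name: kserve-container
      image: nvcr.io/nvidia/tritonserver:25.02-py3   # official NGC image
      args:
        - tritonserver
        - --model-repository=/mnt/models
        - --http-port=8080
      env:
        - name: OMP_NUM_THREADS
          value: "1"
      resources:
        limits:
          nvidia.com/gpu: "1"         # schedule onto an NVIDIA GPU node
```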
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@docs/en/model_inference/inference_service/how_to/custom_inference_runtime.mdx`:

- Around lines 329-333: Add a Kubernetes startupProbe entry to the Triton runtime YAML so the pod is not considered ready until the model finishes loading. Specifically, insert a startupProbe block (mirroring the pattern used in other runtimes) immediately before the supportedModelFormats section in the Triton runtime example (near the runAsUser key and before `supportedModelFormats: - name: triton`), so that the model server endpoint is probed until it is healthy.
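For reference, a startupProbe of the kind this comment asks for could look roughly like the sketch below. The path assumes Triton's standard HTTP readiness endpoint; the port and thresholds are placeholder values, not taken from the PR.

```yaml
# Hypothetical startupProbe — port and thresholds are illustrative.
startupProbe:
  httpGet:
    path: /v2/health/ready     # Triton's HTTP readiness endpoint
    port: 8080                  # match the port the container serves HTTP on
  periodSeconds: 10
  failureThreshold: 30          # tolerate up to ~5 minutes of model loading
```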
🧹 Nitpick comments (2)
docs/en/model_inference/inference_service/how_to/custom_inference_runtime.mdx (2)
308-312: Unused environment variable `MODEL_REPO`.

The `MODEL_REPO` environment variable is defined on lines 311-312 but is not used anywhere in the container command (lines 302-307). Either remove it or use it in the command if it's intended for some purpose.

🔧 Suggested fix: Remove unused environment variable

```diff
         env:
           - name: OMP_NUM_THREADS
             value: "1"
-          - name: MODEL_REPO
-            value: '{{ index .Annotations "aml-model-repo" }}'
         image: 152-231-registry.alauda.cn:60070/mlops/tritonserver:25.02-py3
```
313-313: Internal registry image may not be accessible to users.

The image `152-231-registry.alauda.cn:60070/mlops/tritonserver:25.02-py3` appears to reference an internal registry. Consider adding a comment similar to other examples, or use the official NVIDIA NGC image reference (e.g., `nvcr.io/nvidia/tritonserver:25.02-py3`) for better accessibility.

🔧 Suggested fix: Use official NVIDIA image

```diff
-        image: 152-231-registry.alauda.cn:60070/mlops/tritonserver:25.02-py3
+        image: nvcr.io/nvidia/tritonserver:25.02-py3 # Replace with your actual image if needed
```
docs/en/model_inference/inference_service/how_to/custom_inference_runtime.mdx (review thread resolved)
Deploying alauda-ai with Cloudflare Pages

- Latest commit: 217ba40
- Status: ✅ Deploy successful!
- Preview URL: https://49a3ab18.alauda-ai.pages.dev
- Branch Preview URL: https://add-triton-rt.alauda-ai.pages.dev