[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance testing#42

Merged
kpouget merged 4 commits into openshift-psap:main from kpouget:llm_d_legacy
Apr 27, 2026

@kpouget
Contributor

@kpouget kpouget commented Apr 24, 2026

Summary by CodeRabbit

Release Notes

  • New Features
    • Added comprehensive LLM inference service testing, deployment, and benchmarking framework.
    • Introduced new Grafana dashboards for monitoring container, Kubernetes, GPU, vLLM, and workload metrics.
    • Added performance analysis and benchmarking visualization capabilities with regression tracking.
    • Implemented CI orchestration and cluster preparation workflows for LLM testing.

@openshift-ci

openshift-ci Bot commented Apr 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign albertoperdomo2 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Warning

Rate limit exceeded

@kpouget has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 37 seconds before requesting another review.

To keep reviews running without waiting, you can enable the usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8135a592-dbd0-43bf-af6b-84447dcb49ad

📥 Commits

Reviewing files that changed from the base of the PR and between 6d1d536 and 7fc6400.

📒 Files selected for processing (68)
  • projects/core/library/ci.py
  • projects/legacy/library/run.py
  • projects/llm_d_legacy/orchestration/ci.py
  • projects/llm_d_legacy/testing/command_args.yml.j2
  • projects/llm_d_legacy/testing/config.yaml
  • projects/llm_d_legacy/testing/epp-config/epp-approximate-prefix-cache.yaml
  • projects/llm_d_legacy/testing/epp-config/epp-default-rhoai-3.4-ea.1.yaml
  • projects/llm_d_legacy/testing/epp-config/epp-pd.v0.4.yaml
  • projects/llm_d_legacy/testing/epp-config/epp-pd.v0.6.yaml
  • projects/llm_d_legacy/testing/epp-config/epp-precise-prefix-cache.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/container-overview.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/k8s-dashboard-starslcn.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/kubernetes-app-metrics.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/kubernetes-by-namespace-instance.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/llmd-vllm-wva.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/nvidia.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/vllm.yaml
  • projects/llm_d_legacy/testing/grafana/dashboards/workload-variant-autoscaler.yaml
  • projects/llm_d_legacy/testing/grafana/datasource.yaml
  • projects/llm_d_legacy/testing/llmisvcs/llmisvc-pd.yaml
  • projects/llm_d_legacy/testing/llmisvcs/llmisvc-simple.yaml
  • projects/llm_d_legacy/testing/prepare_llmd.py
  • projects/llm_d_legacy/testing/test.py
  • projects/llm_d_legacy/testing/test_llmd.py
  • projects/llm_d_legacy/toolbox/llmd.py
  • projects/llm_d_legacy/toolbox/llmd_capture_isvc_state/defaults/main/config.yml
  • projects/llm_d_legacy/toolbox/llmd_capture_isvc_state/tasks/main.yml
  • projects/llm_d_legacy/toolbox/llmd_capture_isvc_state/vars/main/resources.yml
  • projects/llm_d_legacy/toolbox/llmd_deploy_gateway/defaults/main/config.yml
  • projects/llm_d_legacy/toolbox/llmd_deploy_gateway/files/.keep
  • projects/llm_d_legacy/toolbox/llmd_deploy_gateway/meta/main.yml
  • projects/llm_d_legacy/toolbox/llmd_deploy_gateway/tasks/main.yml
  • projects/llm_d_legacy/toolbox/llmd_deploy_gateway/templates/gateway.yaml.j2
  • projects/llm_d_legacy/toolbox/llmd_deploy_gateway/vars/main/resources.yml
  • projects/llm_d_legacy/toolbox/llmd_deploy_llm_inference_service/defaults/main/config.yml
  • projects/llm_d_legacy/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml
  • projects/llm_d_legacy/toolbox/llmd_deploy_llm_inference_service/vars/main/resources.yml
  • projects/llm_d_legacy/toolbox/llmd_run_guidellm_benchmark/defaults/main/config.yml
  • projects/llm_d_legacy/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml
  • projects/llm_d_legacy/toolbox/llmd_run_guidellm_benchmark/templates/copy_helper_pod.yaml.j2
  • projects/llm_d_legacy/toolbox/llmd_run_guidellm_benchmark/templates/guidellm_benchmark_job.yaml.j2
  • projects/llm_d_legacy/toolbox/llmd_run_guidellm_benchmark/templates/guidellm_benchmark_pvc.yaml.j2
  • projects/llm_d_legacy/toolbox/llmd_run_guidellm_benchmark/vars/main/resources.yml
  • projects/llm_d_legacy/toolbox/storage_download_to_pvc/defaults/main/config.yml
  • projects/llm_d_legacy/toolbox/storage_download_to_pvc/files/entrypoint.sh
  • projects/llm_d_legacy/toolbox/storage_download_to_pvc/meta/main.yml
  • projects/llm_d_legacy/toolbox/storage_download_to_pvc/tasks/main.yml
  • projects/llm_d_legacy/toolbox/storage_download_to_pvc/templates/pod.yml.j2
  • projects/llm_d_legacy/toolbox/storage_download_to_pvc/templates/pvc.yml.j2
  • projects/llm_d_legacy/toolbox/storage_download_to_pvc/vars/main/resources.yml
  • projects/llm_d_legacy/visualizations/llmd_inference/analyze/__init__.py
  • projects/llm_d_legacy/visualizations/llmd_inference/data/plots.yaml
  • projects/llm_d_legacy/visualizations/llmd_inference/models/__init__.py
  • projects/llm_d_legacy/visualizations/llmd_inference/models/kpi.py
  • projects/llm_d_legacy/visualizations/llmd_inference/models/lts.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/__init__.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/error_report.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/prometheus.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/prometheus_reports.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/report.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/throughput_analysis.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/throughput_comparisons.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/utils.py
  • projects/llm_d_legacy/visualizations/llmd_inference/plotting/vllm_metrics.py
  • projects/llm_d_legacy/visualizations/llmd_inference/requirements.txt
  • projects/llm_d_legacy/visualizations/llmd_inference/store/__init__.py
  • projects/llm_d_legacy/visualizations/llmd_inference/store/lts_parser.py
  • projects/llm_d_legacy/visualizations/llmd_inference/store/parsers.py
📝 Walkthrough

This PR introduces comprehensive LLM-D testing infrastructure, including CI orchestration, cluster preparation, Kubernetes inference service deployments, Grafana monitoring dashboards, benchmark running with GuideLLM, visualization pipelines, and matrix benchmarking support. It also updates CI error-handling logic in core and legacy libraries.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI and Error Handling: projects/core/library/ci.py, projects/legacy/library/run.py | Minor updates: conditional FAILURE file writing based on env.ARTIFACT_DIR presence, and switching from sys.exit(1) to raise SystemExit(1) during exception handling. |
| LLM-D CI Orchestration: projects/llm_d_legacy/orchestration/ci.py | New CLI entrypoint with main command group and test subcommand; wraps test execution with logging, environment initialization, caliper export, and structured exit codes. |
| LLM-D Testing Configuration and Execution: projects/llm_d_legacy/testing/config.yaml, projects/llm_d_legacy/testing/command_args.yml.j2, projects/llm_d_legacy/testing/test.py, projects/llm_d_legacy/testing/test_llmd.py, projects/llm_d_legacy/testing/prepare_llmd.py | Core testing modules: configuration for vaults, cluster profiles, models, and benchmarking; CLI entrypoints for test orchestration; test execution with multi-flavor ISVC deployment, GuideLLM benchmarking, and Prometheus capture; cluster preparation including operators, Grafana, monitoring, pull-secret management, model PVC downloads, and GPU readiness. |
| LLM-D Kubernetes Manifests: projects/llm_d_legacy/testing/epp-config/*.yaml, projects/llm_d_legacy/testing/llmisvcs/*.yaml | Kubernetes custom resources: endpoint picker configurations (queue/cache/max-score scoring), and LLM inference service definitions (with pod templates, routing, GPU resources, prefill workloads, and VLLM tuning). |
| LLM-D Grafana Monitoring: projects/llm_d_legacy/testing/grafana/dashboards/*.yaml, projects/llm_d_legacy/testing/grafana/datasource.yaml | Grafana custom resources: datasource configuration (Thanos/Prometheus), and seven dashboards covering container overview, Kubernetes app metrics, namespace/instance filtering, GPU (NVIDIA) metrics, vLLM inference metrics, workload autoscaler, and baseline resource monitoring. |
| LLM-D Ansible Toolbox Roles: projects/llm_d_legacy/toolbox/llmd*.py, projects/llm_d_legacy/toolbox/llmd_*/..., projects/llm_d_legacy/toolbox/storage_download_to_pvc/... | Ansible role implementations: Llmd class with methods for gateway deployment, ISVC deployment, GuideLLM benchmark execution, ISVC state capture; supporting roles with defaults, tasks, templates, and vars for gateway setup, ISVC deployment, GuideLLM job orchestration, and PVC downloads (with multi-protocol support: HF, HTTPS/git, S3, DMF). |
| LLM-D Visualization and Analysis: projects/llm_d_legacy/visualizations/llmd_inference/... | Complete visualization pipeline: LTS data models (settings, metadata, results, KPIs), store/parser infrastructure for GuideLLM benchmarks and Prometheus metrics, and Dash-based reporting modules (error reports, Prometheus resource/GPU/system health, GuideLLM performance analysis and throughput scaling, baseline/routing/P-D comparisons, VLLM metrics). |
| Matrix Benchmarking Framework: projects/matrix_benchmarking/library/matbenchmark.py, projects/matrix_benchmarking/library/visualize.py, projects/matrix_benchmarking/subproject/... | Benchmarking execution (prepare/save/run matbench files and commands) and comprehensive visualization orchestration CLI with matrix parsing, LTS generation, historical downloads, regressions analysis via hunter/stdev/z-score methods, and per-filter visualization output. Includes plugin architecture documentation, executable wrapper, and regression analyzers. |
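
The z-score method named in the Matrix Benchmarking summary can be illustrated with a minimal sketch. This is not the matrix_benchmarking implementation: the function name `zscore_regression`, the threshold of 3.0, and the sample values are all assumptions chosen for illustration.

```python
import math
from statistics import mean, stdev


def zscore_regression(history, new_value, threshold=3.0):
    """Return (z, flagged) for new_value compared with historical samples.

    Hypothetical helper: flags the new value when it lies more than
    `threshold` sample standard deviations away from the historical mean.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        # Flat history: any deviation counts as infinitely far from the mean.
        z = 0.0 if new_value == mu else math.copysign(math.inf, new_value - mu)
    else:
        z = (new_value - mu) / sigma
    return z, abs(z) > threshold


if __name__ == "__main__":
    past_throughput = [100.0, 102.0, 98.0, 101.0, 99.0]  # made-up values
    print(zscore_regression(past_throughput, 100.5))
    print(zscore_regression(past_throughput, 110.0))
```

A two-sided test is used here because a large change in either direction may deserve attention; a real analyzer would likely account for whether higher or lower is better for a given KPI.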

Sequence Diagram(s)

sequenceDiagram
    participant CLI as Test CLI
    participant Prepare as Cluster Prepare
    participant Deploy as ISVC Deploy
    participant Bench as GuideLLM Bench
    participant Capture as State Capture
    participant Visualize as Visualization

    CLI->>Prepare: prepare_ci()
    Prepare->>Prepare: Setup operators/namespace
    Prepare->>Prepare: Deploy Grafana/monitoring
    Prepare->>Prepare: Update pull secrets
    Prepare->>Prepare: Download models to PVC
    Prepare->>Prepare: Preload container images
    
    CLI->>Deploy: test_ci() per flavor
    Deploy->>Deploy: Parse & reshape ISVC YAML
    Deploy->>Deploy: Apply EPP routing config
    Deploy->>Deploy: Deploy LLMInferenceService
    Deploy->>Deploy: Wait for readiness
    
    Deploy->>Bench: run_guidellm_benchmark()
    Bench->>Bench: Create Job + PVC
    Bench->>Bench: Run GuideLLM benchmark
    Bench->>Bench: Extract results.json
    
    CLI->>Capture: capture_llm_inference_service_state()
    Capture->>Capture: Dump ISVC/pods/logs
    Capture->>Capture: Capture Prometheus metrics
    
    CLI->>Visualize: generate_visualization()
    Visualize->>Visualize: Parse results → LTS payload
    Visualize->>Visualize: Generate plots/reports
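
The call order in the sequence diagram can be read as the following control flow. This is a sketch only: the method names are taken from the diagram, but the class `PipelineSketch`, the stub bodies, the per-flavor placement of the state capture, and the flavor names `simple` and `pd` are assumptions made for illustration, not the PR's code.

```python
from typing import List


class PipelineSketch:
    """Records the call order from the sequence diagram; every step is a stub."""

    def __init__(self) -> None:
        self.trace: List[str] = []

    def prepare_ci(self) -> None:
        # Stands in for operators, Grafana, pull secrets, model PVCs, image preload.
        self.trace.append("prepare_ci")

    def test_ci(self, flavor: str) -> None:
        # Stands in for reshaping the ISVC YAML, deploying it and waiting for readiness.
        self.trace.append(f"test_ci[{flavor}]")
        self.run_guidellm_benchmark(flavor)

    def run_guidellm_benchmark(self, flavor: str) -> None:
        self.trace.append(f"run_guidellm_benchmark[{flavor}]")

    def capture_llm_inference_service_state(self, flavor: str) -> None:
        # Assumption: state is captured once per flavor.
        self.trace.append(f"capture_llm_inference_service_state[{flavor}]")

    def generate_visualization(self) -> None:
        self.trace.append("generate_visualization")

    def run(self, flavors: List[str]) -> List[str]:
        self.prepare_ci()
        for flavor in flavors:
            self.test_ci(flavor)
            self.capture_llm_inference_service_state(flavor)
        self.generate_visualization()
        return self.trace


if __name__ == "__main__":
    for step in PipelineSketch().run(["simple", "pd"]):
        print(step)
```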

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • PR #20: Updates same projects/core/library/ci.py module for error-summary FAILURE file handling tied to env.ARTIFACT_DIR.
  • PR #17: Prior modifications to projects/core/library/ci.py error-reporting flow and _display_error_summary / _write_error_summary_to_file.
  • PR #6: Adds projects/core/library/env.py which defines ARTIFACT_DIR variable used in this PR's CI error-handling logic.
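
The error-handling behavior these PRs touch (writing the FAILURE file only when an artifact directory is available, and raising `SystemExit(1)` rather than calling `sys.exit(1)`) might look roughly like the sketch below. It is not the code from projects/core/library/ci.py: the helper names `write_failure_marker` and `run_guarded` are hypothetical, and reading `ARTIFACT_DIR` from the process environment is an assumption standing in for the repository's `env.ARTIFACT_DIR`.

```python
import os
from pathlib import Path
from typing import Callable, Optional


def write_failure_marker(message: str) -> Optional[Path]:
    """Write a FAILURE file only when ARTIFACT_DIR is set; return its path."""
    artifact_dir = os.environ.get("ARTIFACT_DIR")  # assumption: plain env var
    if not artifact_dir:
        return None  # nowhere to write, skip silently
    path = Path(artifact_dir) / "FAILURE"
    path.write_text(message + "\n")
    return path


def run_guarded(task: Callable[[], None]) -> None:
    """Run task; on any exception record it and exit with status 1."""
    try:
        task()
    except Exception as exc:
        write_failure_marker(f"{type(exc).__name__}: {exc}")
        # Same exit status as sys.exit(1), expressed as an explicit raise.
        raise SystemExit(1) from exc
```

`sys.exit(1)` itself works by raising `SystemExit(1)`, so the exit status is unchanged; the explicit `raise` makes the control flow visible and allows exception chaining with `from`.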

Poem

🐰 A testing warren grows today,
With grafana dashboards on display,
LLM benchmarks leap and bound,
Inference pipelines all around—
From cluster prep to metrics gleam,
This infrastructure is a dream! 🚀


@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 00 minutes 00 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

• Failure indicator: Empty.
Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 00 minutes 31 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047673414344249344","prowjobid":"b250c4d4-b130-4c29-9c76-e1d95fbdaafc","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"276f23d453ea7df4a3cb7f1193c8608e0cce2f06","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/276f23d453ea7df4a3cb7f1193c8608e0cce2f06","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047673414344249344'
~~       PULL_PULL_SHA: 276f23d453ea7df4a3cb7f1193c8608e0cce2f06
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:276f23d453ea7df4a3cb7f1193c8608e0cce2f06
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047673414344249344
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-134714
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-134714-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-134714 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs
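
When comparing failed runs like this one, it can help to pull the base and pull-request SHAs out of the `PULL_REFS` value shown in the log. The sketch below assumes the `base_ref:base_sha,pull_number:pull_sha` shape seen above; the names `PullRefs` and `parse_pull_refs` are hypothetical and not part of the repository.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PullRefs:
    base_ref: str
    base_sha: str
    pulls: Dict[int, str]  # pull request number -> head SHA


def parse_pull_refs(value: str) -> PullRefs:
    """Parse a 'base_ref:base_sha,number:sha[,number:sha...]' string."""
    base, *pull_items = value.split(",")
    base_ref, base_sha = base.split(":")[:2]
    pulls = {}
    for item in pull_items:
        # Only the first two fields are used; extra fields are ignored.
        number, sha = item.split(":")[:2]
        pulls[int(number)] = sha
    return PullRefs(base_ref, base_sha, pulls)


if __name__ == "__main__":
    refs = parse_pull_refs(
        "main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,"
        "42:276f23d453ea7df4a3cb7f1193c8608e0cce2f06"
    )
    print(refs.base_ref, refs.base_sha[:7], refs.pulls[42][:7])
```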

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 00 minutes 00 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

• Failure indicator: Empty.
Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 00 minutes 37 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047674267566346240","prowjobid":"5fbeafec-cbb4-48fd-a935-04dd40de3638","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"7a0f914235b877544de4cc35111a9f24eaf1b332","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/7a0f914235b877544de4cc35111a9f24eaf1b332","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047674267566346240'
~~       PULL_PULL_SHA: 7a0f914235b877544de4cc35111a9f24eaf1b332
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:7a0f914235b877544de4cc35111a9f24eaf1b332
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047674267566346240
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-135128
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-135128-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-135128 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 00 minutes 00 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

• Failure indicator: Empty.
Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 00 minutes 31 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047675114769616896","prowjobid":"aaaa486d-2c83-483c-8b5a-98c22ab0981a","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"49e980baf8fddf0b3b8f0b4c844c357dcac3a0d3","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/49e980baf8fddf0b3b8f0b4c844c357dcac3a0d3","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047675114769616896'
~~       PULL_PULL_SHA: 49e980baf8fddf0b3b8f0b4c844c357dcac3a0d3
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:49e980baf8fddf0b3b8f0b4c844c357dcac3a0d3
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047675114769616896
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-135311
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-135311-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-135311 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

kpouget force-pushed the llm_d_legacy branch 2 times, most recently from c5124a7 to 770b830 on April 24, 2026 13:56

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 00 minutes 00 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

• Failure indicator: Empty.
Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 00 minutes 32 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047677193068220416","prowjobid":"148c8082-9a67-402c-86d9-ce7eb82cd1d0","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"770b830f5f74b7403db046d810e1cbee980616c5","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/770b830f5f74b7403db046d810e1cbee980616c5","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047677193068220416'
~~       PULL_PULL_SHA: 770b830f5f74b7403db046d810e1cbee980616c5
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:770b830f5f74b7403db046d810e1cbee980616c5
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047677193068220416
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-140128
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-140128-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-140128 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 00 minutes 00 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

• Failure indicator: Empty.
Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 00 minutes 31 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047678073108697088","prowjobid":"7514adba-e941-43ae-85d3-a98e18d1c777","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"71f9482993a0805003b22e8dd239fb47f23c5895","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/71f9482993a0805003b22e8dd239fb47f23c5895","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047678073108697088'
~~       PULL_PULL_SHA: 71f9482993a0805003b22e8dd239fb47f23c5895
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:71f9482993a0805003b22e8dd239fb47f23c5895
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047678073108697088
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-140518
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-140518-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-140518 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
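
The `PULL_REFS` value in the log above appears to follow a `base_ref:base_sha,pull_number:pull_sha` layout. A minimal sketch for splitting it when triaging a run, assuming that layout holds; `parse_pull_refs` is a hypothetical helper, not part of forge or Prow:

```python
def parse_pull_refs(pull_refs: str) -> dict:
    """Split a PULL_REFS string into its base ref and pull entries.

    Assumes the layout seen in the log: 'base_ref:base_sha,number:sha[,number:sha...]'.
    """
    base, *pulls = pull_refs.split(",")
    base_ref, base_sha = base.split(":", 1)
    parsed_pulls = []
    for entry in pulls:
        number, sha = entry.split(":", 1)
        parsed_pulls.append({"number": int(number), "sha": sha})
    return {"base_ref": base_ref, "base_sha": base_sha, "pulls": parsed_pulls}


refs = parse_pull_refs(
    "main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,"
    "42:71f9482993a0805003b22e8dd239fb47f23c5895"
)
print(refs["base_ref"], refs["pulls"][0]["number"], refs["pulls"][0]["sha"][:7])
```

The pull SHA recovered this way should match `PULL_PULL_SHA` for the same run.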


[...]

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 00 minutes 00 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

Failure indicator:

## /tmp/artifacts/FAILURE 
--- 📍NameError STACKTRACE ---
--- 📍name 'logging' is not defined

   Traceback (most recent call last):
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 67, in test
       presets = config.project.get_config("project.args")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
   AttributeError: 'NoneType' object has no attribute 'get_config'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/app/forge/projects/core/library/ci.py", line 100, in wrapper
       exit_code = command_func(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 74, in test
       logging.info("[llm-d] running the Caliper export")
       ^^^^^^^
   NameError: name 'logging' is not defined. Did you forget to import 'logging'

[...]

Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 00 minutes 31 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047680262866735104","prowjobid":"7423149d-e87d-4e3d-9015-94399cb265f5","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"648b23ffb980143f56aff9f438baf165b9f6f9e9","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/648b23ffb980143f56aff9f438baf165b9f6f9e9","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047680262866735104'
~~       PULL_PULL_SHA: 648b23ffb980143f56aff9f438baf165b9f6f9e9
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:648b23ffb980143f56aff9f438baf165b9f6f9e9
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047680262866735104
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-141336
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-141336-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-141336 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 00 minutes 00 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

Failure indicator:

## /tmp/artifacts/FAILURE 
--- 📍AttributeError STACKTRACE ---
--- 📍'NoneType' object has no attribute 'set_config'

   Traceback (most recent call last):
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 68, in test
       config.init()
   TypeError: init() missing 1 required positional argument: 'testing_dir'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/app/forge/projects/core/library/ci.py", line 100, in wrapper
       exit_code = command_func(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 77, in test
       _caliper_export_at_end()
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 23, in _caliper_export_at_end
       config.project.set_config("caliper.export.from", str(root), print=False)
       ^^^^^^^^^^^^^^^^^^^^^^^^^
   AttributeError: 'NoneType' object has no attribute 'set_config'
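
In this run the root cause is the `TypeError` from `config.init()`; the reported `AttributeError` comes from the end-of-test export touching `config.project` before it was ever set. One possible way to keep the root cause visible is to make the export step skip itself when initialisation did not complete. This is only a sketch under that assumption; `FakeConfig`, `caliper_export_at_end`, and `run` are hypothetical and do not mirror the real forge API:

```python
import pathlib


class FakeConfig:
    """Hypothetical stand-in for the forge config module."""

    def __init__(self):
        self.project = None  # stays None until init() succeeds

    def init(self, testing_dir):
        self.project = {"testing_dir": str(testing_dir)}


def caliper_export_at_end(config, root):
    """Record the export root, or skip when the config was never initialised."""
    if config.project is None:
        return False  # nothing to export; let the original error surface
    config.project["caliper.export.from"] = str(root)
    return True


def run(config, *init_args):
    try:
        config.init(*init_args)  # raises TypeError when testing_dir is missing
    finally:
        # Runs on success and on failure, but never raises on a None project.
        exported = caliper_export_at_end(config, pathlib.Path("/tmp/artifacts"))
    return exported


caught = None
try:
    run(FakeConfig())  # no testing_dir, as in the traceback above
except TypeError as exc:
    caught = exc

print(type(caught).__name__, "masked:", caught.__context__)
```

With the guard in place, the failure report would carry the `TypeError` about `testing_dir` rather than the secondary `AttributeError`.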

[...]

Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 07 minutes 32 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047689832586547200","prowjobid":"6cad85bb-8094-4b27-9176-001e1c54221a","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"b10342778824c0dacd42d584980f51fd2cad43c8","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/b10342778824c0dacd42d584980f51fd2cad43c8","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047689832586547200'
~~       PULL_PULL_SHA: b10342778824c0dacd42d584980f51fd2cad43c8
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:b10342778824c0dacd42d584980f51fd2cad43c8
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047689832586547200
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-145154
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-145154-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-145154 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 07 minutes 56 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

Failure indicator:

## /tmp/artifacts/002__plots/FAILURE 
An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log, 3_matbench_visualize.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

## /tmp/artifacts/FAILURE 
--- 📍RuntimeError STACKTRACE ---
--- 📍An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log, 3_matbench_visualize.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

   Traceback (most recent call last):
     File "/app/forge/projects/core/library/ci.py", line 100, in wrapper
       exit_code = command_func(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 85, in test
       failed = test_llmd.test()
                ^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/testing/test_llmd.py", line 181, in test
       raise exc
     File "/app/forge/projects/legacy/library/run.py", line 269, in run_and_catch
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 85, in wrapper
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 555, in generate_from_dir
       generate_visualizations(results_dirname, generate_lts=generate_lts, test_failed=test_failed)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 85, in wrapper
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 104, in generate_visualizations
       generate_visualization(
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 548, in generate_visualization
       raise RuntimeError(msg)
   RuntimeError: An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log, 3_matbench_visualize.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

[...]

Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 08 minutes 26 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047696839729221632","prowjobid":"5624893d-4081-4a07-8ec0-b8b729b3cf33","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"44bfcbb0cab626c03cd24f029844d3d1997165ed","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/44bfcbb0cab626c03cd24f029844d3d1997165ed","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047696839729221632'
~~       PULL_PULL_SHA: 44bfcbb0cab626c03cd24f029844d3d1997165ed
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:44bfcbb0cab626c03cd24f029844d3d1997165ed
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047696839729221632
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-151927
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-151927-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-151927 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 07 minutes 10 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

Failure indicator:

## /tmp/artifacts/002__plots/FAILURE 
An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log, 3_matbench_visualize.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

## /tmp/artifacts/FAILURE 
--- 📍RuntimeError STACKTRACE ---
--- 📍An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log, 3_matbench_visualize.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

   Traceback (most recent call last):
     File "/app/forge/projects/core/library/ci.py", line 100, in wrapper
       exit_code = command_func(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 85, in test
       failed = test_llmd.test()
                ^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/testing/test_llmd.py", line 181, in test
       raise exc
     File "/app/forge/projects/legacy/library/run.py", line 269, in run_and_catch
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 85, in wrapper
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 555, in generate_from_dir
       generate_visualizations(results_dirname, generate_lts=generate_lts, test_failed=test_failed)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 85, in wrapper
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 104, in generate_visualizations
       generate_visualization(
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 548, in generate_visualization
       raise RuntimeError(msg)
   RuntimeError: An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log, 3_matbench_visualize.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

[...]

Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 07 minutes 45 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047699546242289664","prowjobid":"99c8f0b2-7670-4d0f-89a6-d3d6fbe329e5","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","base_link":"https://github.com/openshift-psap/forge/commit/2e20a6a265b879a6b4edbe6c81afe14ca104d9d3","pulls":[{"number":42,"author":"kpouget","sha":"bb8827ddfde71db4369d587fde807cee67162d13","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/bb8827ddfde71db4369d587fde807cee67162d13","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047699546242289664'
~~       PULL_PULL_SHA: bb8827ddfde71db4369d587fde807cee67162d13
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 2e20a6a265b879a6b4edbe6c81afe14ca104d9d3
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:2e20a6a265b879a6b4edbe6c81afe14ca104d9d3,42:bb8827ddfde71db4369d587fde807cee67162d13
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047699546242289664
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-153031
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-153031-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-153031 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

@kpouget kpouget force-pushed the llm_d_legacy branch 3 times, most recently from fbc2a21 to e4793a5 Compare April 24, 2026 16:06
@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🔴 Test of 'llm_d_legacy test' failed after 00 hours 07 minutes 10 seconds 🔴

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.

Failure indicator:

## /tmp/artifacts/002__plots/FAILURE 
An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

## /tmp/artifacts/FAILURE 
--- 📍RuntimeError STACKTRACE ---
--- 📍An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

   Traceback (most recent call last):
     File "/app/forge/projects/core/library/ci.py", line 100, in wrapper
       exit_code = command_func(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/orchestration/ci.py", line 85, in test
       failed = test_llmd.test()
                ^^^^^^^^^^^^^^^^
     File "/app/forge/projects/llm_d_legacy/testing/test_llmd.py", line 181, in test
       raise exc
     File "/app/forge/projects/legacy/library/run.py", line 269, in run_and_catch
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 85, in wrapper
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 555, in generate_from_dir
       generate_visualizations(results_dirname, generate_lts=generate_lts, test_failed=test_failed)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 85, in wrapper
       fct(*args, **kwargs)
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 104, in generate_visualizations
       generate_visualization(
     File "/app/forge/projects/matrix_benchmarking/library/visualize.py", line 548, in generate_visualization
       raise RuntimeError(msg)
   RuntimeError: An error happened during the visualization post-processing ... (2_matbench_generate_lts_schema.log, 2_matbench_generate_lts_schema.log in /tmp/artifacts/002__plots). The test that was processed SUCCEEDED.

[...]

Execution logs

@psap-forge-bot

🔴 Test of 'fournos_launcher submit' failed after 00 hours 09 minutes 01 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Failure indicator:

## /logs/artifacts/FAILURE 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
~~ projects/fournos_launcher/toolbox/submit_and_wait/main.py:169
~~ TASK: wait_for_job_completion: Wait for FOURNOS job to complete
~~ ARTIFACT_DIR: /logs/artifacts/001__submit_and_wait
~~ LOG_FILE: /logs/artifacts/001__submit_and_wait/task.log
~~ ARGS:
~~     cluster_name: athena-fire
~~     project: llm_d_legacy
~~     args:
~~     - psap_h200
~~     - intelligentrouting-flavors
~~     variables_overrides: {}
~~     job_name: ''
~~     namespace: psap-automation-wip
~~     owner: kpouget
~~     display_name: llm_d_legacy psap_h200 intelligentrouting-flavors
~~     pipeline_name: forge-test-only
~~     env:
~~       JOB_TYPE: presubmit
~~       JOB_NAME: pull-ci-openshift-psap-forge-main-fournos
~~       JOB_SPEC: '{"type":"presubmit","job":"pull-ci-openshift-psap-forge-main-fournos","buildid":"2047708748796923904","prowjobid":"c07671d7-2505-489a-80a6-f2567bd2680f","refs":{"org":"openshift-psap","repo":"forge","repo_link":"https://github.com/openshift-psap/forge","base_ref":"main","base_sha":"42ad7b39f9de234ec569dc9583ffc70320bf264f","base_link":"https://github.com/openshift-psap/forge/commit/42ad7b39f9de234ec569dc9583ffc70320bf264f","pulls":[{"number":42,"author":"kpouget","sha":"e4793a59f7e2e43f59e4c556b9e991d59ab66043","title":"[llm-d-legacy]
~~         Import the TOPSAIL legacy LLM-D project for more advance testing","head_ref":"llm_d_legacy","link":"https://github.com/openshift-psap/forge/pull/42","commit_link":"https://github.com/openshift-psap/forge/pull/42/commits/e4793a59f7e2e43f59e4c556b9e991d59ab66043","author_link":"https://github.com/kpouget"}]},"decoration_config":{"timeout":"23h0m0s","grace_period":"15s","utility_images":{"clonerefs":"us-docker.pkg.dev/k8s-infra-prow/images/clonerefs:v20260421-d25a17867","initupload":"us-docker.pkg.dev/k8s-infra-prow/images/initupload:v20260421-d25a17867","entrypoint":"us-docker.pkg.dev/k8s-infra-prow/images/entrypoint:v20260421-d25a17867","sidecar":"us-docker.pkg.dev/k8s-infra-prow/images/sidecar:v20260421-d25a17867"},"resources":{"clonerefs":{"limits":{"memory":"3Gi"},"requests":{"cpu":"100m","memory":"500Mi"}},"initupload":{"limits":{"memory":"200Mi"},"requests":{"cpu":"100m","memory":"50Mi"}},"place_entrypoint":{"limits":{"memory":"100Mi"},"requests":{"cpu":"100m","memory":"25Mi"}},"sidecar":{"limits":{"memory":"2Gi"},"requests":{"cpu":"100m","memory":"250Mi"}}},"gcs_configuration":{"bucket":"test-platform-results","path_strategy":"single","default_org":"openshift","default_repo":"origin","mediaTypes":{"log":"text/plain"},"compress_file_types":["txt","log","json","tar","html","yaml"]},"gcs_credentials_secret":"gce-sa-credentials-gcs-publisher","skip_cloning":true,"censor_secrets":true,"censoring_options":{"minimum_secret_length":6}}}'
~~       OPENSHIFT_CI: 'true'
~~       JOB_NAME_SAFE: fournos
~~       BUILD_ID: '2047708748796923904'
~~       PULL_PULL_SHA: e4793a59f7e2e43f59e4c556b9e991d59ab66043
~~       PULL_NUMBER: '42'
~~       PULL_BASE_REF: main
~~       REPO_NAME: forge
~~       REPO_OWNER: openshift-psap
~~       PULL_BASE_SHA: 42ad7b39f9de234ec569dc9583ffc70320bf264f
~~       PULL_TITLE: '[llm-d-legacy] Import the TOPSAIL legacy LLM-D project for more advance
~~         testing'
~~       PULL_REFS: main:42ad7b39f9de234ec569dc9583ffc70320bf264f,42:e4793a59f7e2e43f59e4c556b9e991d59ab66043
~~       PULL_HEAD_REF: llm_d_legacy
~~     status_dest: /logs/artifacts
~~     ci_label: pr42_b2047708748796923904
~~     artifact_dir: /logs/artifacts/001__submit_and_wait
~~ CONTEXT:
~~     final_job_name: forge-llm-d-legacy-20260424-160651
~~     manifest_file: /logs/artifacts/001__submit_and_wait/src/forge-llm-d-legacy-20260424-160651-manifest.yaml
~~
~~ EXCEPTION: RuntimeError
~~     Job forge-llm-d-legacy-20260424-160651 failed: Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 0
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


[...]

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 24, 2026

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

@psap-forge-bot

🟢 Test of 'llm_d_legacy test' succeeded after 00 hours 07 minutes 12 seconds 🟢

• Link to the test results.

• No reports index generated...

• No test configuration (variable_overrides.yaml) available.
Execution logs

@psap-forge-bot

🟢 Test of 'fournos_launcher submit' succeeded after 00 hours 07 minutes 44 seconds 🟢

• Link to the test results.

• No reports index generated...

Test configuration:

/test fournos llm_d_legacy psap_h200 intelligentrouting-flavors
/cluster athena-fire
/var fournos.namespace: psap-automation-wip

Execution logs

@kpouget
Contributor Author

kpouget commented Apr 27, 2026

I'm merging this to keep progressing with the framework testing.
The test passed with an earlier version.

It will be further tested in upcoming PRs.

The ruff failures are against the legacy code, not the new code, so I'm not fixing them.

@kpouget kpouget merged commit 6005234 into openshift-psap:main Apr 27, 2026
3 of 5 checks passed
@kpouget kpouget deleted the llm_d_legacy branch April 27, 2026 13:05