Generate final command stream and C-models for all Ambiq configs via run_matrix.py by Copilot · Pull Request #6 · AmbiqAI/vela_example_generator

Copilot · 2026-04-07T19:37:21Z

Summary

Implements automated matrix generation of Vela command streams and C-model files across all system configurations and memory modes defined in config/ambiq_final.ini, for all primary models in example_models/.

Changes

New: `run_matrix.py`

Orchestrates run_vela_pipeline.py across the full matrix of:

6 system configs: AmbiqLP_SRAM, AmbiqLP_HBLRAM, AmbiqLP_PSRAM, AmbiqHP_SRAM, AmbiqHP_HBLRAM, AmbiqHP_PSRAM
7 memory modes: Sram_Only, Shared_Sram, Dedicated_Sram, Shared_Sram_256KB, Dedicated_Sram_256KB, Dedicated_Sram_384KB, Dedicated_Sram_512KB
6 models: conlarge_xl, efficientnet_lite0_s8_lg, fc_in__200__o_32_relu, ic, mobilenet_v3_sm_min_s8_md, resnet_v1_8_32_tfs_int8

Key features:

Automatically detects and skips incompatible combinations (Vela rejects const_mem_area=Sram; SRAM sys-configs + Shared/Dedicated_Sram memory modes are invalid — 72 combos skipped out of 252 total)
Parses ambiq_final.ini to resolve inherit chains for memory mode compatibility
Supports --models, --sys-configs, --mem-modes filters and --dry-run
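How the inherit-chain resolution might work can be sketched with the standard-library `configparser`. This is only an illustration and not the code in run_matrix.py: the INI excerpt, its section names and values, and the `resolve_section` helper are all assumptions made for the example rather than content copied from config/ambiq_final.ini.

```python
import configparser

# HYPOTHETICAL excerpt in the style of a Vela config file. Section names,
# keys and values are illustrative and not taken from ambiq_final.ini.
SAMPLE_INI = """
[Memory_Mode.Shared_Sram]
const_mem_area=Axi1
arena_mem_area=Axi0

[Memory_Mode.Dedicated_Sram]
inherit=Memory_Mode.Shared_Sram
cache_mem_area=Axi0

[Memory_Mode.Dedicated_Sram_512KB]
inherit=Memory_Mode.Dedicated_Sram
arena_cache_size=524288
"""


def resolve_section(parser, name, _seen=None):
    """Return the section's keys merged with everything it inherits."""
    seen = _seen if _seen is not None else set()
    if name in seen:
        raise ValueError(f"inherit cycle at {name}")
    seen.add(name)
    section = dict(parser[name])
    parent = section.pop("inherit", None)
    # Resolve the parent first so that child keys can override it.
    merged = resolve_section(parser, parent, seen) if parent else {}
    merged.update(section)
    return merged


parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)
resolved = resolve_section(parser, "Memory_Mode.Dedicated_Sram_512KB")
print(resolved["const_mem_area"])  # Axi1, inherited two levels up
```

With the chain flattened like this, a compatibility check only needs to look at the resolved `const_mem_area` value instead of walking the parents each time.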

Fix: `run_vela_pipeline.py`

Updated the Vela CLI flag from the ambiguous --output to --output-dir for compatibility with Vela 5.0.0.

Generated outputs under `example_models/`

180 output directories were generated (6 models × 30 valid combos each), organized as:

example_models/<model_dir>/<sys_config>_<mem_mode>/
  <model_stem>_<sys_config>_<mem_mode>_cmd_data.h   ← NPU command stream
  <model_stem>_<sys_config>_<mem_mode>_meta.h        ← tensor offsets/sizes
  <model_stem>_<sys_config>_<mem_mode>_buffers.h     ← region buffer API
  <model_stem>_<sys_config>_<mem_mode>_buffers.c
  <model_stem>_<sys_config>_<mem_mode>_run.c         ← minimal Ethos-U runner
  <model_stem>_summary_<sys_config>.csv              ← Vela performance summary
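The counts and the naming scheme above can be reproduced with a short sketch. It is not the run_matrix.py implementation: `is_skipped` is a simplified stand-in (an assumption) for the INI-based compatibility check, and `output_dir` assumes, hypothetically, that `<model_dir>` equals `<model_stem>`.

```python
from itertools import product
from pathlib import PurePosixPath

SYS_CONFIGS = ["AmbiqLP_SRAM", "AmbiqLP_HBLRAM", "AmbiqLP_PSRAM",
               "AmbiqHP_SRAM", "AmbiqHP_HBLRAM", "AmbiqHP_PSRAM"]
MEM_MODES = ["Sram_Only", "Shared_Sram", "Dedicated_Sram",
             "Shared_Sram_256KB", "Dedicated_Sram_256KB",
             "Dedicated_Sram_384KB", "Dedicated_Sram_512KB"]
MODELS = ["conlarge_xl", "efficientnet_lite0_s8_lg", "fc_in__200__o_32_relu",
          "ic", "mobilenet_v3_sm_min_s8_md", "resnet_v1_8_32_tfs_int8"]


def is_skipped(sys_config, mem_mode):
    # ASSUMPTION: an SRAM sys-config is only valid with Sram_Only. The real
    # script derives this from ambiq_final.ini instead of from the names.
    return sys_config.endswith("_SRAM") and mem_mode != "Sram_Only"


def output_dir(model, sys_config, mem_mode):
    # ASSUMPTION: the model directory is named after the model stem.
    return PurePosixPath("example_models") / model / f"{sys_config}_{mem_mode}"


def cmd_stream_header(model, sys_config, mem_mode):
    name = f"{model}_{sys_config}_{mem_mode}_cmd_data.h"
    return output_dir(model, sys_config, mem_mode) / name


all_combos = list(product(MODELS, SYS_CONFIGS, MEM_MODES))
valid = [c for c in all_combos if not is_skipped(c[1], c[2])]
print(len(all_combos), len(all_combos) - len(valid), len(valid))  # 252 72 180
print(cmd_stream_header("ic", "AmbiqHP_PSRAM", "Sram_Only"))
```

Under this rule each model keeps 30 of its 42 combinations (4 non-SRAM sys-configs × 7 modes, plus 2 SRAM sys-configs × 1 mode), which matches the 180 directories reported here.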

`.gitignore`

*_weights.h — excluded (up to 90 MB per config for large models; regenerable via run_matrix.py)
*_vela.npz — excluded (binary Vela intermediate; regenerable via run_matrix.py)

To regenerate the full outputs including weights, run:

python run_matrix.py

…tput files with configs in names Agent-Logs-Url: https://github.com/AmbiqAI/vela_example_generator/sessions/887f5f08-27b4-48bb-ac59-b9b567b8a82a Co-authored-by: mbhssain <214526976+mbhssain@users.noreply.github.com>

mbhssain · 2026-04-07T20:26:46Z

+# Vela intermediate binary outputs (regenerable via run_matrix.py)
+example_models/**/*_vela.npz
+
+# Weight C-array headers for matrix outputs – the raw weight data is large


Track the src/*.txt files and ignore *_weights.h.

mbhssain

Update the PR to track src/*.txt and ignore *_weights.h.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adds a matrix runner to generate Vela command streams and Ethos-U “direct driver” C-model artifacts across all Ambiq system configurations and memory modes, and updates the Vela pipeline invocation for Vela 5.0.0 compatibility.

Changes:

Adds run_matrix.py to orchestrate full config × memory-mode × model generation with filtering and skip logic.
Updates run_vela_pipeline.py to use Vela’s --output-dir flag.
Commits generated headers/C sources and Vela summary CSVs under example_models/, and updates .gitignore to exclude regenerable large intermediates.

Reviewed changes

Copilot reviewed 257 out of 1083 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `example_models/*/*_buffers.c` | Generated region buffers + region accessor functions for Ethos-U invocation |
| `example_models/*/*_buffers.h` | Generated function prototypes for region accessors |
| `example_models/*/*_meta.h` | Generated tensor region/offset/size metadata and region constants |
| `example_models/*/*summary*.csv` | Generated Vela performance summary outputs per configuration |

Copilot · 2026-04-07T21:29:04Z

+ */
+#include <stddef.h>
+#include <stdint.h>
+#include "efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights.h"


The generated *_buffers.c unconditionally includes a *_weights.h header, but the PR description indicates *_weights.h is excluded via .gitignore. In a clean checkout this will cause compilation failures. Consider either (mandatory) committing the weights headers for checked-in outputs, or (preferred) modifying the generator so the checked-in sources do not depend on *_weights.h (e.g., use linker-provided symbols/extern declarations in a separate optional translation unit, or a compile-time feature flag to omit weights references when weights aren’t present).
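One way to catch this before merging is a small check that scans the checked-in sources for quoted includes that have no matching file. The sketch below is a hypothetical helper, not part of this PR; the function name `missing_local_includes` and the demo file names are assumptions for illustration.

```python
import re
import tempfile
from pathlib import Path

# Matches quoted includes only; angle-bracket system headers are ignored.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def missing_local_includes(root):
    """List (source, header) pairs whose quoted include is absent on disk."""
    missing = []
    for src in sorted(Path(root).rglob("*_buffers.c")):
        for header in INCLUDE_RE.findall(src.read_text()):
            if not (src.parent / header).is_file():
                missing.append((src.name, header))
    return missing


# Demo on a throwaway tree that mimics a clean checkout without weights.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "demo_model" / "AmbiqLP_HBLRAM_Shared_Sram"
    cfg.mkdir(parents=True)
    (cfg / "demo_buffers.c").write_text(
        '#include <stdint.h>\n#include "demo_weights.h"\n#include "demo_meta.h"\n'
    )
    (cfg / "demo_meta.h").write_text("/* placeholder */\n")
    report = missing_local_includes(tmp)

print(report)  # [('demo_buffers.c', 'demo_weights.h')]
```

Run against example_models/ in a fresh clone, a non-empty report would indicate that the committed sources cannot compile without regenerating the ignored headers.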

Copilot · 2026-04-07T21:29:05Z

+uint8_t* get_region_base_ptr(int region) {
+    switch(region) {
+    case 1: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1;
+    case 0: return (uint8_t*)efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights; // weights region
+    default: return (uint8_t*)0; // unused region
+    }
+}
+
+size_t get_region_size(int region) {
+    switch(region) {
+    case 1: return sizeof(efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1);
+    case 0: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights_size;
+    default: return 0;
+    }
+}


get_region_base_ptr / get_region_size are defined with generic global symbol names in each generated config directory. Linking more than one generated model/config into the same binary will cause duplicate symbol link errors (or unintended symbol interposition). Make these functions model/config-specific (e.g., prefix with the model+config stem), or expose them as static and provide a unique exported wrapper/struct containing function pointers for each model.
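On the generator side, the prefixing option could look roughly like the following template. This is a sketch under assumptions: `render_region_accessors` is a hypothetical function, not the generator code in this repository, and it only mirrors the shape of the accessors shown in the diff above.

```python
def render_region_accessors(stem, scratch_region=1, weights_region=0):
    """Return C source for region accessors whose names carry the stem."""
    return f"""\
uint8_t* {stem}_get_region_base_ptr(int region) {{
    switch(region) {{
    case {scratch_region}: return {stem}_region_{scratch_region};
    case {weights_region}: return (uint8_t*){stem}_weights; // weights region
    default: return (uint8_t*)0; // unused region
    }}
}}

size_t {stem}_get_region_size(int region) {{
    switch(region) {{
    case {scratch_region}: return sizeof({stem}_region_{scratch_region});
    case {weights_region}: return {stem}_weights_size;
    default: return 0;
    }}
}}
"""


code_hp = render_region_accessors("ic_AmbiqHP_PSRAM_Sram_Only")
code_lp = render_region_accessors("ic_AmbiqLP_PSRAM_Sram_Only")
print(code_hp.splitlines()[0])
```

Because every exported symbol now starts with the model and config stem, two generated directories can be linked into one binary without clashing; callers would need to be updated to use the prefixed names.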

Copilot · 2026-04-07T21:29:05Z

+__attribute__((aligned(32))) static uint8_t efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1[1510640] = {0};
+
+uint8_t* get_region_base_ptr(int region) {
+    switch(region) {
+    case 1: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1;
+    case 0: return (uint8_t*)efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights; // weights region
+    default: return (uint8_t*)0; // unused region
+    }
+}
+
+size_t get_region_size(int region) {
+    switch(region) {
+    case 1: return sizeof(efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1);
+    case 0: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights_size;


The scratch/region buffer size is hard-coded in the C file while the same value is also emitted into the corresponding *_meta.h (..._SCRATCH_SIZE). This creates drift risk if regeneration/generator changes ever diverge. Prefer using the generated macro for the array length (and similarly use generated region-id macros for switch cases) so the buffers layer stays consistent with metadata.

Suggested change

-__attribute__((aligned(32))) static uint8_t efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1[1510640] = {0};
-
-uint8_t* get_region_base_ptr(int region) {
-    switch(region) {
-    case 1: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1;
-    case 0: return (uint8_t*)efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights; // weights region
-    default: return (uint8_t*)0; // unused region
-    }
-}
-
-size_t get_region_size(int region) {
-    switch(region) {
-    case 1: return sizeof(efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1);
-    case 0: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights_size;
+__attribute__((aligned(32))) static uint8_t efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1[efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_SCRATCH_SIZE] = {0};
+
+uint8_t* get_region_base_ptr(int region) {
+    switch(region) {
+    case efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_SCRATCH_REGION: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1;
+    case efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_WEIGHTS_REGION: return (uint8_t*)efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights; // weights region
+    default: return (uint8_t*)0; // unused region
+    }
+}
+
+size_t get_region_size(int region) {
+    switch(region) {
+    case efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_SCRATCH_REGION: return sizeof(efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1);
+    case efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_WEIGHTS_REGION: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights_size;

Copilot · 2026-04-07T21:29:05Z

+// Base-pointer array length for Ethos-U
+#define ETHOSU_MAX_REGIONS 8


ETHOSU_MAX_REGIONS is defined as a generic macro in each generated *_meta.h. If multiple meta headers are included in a single build, this risks macro redefinition warnings/errors and makes it harder to change per-model in the future. Prefer either a project-level common definition, or a model-prefixed macro name (e.g., <MODEL>_<CONFIG>_ETHOSU_MAX_REGIONS), or guard with #ifndef ETHOSU_MAX_REGIONS if a single global value is intended.

Suggested change

-// Base-pointer array length for Ethos-U
-#define ETHOSU_MAX_REGIONS 8
+// Base-pointer array length for Ethos-U
+#ifndef ETHOSU_MAX_REGIONS
+#define ETHOSU_MAX_REGIONS 8
+#endif

Copilot · 2026-04-07T21:29:06Z

+    switch(region) {
+    case 1: return efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_region_1;
+    case 0: return (uint8_t*)efficientnet_lite0_s8_lg_AmbiqLP_HBLRAM_Shared_Sram_weights; // weights region
+    default: return (uint8_t*)0; // unused region


Returning (uint8_t*)0 is less idiomatic and less clear than returning NULL. Consider including <stddef.h> (already present) and returning NULL for unknown regions; this also makes it easier to add diagnostics (assert/log) later if desired.

Suggested change

-    default: return (uint8_t*)0; // unused region
+    default: return NULL; // unused region

Copilot AI assigned Copilot and mbhssain Apr 7, 2026

Copilot created this pull request from a session on behalf of mbhssain April 7, 2026 19:38

Copilot AI requested a review from mbhssain April 7, 2026 19:38

Copilot finished work on behalf of mbhssain April 7, 2026 19:38

mbhssain reviewed Apr 7, 2026

mbhssain requested a review from Copilot April 7, 2026 21:27

Copilot AI reviewed Apr 7, 2026

Copilot wants to merge 1 commit into add_models from copilot/generate-final-command-stream
