feat(core): replace hardcoded worker values with constants #201

Open
deanq wants to merge 9 commits into main from deanq/ae-2158-default-min-max-workers

deanq (Member) commented on Feb 13, 2026

Summary

Replace hardcoded workersMin and workersMax values with DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX constants throughout the codebase for consistency and maintainability.

Changes

  • Constants: Set DEFAULT_WORKERS_MIN=0 and DEFAULT_WORKERS_MAX=1
  • ServerlessResource model: Field defaults now use constants
  • Skeleton templates: All three templates (mothership, CPU worker, GPU worker) use constants
  • Manifest builder: Both auto-provisioned and explicit mothership configs use constants
  • Public API: Constants exposed via runpod_flash package for cleaner imports
  • Lazy loading: Follows existing TYPE_CHECKING pattern for IDE support
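
A minimal sketch of what "field defaults now use constants" could look like. The constant values come from this description; the `ServerlessResourceSketch` dataclass is a hypothetical stand-in, not the real ServerlessResource model, which may be defined differently.

```python
from dataclasses import dataclass

# Values as stated in the PR description.
DEFAULT_WORKERS_MIN = 0  # scale-to-zero by default
DEFAULT_WORKERS_MAX = 1  # conservative upper bound


@dataclass
class ServerlessResourceSketch:
    """Hypothetical stand-in for the real ServerlessResource model."""

    name: str
    # Defaults reference the constants instead of repeating literal 0 and 1.
    workersMin: int = DEFAULT_WORKERS_MIN
    workersMax: int = DEFAULT_WORKERS_MAX


# No explicit scaling config: the constants apply.
default_cfg = ServerlessResourceSketch(name="example")

# Explicit override for an always-available resource.
always_on = ServerlessResourceSketch(name="example", workersMin=1, workersMax=3)
```

Because every default points at the same two names, changing a constant changes the default for each place that references it.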

No Breaking Changes

User worker behavior preserved:

  • Default workersMin=0 enables scale-to-zero for cost optimization
  • Default workersMax=1 provides conservative scaling limit
  • Users can explicitly override for custom scaling strategies

Mothership flexibility:

  • Auto-provisioned mothership uses DEFAULT_WORKERS_MIN (currently 0)
  • Explicit mothership configs can override via constants for rolling releases
  • Full control via constants - change once, applies everywhere

API Improvement

Constants now importable from main package:

```python
# Before
from runpod_flash.core.resources.constants import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN

# After
from runpod_flash import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN
```
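
The shorter import relies on lazy loading through a module-level `__getattr__`. The sketch below shows that general technique with two in-memory modules so it can run on its own; the names `sketch_constants`, `sketch_package`, and `_LAZY_EXPORTS` are assumptions for illustration, and the real `runpod_flash/__init__.py` may be organized differently (it also lists the names under TYPE_CHECKING for IDE support, which is omitted here).

```python
import importlib
import sys
import types

# Stand-in for the constants module (hypothetical name).
constants = types.ModuleType("sketch_constants")
constants.DEFAULT_WORKERS_MIN = 0
constants.DEFAULT_WORKERS_MAX = 1
sys.modules["sketch_constants"] = constants

# Map each public name to the module that actually defines it.
_LAZY_EXPORTS = {
    "DEFAULT_WORKERS_MIN": "sketch_constants",
    "DEFAULT_WORKERS_MAX": "sketch_constants",
}


def _lazy_getattr(name: str):
    """Resolve a public name on first access instead of at import time."""
    module_name = _LAZY_EXPORTS.get(name)
    if module_name is None:
        raise AttributeError(f"module 'sketch_package' has no attribute {name!r}")
    return getattr(importlib.import_module(module_name), name)


# Stand-in for the top-level package (hypothetical name).
package = types.ModuleType("sketch_package")
package.__getattr__ = _lazy_getattr  # module-level __getattr__ hook
package.__all__ = list(_LAZY_EXPORTS)
sys.modules["sketch_package"] = package

# The short import now works without eagerly importing anything up front.
from sketch_package import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN
```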

Design Rationale

Why DEFAULT_WORKERS_MIN=0:

  • Enables scale-to-zero for cost optimization across all resources
  • Users who need always-available resources can set workersMin=1 explicitly
  • Consistent with cloud-native serverless best practices

Why constants for mothership:

  • Provides single point of control for all resource defaults
  • Enables rolling releases by changing constant value
  • Allows infrastructure-wide configuration changes by editing one constant instead of every call site

Test Results

  • 960 tests passing (all tests)
  • 69.24% code coverage (above 65% requirement)
  • 20/20 constants validation tests passed
  • All quality checks passed (formatting, linting, type checking)

Files Modified

  1. src/runpod_flash/__init__.py - Added constants to TYPE_CHECKING, __getattr__, and __all__
  2. src/runpod_flash/core/resources/constants.py - Set DEFAULT_WORKERS_MIN to 0
  3. src/runpod_flash/core/resources/serverless.py - Model defaults use constants
  4. src/runpod_flash/cli/commands/build_utils/manifest.py - Mothership configs use constants
  5. src/runpod_flash/cli/utils/skeleton_template/mothership.py - Template uses constants
  6. src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py - Template uses constants
  7. src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py - Template uses constants
  8. scripts/test-image-constants.py - Updated validation for new behavior
  9. tests/unit/cli/commands/build_utils/test_manifest_mothership.py - Tests use constants

Test Plan

  • Run full test suite (make quality-check)
  • Validate constants usage (python scripts/test-image-constants.py)
  • Verify public API imports work
  • Check all quality gates pass
  • Verify no breaking changes to default behavior
  • Test mothership configuration flexibility

Related

  • Issue: AE-2158
  • Consolidates worker configuration to use centralized constants
  • Improves code maintainability and consistency
  • Enables infrastructure-wide configuration control

Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors hardcoded worker configuration values to use centralized constants (DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX), improving maintainability and consistency across the codebase. The constants are also exposed through the public API for easier access by users.

Changes:

  • Replaced hardcoded workersMin and workersMax values with constants across model defaults and skeleton templates
  • Exposed DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX via the public runpod_flash package using the established lazy-loading pattern
  • Breaking change: workersMin default now 1 instead of 0, affecting scale-to-zero behavior for endpoints created without explicit configuration

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
|------|-------------|
| src/runpod_flash/__init__.py | Added constants to TYPE_CHECKING imports, __getattr__ lazy loading, and __all__ exports |
| src/runpod_flash/core/resources/serverless.py | Replaced hardcoded defaults (0, 1) with constants (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX) |
| src/runpod_flash/cli/utils/skeleton_template/mothership.py | Updated mothership template to use constants instead of hardcoded values |
| src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py | Updated CPU worker template to use constants |
| src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py | Updated GPU worker template to use constants |

Replace hardcoded workersMin/workersMax values with DEFAULT_WORKERS_MIN
and DEFAULT_WORKERS_MAX constants throughout the codebase.

Changes:
- Set DEFAULT_WORKERS_MIN to 0 (enables scale-to-zero for all resources)
- Set DEFAULT_WORKERS_MAX to 1 (conservative default)
- Update ServerlessResource model defaults to use constants
- Update skeleton templates to use constants
- Expose constants in public API via lazy loading
- Follow existing TYPE_CHECKING pattern for IDE support
- Both auto-provisioned and explicit mothership configs use constants
- Allows full control via constants for rolling releases and scaling

No breaking changes:
- User worker default behavior preserved (workersMin=0 for scale-to-zero)
- Mothership behavior controlled via DEFAULT_WORKERS_MIN constant
- Users can override in explicit configs for custom scaling strategies

All tests passing (960 tests, 69.26% coverage)
Constants validation: 20/20 tests passed
deanq force-pushed the deanq/ae-2158-default-min-max-workers branch from fb3c8dc to 8c62987 on February 13, 2026 23:54
deanq and others added 8 commits February 13, 2026 20:13
* feat(runtime): add API key management and conditional manifest sync

Optimizes cross-endpoint communication by skipping State Manager queries
for local-only endpoints and injecting API keys only when needed.

- ServiceRegistry checks makes_remote_calls to skip unnecessary queries
- ServerlessResource injects RUNPOD_API_KEY for QB endpoints at deploy time
- Added comprehensive API key management documentation

* fix(serverless): normalize resource names for manifest lookup

Strip -fb suffix and live- prefix from resource names when looking up
configuration in manifest to ensure resources with these naming patterns
are properly matched.
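
A minimal sketch of the normalization described in this commit message, assuming it is plain prefix and suffix stripping. The function name `normalize_resource_name` is hypothetical; the real helper in the codebase may have a different name and handle more cases.

```python
def normalize_resource_name(name: str) -> str:
    """Strip the 'live-' prefix and '-fb' suffix before a manifest lookup."""
    # str.removeprefix / str.removesuffix require Python 3.9+.
    return name.removeprefix("live-").removesuffix("-fb")


# Hypothetical manifest keyed by the plain resource name.
manifest = {"gpu-worker": {"workersMax": 1}}
config = manifest.get(normalize_resource_name("live-gpu-worker-fb"))
```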

* docs: add comprehensive CLI documentation for deploy, env, and app commands

Add three new documentation files following established patterns:

- flash-deploy.md: Build and deploy workflow, environment resolution,
  post-deployment guidance, preview mode, and troubleshooting
- flash-env.md: Environment management (list/create/get/delete),
  lifecycle, common workflows, and best practices
- flash-app.md: App management (list/create/get/delete), hierarchy,
  organization strategies, and relationship to environments

All docs include:
- Usage syntax and options
- Real command-line examples with output
- Conceptual explanations
- Troubleshooting sections
- Cross-references to related commands

Total: 47KB of new user-facing documentation

* docs: update CLI README with deploy, env, and app command sections

Add comprehensive sections for previously undocumented commands:

- flash deploy: Build and deploy in one step, with all options and examples
- flash env: Environment management subcommands (list/create/get/delete)
- flash app: App management subcommands (list/create/get/delete)

Each section includes:
- Usage syntax
- Key options
- Practical examples
- Link to full documentation page

Commands now follow logical workflow order:
init → run → build → deploy → env → app → undeploy

* docs: enhance main README with comprehensive CLI Reference section

Add prominent CLI Reference section before "Key concepts" with:

- Overview of all main commands (init, run, build, deploy)
- Management commands (env, app, undeploy)
- Quick examples showing common workflows
- Links to complete CLI documentation
- Individual command reference links

Also enhanced existing sections:
- Added CLI documentation link in "Create Flash API endpoints"
- Added flash run documentation link in "Step 5"
- Updated Table of Contents to include CLI Reference

Makes CLI capabilities more discoverable for new users while
providing clear path to comprehensive documentation.

* docs: update cross-references between CLI and architectural docs

flash-build.md:
- Updated "Next Steps" with links to deploy and env commands
- Enhanced "Related Commands" with bidirectional links to new docs

Flash_Deploy_Guide.md:
- Added prominent note at top distinguishing architectural guide
  from user-facing CLI documentation
- Links to flash deploy, flash env, and complete CLI docs

Ensures users can easily navigate between:
- Architectural implementation details (Deploy Guide)
- User-facing command references (CLI docs)
- Related commands within CLI documentation

* docs: update CLAUDE.md with CLI documentation update context

Document the completed work on this branch:
- Purpose: Update CLI documentation for new commands
- Status: All documentation created and cross-referenced
- Files added: flash-deploy.md, flash-env.md, flash-app.md
- Files updated: CLI README, main README, cross-references

Provides context for future work and Claude Code assistance.

* docs: fix PR #195 review issues

Addresses critical and important issues from PR review:

Critical Fixes:
- Remove CLAUDE.md template file from PR (worktree-specific, not for main)
- Replace "comprehensive" with "complete" (CLAUDE.md style compliance)
- Remove emoji from CLI documentation link

Important Fixes:
- Remove duplicated "Environment:" label in flash-env.md output examples
- Remove redundant example block in flash-app.md (flash app list)
- Remove "Examples with Real Output" sections (duplicated content)
- Remove incorrect LiveServerless SDK example (no 'url' parameter)
- Add note explaining why flash app delete requires --app flag

These changes improve documentation quality, reduce maintenance burden,
and ensure compliance with project style guidelines.

* RunPod -> Runpod

* Expand overviews for CLI commands

* Explain the difference between flash run and flash deploy

* docs: remove incorrect API key management docs

- Delete API_Key_Management.md (described buggy implementation)
- Clarify RUNPOD_API_KEY usage in LoadBalancer_Runtime_Architecture.md

The deleted doc incorrectly stated only QB endpoints get API key
injection. Per PRD.md, both LB and QB endpoints should receive
RUNPOD_API_KEY env var when they make remote calls. Current
implementation has a bug in serverless.py that needs fixing.

---------

Co-authored-by: Mo King <muhsinking@gmail.com>
* feat: add User-Agent header to all HTTP requests (AE-2106)

Add User-Agent header to all HTTP transmissions for better API analytics,
debugging, and compliance with HTTP best practices.

Changes:
- Add user_agent.py module with get_user_agent() function
- Update centralized HTTP utilities (httpx and requests)
- Update direct aiohttp usage in RunPod API clients
- Update direct requests usage in app.py for uploads/downloads
- Add comprehensive test coverage for User-Agent functionality

User-Agent format: Runpod Flash/<version> (Python <py_version>; <OS>)
Example: Runpod Flash/1.1.1 (Python 3.11.12; Darwin)

* feat: add OS version and CPU architecture to User-Agent

Enhance User-Agent header to include:
- OS version (e.g., Darwin 25.2.0)
- CPU architecture (e.g., arm64, x86_64)

New format: Runpod Flash/<version> (Python <py_version>; <OS> <OS_version>; <arch>)
Example: Runpod Flash/1.1.1 (Python 3.11.12; Darwin 25.2.0; arm64)

This provides better insights into deployment environments for debugging
and analytics purposes.
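
A sketch of how a string in the extended format could be assembled with the standard `platform` module. The commit names a `get_user_agent()` function but does not show its signature, so the `version` parameter here is an assumption, as is the choice of `platform` calls.

```python
import platform


def get_user_agent(version: str) -> str:
    """Build 'Runpod Flash/<version> (Python <py>; <OS> <OS_version>; <arch>)'."""
    python_version = platform.python_version()  # e.g. 3.11.12
    os_name = platform.system()                 # e.g. Darwin, Linux
    os_version = platform.release()             # e.g. 25.2.0
    arch = platform.machine()                   # e.g. arm64, x86_64
    return (
        f"Runpod Flash/{version} "
        f"(Python {python_version}; {os_name} {os_version}; {arch})"
    )


# Typical use: attach the value to outgoing request headers.
headers = {"User-Agent": get_user_agent("1.1.1")}
```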
* feat: cleanup flash deploy/undeploy/build command output format

* fix: cleanup
* refactor: remove noisy @remote decorator debug logs

Remove debug logs from client.py that fire on every @remote function/class:
- RUNPOD_ENDPOINT_ID/FLASH_RESOURCE_NAME environment check logs
- Local dev mode stub creation logs
- is_local_function result logs
- Original function return logs
- Remote execution wrapper creation logs

Also remove unused flash_resource_name variable that was only used in
the removed debug log.

These logs provide no actionable information during normal development
and create substantial noise (5-10 lines per decorated item).

* refactor: remove class serialization debug logs from execute_class.py

Remove 5 debug log lines that fire for every @remote class:
- Cached class data log (line 60)
- Retrieved cached class data log (lines 84-86)
- Successfully extracted class code log (line 125)
- Generated cache key log (line 185)
- Created remote class wrapper log (line 232)

These logs fire 5-10 times per run and only matter when debugging
class serialization issues, not during normal development.

* refactor: remove per-resource discovery debug logs

Remove 2 debug log lines that fire for every decorated resource:
- Entry point resource discovery log (lines 55-57)
- Project directory resource discovery log (lines 408-410)

These logs fire 10-20 times per run. The INFO-level summary logs
already show total resource counts, making per-resource debug logs
redundant.

* refactor: remove duplicate parse failure debug logs from scanner.py

Remove 6 duplicate debug log lines across three scanning passes:
- First pass (resource configs): lines 77, 81
- Second pass (@remote functions): lines 90, 94
- Third pass (function calls): lines 118, 122

These logs fire 3× per Python file during scanning (150+ logs for 50 files).
Parse failures in dependencies are expected and not actionable.

Keep SyntaxError warnings as they indicate actual issues.

* refactor: remove per-request load balancer debug logs

Remove 3 debug log lines that fire on every request:
- ROUND_ROBIN selection log (lines 88-91)
- LEAST_CONNECTIONS selection log (lines 112-115)
- RANDOM selection log (line 128)

These logs fire on EVERY request (100+ times per second in production)
and would flood production systems with no actionable value.

* refactor: comment out verbose API debug logs in runpod.py

Comment out (not remove) 8 debug log lines in API methods:

GraphQL (_execute_graphql):
- GraphQL Query log
- GraphQL Variables log
- GraphQL Response Status log
- GraphQL Response log

REST (_execute_rest):
- REST Request log
- REST Data log
- REST Response Status log
- REST Response log

These logs dump multi-KB JSON responses on every API call (10-50× per
deploy operation). Commenting out preserves them for future debugging
while silencing them during normal development.

Add noqa comment to json import since it's only used in commented code.

* refactor: remove verbose debug/info logs from resource_manager.py

Removed 6 noisy logs that fire per-resource operation:
- get_or_deploy_resource called with config dump
- DRIFT DEBUG with existing/new config fields
- Resource found in cache (per-lookup)
- exists, reusing (per-reuse)
- Resource NOT found in cache (per-deployment)
- Config drift detected (redundant with warning log)

* refactor: simplify DEBUG log format to remove logger name and file location

Removed %(name)s | %(filename)s:%(lineno)d from DEBUG format.
DEBUG and INFO now use the same clean format: timestamp | level | message

Updated test to match new behavior.
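
A sketch of the simplified format using the standard `logging` module. The commit only states the layout "timestamp | level | message", so the exact format string below is an assumption.

```python
import logging

# Assumed format string; logger name and file:line are no longer included.
LOG_FORMAT = "%(asctime)s | %(levelname)s | %(message)s"

formatter = logging.Formatter(LOG_FORMAT)

# Build a record by hand to show what a DEBUG line looks like.
record = logging.LogRecord(
    name="runpod_flash.sketch",
    level=logging.DEBUG,
    pathname="example.py",
    lineno=1,
    msg="hello",
    args=(),
    exc_info=None,
)
line = formatter.format(record)
```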

* refactor: remove structural change debug logs from serverless.py

Removed 3 noisy logs:
- Version-triggering changes detected (INFO)
- Structural change in field (DEBUG, 2 occurrences)

These logs fire during endpoint updates and provide no actionable value.

* refactor: silence httpcore/httpx trace logs

Set httpcore and httpx loggers to WARNING level to suppress
verbose connection/request trace logs that appear in DEBUG mode:
- connect_tcp.started/complete
- start_tls.started/complete
- send_request_headers/body
- receive_response_headers

These low-level HTTP transport logs provide no actionable value
during normal development.
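
Raising the level of third-party loggers is done with standard `logging` calls, sketched below. Where the project applies this is not shown in the commit, so treat the placement as an assumption.

```python
import logging

# Suppress DEBUG/INFO transport traces from the HTTP client libraries.
for noisy_logger in ("httpcore", "httpx"):
    logging.getLogger(noisy_logger).setLevel(logging.WARNING)
```

Warnings and errors from these libraries still propagate; only the lower-level trace output is dropped.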

* fix: prevent false redaction of Job IDs and Template IDs

Replaced overly broad TOKEN_PATTERN with PREFIXED_KEY_PATTERN that only
redacts tokens with known sensitive prefixes (sk-, key_, api_).

This fixes false positives where Job IDs, Worker IDs, and Template IDs
were being redacted even though they're not sensitive.

Updated test to use prefixed token instead of generic long token.
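
A sketch of prefix-based redaction. The name `PREFIXED_KEY_PATTERN` and the three prefixes come from the commit message; the regular expression itself, the minimum length of 8, and the `redact` helper are assumptions for illustration.

```python
import re

# Only strings that start with a known sensitive prefix are treated as secrets.
PREFIXED_KEY_PATTERN = re.compile(r"\b(?:sk-|key_|api_)[A-Za-z0-9_\-]{8,}")


def redact(text: str) -> str:
    """Replace prefixed keys with a placeholder, leaving plain IDs untouched."""
    return PREFIXED_KEY_PATTERN.sub("[REDACTED]", text)


# The prefixed token is redacted; the long job ID is not.
message = redact("token sk-abcdef123456 job 4f8a9c2e1b7d6a5f")
```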

* refactor: remove verbose debug logs from build and API operations

Removed repetitive and overly-verbose debug logs:
- ignore.py: Remove per-file "Ignoring:" logs (pattern summary sufficient)
- app.py: Remove "already hydrated" debug log
- runpod.py: Remove logs that print full input_data/variables
  (finalizing upload, fetching environment, deploying environment)
- runpod.py: Change template update logs from info to debug
- serverless.py: Change template update log from info to debug

These logs added noise without value. Pattern summaries and
operation names provide sufficient context.

* refactor: silence file lock and asyncio debug logs

Removed verbose file locking and resource persistence logs:
- file_lock.py: Remove "File lock acquired" and "File lock released"
- resource_manager.py: Remove "Saved resources in .runpod/resources.pkl"
- logger.py: Silence asyncio logger (prevents "Using selector: KqueueSelector")

These operational details added noise without debugging value.

* refactor: remove app hydration debug logs

Removed:
- runpod.py: "Fetching flash app by name for input:"
- app.py: "Hydrating app"

These operation-level logs add noise without debugging value.