feat(core): replace hardcoded worker values with constants by deanq · Pull Request #201 · runpod/flash

deanq · 2026-02-13T22:23:24Z

Summary

Replace hardcoded workersMin and workersMax values with DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX constants throughout the codebase for consistency and maintainability.

Changes

Constants: Set DEFAULT_WORKERS_MIN=0 and DEFAULT_WORKERS_MAX=1
ServerlessResource model: Field defaults now use constants
Skeleton templates: All three templates (mothership, CPU worker, GPU worker) use constants
Manifest builder: Both auto-provisioned and explicit mothership configs use constants
Public API: Constants exposed via runpod_flash package for cleaner imports
Lazy loading: Follows existing TYPE_CHECKING pattern for IDE support

No Breaking Changes

User worker behavior preserved:

Default workersMin=0 enables scale-to-zero for cost optimization
Default workersMax=1 provides conservative scaling limit
Users can explicitly override for custom scaling strategies
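A minimal sketch of how model defaults can come from the constants while staying overridable per resource. `ResourceSketch` is a hypothetical stdlib dataclass standing in for `ServerlessResource`, whose real definition is not shown in this description; the constant values are the ones stated under Changes.

```python
from dataclasses import dataclass

# Values as stated in this PR description.
DEFAULT_WORKERS_MIN = 0
DEFAULT_WORKERS_MAX = 1


@dataclass
class ResourceSketch:
    """Hypothetical stand-in for ServerlessResource; only the worker fields."""

    name: str
    workersMin: int = DEFAULT_WORKERS_MIN  # scale-to-zero unless overridden
    workersMax: int = DEFAULT_WORKERS_MAX  # conservative ceiling unless overridden


# No explicit values: the constants apply.
default_resource = ResourceSketch(name="example")

# Explicit values: the caller opts out of scale-to-zero.
always_on = ResourceSketch(name="example-warm", workersMin=1, workersMax=3)
```

Under these assumptions, a resource created without explicit values keeps the scale-to-zero default, and an explicit `workersMin=1` keeps one worker available.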

Mothership flexibility:

Auto-provisioned mothership uses DEFAULT_WORKERS_MIN (currently 0)
Explicit mothership configs can override via constants for rolling releases
Full control via constants - change once, applies everywhere

API Improvement

Constants now importable from main package:

# Before
from runpod_flash.core.resources.constants import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN

# After
from runpod_flash import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN
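The lazy export can be sketched with a module-level `__getattr__` (PEP 562). This is an illustration only: `demo_pkg` and `demo_pkg_constants` are hypothetical in-memory modules standing in for `runpod_flash` and its constants module, and the actual `__init__.py` is not reproduced here. In a real package the same names would typically also be imported under `if TYPE_CHECKING:` so that IDEs and type checkers can resolve them.

```python
import importlib
import sys
import types

# Hypothetical stand-in for the constants module (values from this PR description).
_constants = types.ModuleType("demo_pkg_constants")
_constants.DEFAULT_WORKERS_MIN = 0
_constants.DEFAULT_WORKERS_MAX = 1
sys.modules["demo_pkg_constants"] = _constants

# Exported name -> module that defines it.
_LAZY_EXPORTS = {
    "DEFAULT_WORKERS_MIN": "demo_pkg_constants",
    "DEFAULT_WORKERS_MAX": "demo_pkg_constants",
}


def _lazy_getattr(name: str):
    """Resolve an export on first access instead of at package import time."""
    module_name = _LAZY_EXPORTS.get(name)
    if module_name is None:
        raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")
    return getattr(importlib.import_module(module_name), name)


# Hypothetical stand-in for the package itself.
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__getattr__ = _lazy_getattr  # PEP 562 hook, stored in the module dict
demo_pkg.__all__ = list(_LAZY_EXPORTS)
sys.modules["demo_pkg"] = demo_pkg

# The short import path works, and the constants module is only consulted here.
from demo_pkg import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN
```

The design choice is that importing the package stays cheap: the defining module is only touched when one of the exported names is first requested.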

Design Rationale

Why DEFAULT_WORKERS_MIN=0:

Enables scale-to-zero for cost optimization across all resources
Users who need always-available resources can set workersMin=1 explicitly
Consistent with cloud-native serverless best practices

Why constants for mothership:

Provides single point of control for all resource defaults
Enables rolling releases by changing constant value
Allows infrastructure-wide configuration changes without editing each call site

Test Results

✅ 960 tests passing (all tests)
✅ 69.24% code coverage (above 65% requirement)
✅ 20/20 constants validation tests passed
✅ All quality checks passed (formatting, linting, type checking)

Files Modified

src/runpod_flash/__init__.py - Added constants to TYPE_CHECKING, __getattr__, and __all__
src/runpod_flash/core/resources/constants.py - Set DEFAULT_WORKERS_MIN to 0
src/runpod_flash/core/resources/serverless.py - Model defaults use constants
src/runpod_flash/cli/commands/build_utils/manifest.py - Mothership configs use constants
src/runpod_flash/cli/utils/skeleton_template/mothership.py - Template uses constants
src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py - Template uses constants
src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py - Template uses constants
scripts/test-image-constants.py - Updated validation for new behavior
tests/unit/cli/commands/build_utils/test_manifest_mothership.py - Tests use constants

Test Plan

Run full test suite (make quality-check)
Validate constants usage (python scripts/test-image-constants.py)
Verify public API imports work
Check all quality gates pass
Verify no breaking changes to default behavior
Test mothership configuration flexibility

Pull request overview

This PR refactors hardcoded worker configuration values to use centralized constants (DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX), improving maintainability and consistency across the codebase. The constants are also exposed through the public API for easier access by users.

Changes:

Replaced hardcoded workersMin and workersMax values with constants across model defaults and skeleton templates
Exposed DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX via the public runpod_flash package using the established lazy-loading pattern
Breaking change: workersMin default now 1 instead of 0, affecting scale-to-zero behavior for endpoints created without explicit configuration

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/runpod_flash/__init__.py` | Added constants to TYPE_CHECKING imports, `__getattr__` lazy loading, and `__all__` exports |
| `src/runpod_flash/core/resources/serverless.py` | Replaced hardcoded defaults (0, 1) with constants (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX) |
| `src/runpod_flash/cli/utils/skeleton_template/mothership.py` | Updated mothership template to use constants instead of hardcoded values |
| `src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py` | Updated CPU worker template to use constants |
| `src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py` | Updated GPU worker template to use constants |

Replace hardcoded workersMin/workersMax values with DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX constants throughout the codebase.

Changes:
- Set DEFAULT_WORKERS_MIN to 0 (enables scale-to-zero for all resources)
- Set DEFAULT_WORKERS_MAX to 1 (conservative default)
- Update ServerlessResource model defaults to use constants
- Update skeleton templates to use constants
- Expose constants in public API via lazy loading
- Follow existing TYPE_CHECKING pattern for IDE support
- Both auto-provisioned and explicit mothership configs use constants
- Allows full control via constants for rolling releases and scaling

No breaking changes:
- User worker default behavior preserved (workersMin=0 for scale-to-zero)
- Mothership behavior controlled via DEFAULT_WORKERS_MIN constant
- Users can override in explicit configs for custom scaling strategies

All tests passing (960 tests, 69.26% coverage)
Constants validation: 20/20 tests passed

* feat(runtime): add API key management and conditional manifest sync
Optimizes cross-endpoint communication by skipping State Manager queries for local-only endpoints and injecting API keys only when needed.
- ServiceRegistry checks makes_remote_calls to skip unnecessary queries
- ServerlessResource injects RUNPOD_API_KEY for QB endpoints at deploy time
- Added comprehensive API key management documentation

* fix(serverless): normalize resource names for manifest lookup
Strip -fb suffix and live- prefix from resource names when looking up configuration in manifest to ensure resources with these naming patterns are properly matched.

* docs: add comprehensive CLI documentation for deploy, env, and app commands
Add three new documentation files following established patterns:
- flash-deploy.md: Build and deploy workflow, environment resolution, post-deployment guidance, preview mode, and troubleshooting
- flash-env.md: Environment management (list/create/get/delete), lifecycle, common workflows, and best practices
- flash-app.md: App management (list/create/get/delete), hierarchy, organization strategies, and relationship to environments
All docs include:
- Usage syntax and options
- Real command-line examples with output
- Conceptual explanations
- Troubleshooting sections
- Cross-references to related commands
Total: 47KB of new user-facing documentation

* docs: update CLI README with deploy, env, and app command sections
Add comprehensive sections for previously undocumented commands:
- flash deploy: Build and deploy in one step, with all options and examples
- flash env: Environment management subcommands (list/create/get/delete)
- flash app: App management subcommands (list/create/get/delete)
Each section includes:
- Usage syntax
- Key options
- Practical examples
- Link to full documentation page
Commands now follow logical workflow order: init → run → build → deploy → env → app → undeploy

* docs: enhance main README with comprehensive CLI Reference section
Add prominent CLI Reference section before "Key concepts" with:
- Overview of all main commands (init, run, build, deploy)
- Management commands (env, app, undeploy)
- Quick examples showing common workflows
- Links to complete CLI documentation
- Individual command reference links
Also enhanced existing sections:
- Added CLI documentation link in "Create Flash API endpoints"
- Added flash run documentation link in "Step 5"
- Updated Table of Contents to include CLI Reference
Makes CLI capabilities more discoverable for new users while providing clear path to comprehensive documentation.

* docs: update cross-references between CLI and architectural docs
flash-build.md:
- Updated "Next Steps" with links to deploy and env commands
- Enhanced "Related Commands" with bidirectional links to new docs
Flash_Deploy_Guide.md:
- Added prominent note at top distinguishing architectural guide from user-facing CLI documentation
- Links to flash deploy, flash env, and complete CLI docs
Ensures users can easily navigate between:
- Architectural implementation details (Deploy Guide)
- User-facing command references (CLI docs)
- Related commands within CLI documentation

* docs: update CLAUDE.md with CLI documentation update context
Document the completed work on this branch:
- Purpose: Update CLI documentation for new commands
- Status: All documentation created and cross-referenced
- Files added: flash-deploy.md, flash-env.md, flash-app.md
- Files updated: CLI README, main README, cross-references
Provides context for future work and Claude Code assistance.

* docs: fix PR #195 review issues
Addresses critical and important issues from PR review:
Critical Fixes:
- Remove CLAUDE.md template file from PR (worktree-specific, not for main)
- Replace "comprehensive" with "complete" (CLAUDE.md style compliance)
- Remove emoji from CLI documentation link
Important Fixes:
- Remove duplicated "Environment:" label in flash-env.md output examples
- Remove redundant example block in flash-app.md (flash app list)
- Remove "Examples with Real Output" sections (duplicated content)
- Remove incorrect LiveServerless SDK example (no 'url' parameter)
- Add note explaining why flash app delete requires --app flag
These changes improve documentation quality, reduce maintenance burden, and ensure compliance with project style guidelines.

* RunPod -> Runpod

* Expand overviews for CLI commands

* Explain the difference between flash run and flash deploy

* docs: remove incorrect API key management docs
- Delete API_Key_Management.md (described buggy implementation)
- Clarify RUNPOD_API_KEY usage in LoadBalancer_Runtime_Architecture.md
The deleted doc incorrectly stated only QB endpoints get API key injection. Per PRD.md, both LB and QB endpoints should receive RUNPOD_API_KEY env var when they make remote calls. Current implementation has a bug in serverless.py that needs fixing.

---------

Co-authored-by: Mo King <muhsinking@gmail.com>
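The resource-name normalization described in the `fix(serverless)` commit above could look roughly like this. `normalize_resource_name` is a hypothetical helper, not the function from `serverless.py`, and the order in which the prefix and suffix are stripped is an assumption.

```python
def normalize_resource_name(name: str) -> str:
    """Strip a leading 'live-' and a trailing '-fb' before a manifest lookup.

    Hypothetical helper; requires Python 3.9+ for removeprefix/removesuffix.
    """
    without_prefix = name.removeprefix("live-")
    return without_prefix.removesuffix("-fb")


# Names without either marker pass through unchanged.
examples = [normalize_resource_name(n) for n in ("live-worker-fb", "worker-fb", "worker")]
```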

* feat: add User-Agent header to all HTTP requests (AE-2106)
Add User-Agent header to all HTTP transmissions for better API analytics, debugging, and compliance with HTTP best practices.
Changes:
- Add user_agent.py module with get_user_agent() function
- Update centralized HTTP utilities (httpx and requests)
- Update direct aiohttp usage in RunPod API clients
- Update direct requests usage in app.py for uploads/downloads
- Add comprehensive test coverage for User-Agent functionality
User-Agent format: Runpod Flash/<version> (Python <py_version>; <OS>)
Example: Runpod Flash/1.1.1 (Python 3.11.12; Darwin)

* feat: add OS version and CPU architecture to User-Agent
Enhance User-Agent header to include:
- OS version (e.g., Darwin 25.2.0)
- CPU architecture (e.g., arm64, x86_64)
New format: Runpod Flash/<version> (Python <py_version>; <OS> <OS_version>; <arch>)
Example: Runpod Flash/1.1.1 (Python 3.11.12; Darwin 25.2.0; arm64)
This provides better insights into deployment environments for debugging and analytics purposes.
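A sketch of how a string in the newer User-Agent format might be assembled from the standard library. `build_user_agent` is a hypothetical function that takes the package version as an argument; it is not the `get_user_agent()` from `user_agent.py`, and the choice of `platform` calls is an assumption based on the example values in the commit message.

```python
import platform


def build_user_agent(version: str) -> str:
    """Return 'Runpod Flash/<version> (Python <py_version>; <OS> <OS_version>; <arch>)'."""
    python_version = platform.python_version()  # e.g. "3.11.12"
    os_name = platform.system()                 # e.g. "Darwin" or "Linux"
    os_version = platform.release()             # kernel/OS release string
    arch = platform.machine()                   # e.g. "arm64" or "x86_64"
    return f"Runpod Flash/{version} (Python {python_version}; {os_name} {os_version}; {arch})"


user_agent = build_user_agent("1.1.1")
headers = {"User-Agent": user_agent}  # could be passed to an HTTP client
```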

* feat: cleanup flash deploy/undeploy/build command output format
* fix: cleanup

…b.com/runpod/flash into deanq/ae-2158-default-min-max-workers

* refactor: remove noisy @Remote decorator debug logs
Remove debug logs from client.py that fire on every @Remote function/class:
- RUNPOD_ENDPOINT_ID/FLASH_RESOURCE_NAME environment check logs
- Local dev mode stub creation logs
- is_local_function result logs
- Original function return logs
- Remote execution wrapper creation logs
Also remove unused flash_resource_name variable that was only used in the removed debug log.
These logs provide no actionable information during normal development and create substantial noise (5-10 lines per decorated item).

* refactor: remove class serialization debug logs from execute_class.py
Remove 5 debug log lines that fire for every @Remote class:
- Cached class data log (line 60)
- Retrieved cached class data log (lines 84-86)
- Successfully extracted class code log (line 125)
- Generated cache key log (line 185)
- Created remote class wrapper log (line 232)
These logs fire 5-10 times per run and only matter when debugging class serialization issues, not during normal development.

* refactor: remove per-resource discovery debug logs
Remove 2 debug log lines that fire for every decorated resource:
- Entry point resource discovery log (lines 55-57)
- Project directory resource discovery log (lines 408-410)
These logs fire 10-20 times per run. The INFO-level summary logs already show total resource counts, making per-resource debug logs redundant.

* refactor: remove duplicate parse failure debug logs from scanner.py
Remove 6 duplicate debug log lines across three scanning passes:
- First pass (resource configs): lines 77, 81
- Second pass (@Remote functions): lines 90, 94
- Third pass (function calls): lines 118, 122
These logs fire 3× per Python file during scanning (150+ logs for 50 files). Parse failures in dependencies are expected and not actionable. Keep SyntaxError warnings as they indicate actual issues.

* refactor: remove per-request load balancer debug logs
Remove 3 debug log lines that fire on every request:
- ROUND_ROBIN selection log (lines 88-91)
- LEAST_CONNECTIONS selection log (lines 112-115)
- RANDOM selection log (line 128)
These logs fire on EVERY request (100+ times per second in production) and would flood production systems with no actionable value.

* refactor: comment out verbose API debug logs in runpod.py
Comment out (not remove) 8 debug log lines in API methods:
GraphQL (_execute_graphql):
- GraphQL Query log
- GraphQL Variables log
- GraphQL Response Status log
- GraphQL Response log
REST (_execute_rest):
- REST Request log
- REST Data log
- REST Response Status log
- REST Response log
These logs dump multi-KB JSON responses on every API call (10-50× per deploy operation). Commenting out preserves them for future debugging while silencing them during normal development. Add noqa comment to json import since it's only used in commented code.

* refactor: remove verbose debug/info logs from resource_manager.py
Removed 6 noisy logs that fire per-resource operation:
- get_or_deploy_resource called with config dump
- DRIFT DEBUG with existing/new config fields
- Resource found in cache (per-lookup)
- exists, reusing (per-reuse)
- Resource NOT found in cache (per-deployment)
- Config drift detected (redundant with warning log)

* refactor: simplify DEBUG log format to remove logger name and file location
Removed %(name)s | %(filename)s:%(lineno)d from DEBUG format. DEBUG and INFO now use the same clean format: timestamp | level | message
Updated test to match new behavior.

* refactor: remove structural change debug logs from serverless.py
Removed 3 noisy logs:
- Version-triggering changes detected (INFO)
- Structural change in field (DEBUG, 2 occurrences)
These logs fire during endpoint updates and provide no actionable value.

* refactor: silence httpcore/httpx trace logs
Set httpcore and httpx loggers to WARNING level to suppress verbose connection/request trace logs that appear in DEBUG mode:
- connect_tcp.started/complete
- start_tls.started/complete
- send_request_headers/body
- receive_response_headers
These low-level HTTP transport logs provide no actionable value during normal development.

* fix: prevent false redaction of Job IDs and Template IDs
Replaced overly broad TOKEN_PATTERN with PREFIXED_KEY_PATTERN that only redacts tokens with known sensitive prefixes (sk-, key_, api_). This fixes false positives where Job IDs, Worker IDs, and Template IDs were being redacted even though they're not sensitive.
Updated test to use prefixed token instead of generic long token.

* refactor: remove verbose debug logs from build and API operations
Removed repetitive and overly-verbose debug logs:
- ignore.py: Remove per-file "Ignoring:" logs (pattern summary sufficient)
- app.py: Remove "already hydrated" debug log
- runpod.py: Remove logs that print full input_data/variables (finalizing upload, fetching environment, deploying environment)
- runpod.py: Change template update logs from info to debug
- serverless.py: Change template update log from info to debug
These logs added noise without value. Pattern summaries and operation names provide sufficient context.

* refactor: silence file lock and asyncio debug logs
Removed verbose file locking and resource persistence logs:
- file_lock.py: Remove "File lock acquired" and "File lock released"
- resource_manager.py: Remove "Saved resources in .runpod/resources.pkl"
- logger.py: Silence asyncio logger (prevents "Using selector: KqueueSelector")
These operational details added noise without debugging value.

* refactor: remove app hydration debug logs
Removed:
- runpod.py: "Fetching flash app by name for input:"
- app.py: "Hydrating app"
These operation-level logs add noise without debugging value.
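The prefix-only redaction described in the `fix: prevent false redaction` commit above can be illustrated with a small regular expression. This is a sketch under assumptions: `PREFIXED_KEY_SKETCH`, the minimum token length, the allowed token characters, and the replacement text are all hypothetical and are not taken from the project's `PREFIXED_KEY_PATTERN`.

```python
import re

# Assumed shape: a known sensitive prefix followed by at least 8 token characters.
# The leading \b keeps words such as "task-..." from matching on their "sk-" tail.
PREFIXED_KEY_SKETCH = re.compile(r"\b(?:sk-|key_|api_)[A-Za-z0-9_-]{8,}")


def redact(text: str) -> str:
    """Replace prefixed secrets with a placeholder; leave other IDs untouched."""
    return PREFIXED_KEY_SKETCH.sub("[REDACTED]", text)


secret_line = redact("token sk-abcdefgh12345678 used")
job_line = redact("job 4f9c2d7a1b3e4c5d6f7a8b9c finished")
```

With this pattern a long unprefixed identifier, such as a job ID, passes through unchanged, which is the false positive the commit describes fixing.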

deanq requested a review from Copilot February 13, 2026 23:16

Copilot started reviewing on behalf of deanq February 13, 2026 23:17

Copilot AI reviewed Feb 13, 2026

deanq force-pushed the deanq/ae-2158-default-min-max-workers branch from fb3c8dc to 8c62987 February 13, 2026 23:54

jhcipar approved these changes Feb 14, 2026

deanq and others added 8 commits February 13, 2026 20:13

feat: cleanup flash deploy/undeploy/build command output format (#191)

c99b486

* feat: cleanup flash deploy/undeploy/build command output format
* fix: cleanup

Merge branch 'deanq/ae-2158-default-min-max-workers' of https://githu…

97dbbb6

…b.com/runpod/flash into deanq/ae-2158-default-min-max-workers

Merge branch 'deanq/ae-2158-default-min-max-workers' of https://githu…

e175d7e

…b.com/runpod/flash into deanq/ae-2158-default-min-max-workers

deanq wants to merge 9 commits into main from deanq/ae-2158-default-min-max-workers

3 participants
