feat(core): replace hardcoded worker values with constants #201
Open
Pull request overview
This PR refactors hardcoded worker configuration values to use centralized constants (DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX), improving maintainability and consistency across the codebase. The constants are also exposed through the public API for easier access by users.
Changes:
- Replaced hardcoded `workersMin` and `workersMax` values with constants across model defaults and skeleton templates
- Exposed `DEFAULT_WORKERS_MIN` and `DEFAULT_WORKERS_MAX` via the public `runpod_flash` package using the established lazy-loading pattern
- Breaking change: `workersMin` default now 1 instead of 0, affecting scale-to-zero behavior for endpoints created without explicit configuration
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `src/runpod_flash/__init__.py` | Added constants to TYPE_CHECKING imports, `__getattr__` lazy loading, and `__all__` exports |
| `src/runpod_flash/core/resources/serverless.py` | Replaced hardcoded defaults (0, 1) with constants (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX) |
| `src/runpod_flash/cli/utils/skeleton_template/mothership.py` | Updated mothership template to use constants instead of hardcoded values |
| `src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py` | Updated CPU worker template to use constants |
| `src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py` | Updated GPU worker template to use constants |
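The lazy-loading pattern mentioned for `__init__.py` can be illustrated with a module-level `__getattr__` (PEP 562). This is only a sketch under assumptions: the module names `flash_sketch` and `flash_sketch_constants` are hypothetical stand-ins built in memory so the example runs on its own, and the actual `runpod_flash` implementation may differ.

```python
import importlib
import sys
import types

# Hypothetical stand-in for the constants module (values taken from the PR's commit message).
_constants = types.ModuleType("flash_sketch_constants")
_constants.DEFAULT_WORKERS_MIN = 0
_constants.DEFAULT_WORKERS_MAX = 1
sys.modules[_constants.__name__] = _constants

# Hypothetical stand-in for a package __init__: public name -> defining module.
_LAZY_EXPORTS = {
    "DEFAULT_WORKERS_MIN": "flash_sketch_constants",
    "DEFAULT_WORKERS_MAX": "flash_sketch_constants",
}

package = types.ModuleType("flash_sketch")
package.__all__ = list(_LAZY_EXPORTS)


def _lazy_getattr(name: str):
    """Module-level __getattr__: import the defining module on first access."""
    if name not in _LAZY_EXPORTS:
        raise AttributeError(f"module 'flash_sketch' has no attribute {name!r}")
    value = getattr(importlib.import_module(_LAZY_EXPORTS[name]), name)
    setattr(package, name, value)  # cache so later lookups bypass __getattr__
    return value


package.__getattr__ = _lazy_getattr
sys.modules[package.__name__] = package

# The import below goes through _lazy_getattr for each name.
from flash_sketch import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN

print(DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX)  # prints: 0 1
```

In a real package the `__getattr__` function would live directly in `__init__.py`, with the same names listed under `if TYPE_CHECKING:` so IDEs can resolve them without triggering the import at runtime.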
Replace hardcoded workersMin/workersMax values with DEFAULT_WORKERS_MIN and DEFAULT_WORKERS_MAX constants throughout the codebase.

Changes:
- Set DEFAULT_WORKERS_MIN to 0 (enables scale-to-zero for all resources)
- Set DEFAULT_WORKERS_MAX to 1 (conservative default)
- Update ServerlessResource model defaults to use constants
- Update skeleton templates to use constants
- Expose constants in public API via lazy loading
- Follow existing TYPE_CHECKING pattern for IDE support
- Both auto-provisioned and explicit mothership configs use constants
- Allows full control via constants for rolling releases and scaling

No breaking changes:
- User worker default behavior preserved (workersMin=0 for scale-to-zero)
- Mothership behavior controlled via DEFAULT_WORKERS_MIN constant
- Users can override in explicit configs for custom scaling strategies

All tests passing (960 tests, 69.26% coverage)
Constants validation: 20/20 tests passed
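As a rough illustration of model defaults that read from shared constants, the sketch below uses a plain dataclass. `ResourceSketch` is a hypothetical stand-in, not the real `ServerlessResource`, and the range check in `__post_init__` is an assumption added for the example rather than something this commit describes.

```python
from dataclasses import dataclass

# Values as stated in the commit message above.
DEFAULT_WORKERS_MIN = 0
DEFAULT_WORKERS_MAX = 1


@dataclass
class ResourceSketch:
    """Hypothetical model; only the two worker-count fields are represented."""

    name: str
    workersMin: int = DEFAULT_WORKERS_MIN
    workersMax: int = DEFAULT_WORKERS_MAX

    def __post_init__(self) -> None:
        # Assumed sanity check, not taken from the PR.
        if not 0 <= self.workersMin <= self.workersMax:
            raise ValueError("expected 0 <= workersMin <= workersMax")


# Defaults come from the constants, so scale-to-zero applies unless overridden.
default_cfg = ResourceSketch(name="worker")

# An explicit config overrides the defaults for a custom scaling strategy.
always_on = ResourceSketch(name="mothership", workersMin=1, workersMax=3)
```

Changing a default then means editing one constant instead of hunting for literal `0` and `1` values in models and templates.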
fb3c8dc to
8c62987
jhcipar approved these changes on Feb 14, 2026
* feat(runtime): add API key management and conditional manifest sync

  Optimizes cross-endpoint communication by skipping State Manager queries for local-only endpoints and injecting API keys only when needed.

  - ServiceRegistry checks makes_remote_calls to skip unnecessary queries
  - ServerlessResource injects RUNPOD_API_KEY for QB endpoints at deploy time
  - Added comprehensive API key management documentation

* fix(serverless): normalize resource names for manifest lookup

  Strip -fb suffix and live- prefix from resource names when looking up configuration in manifest to ensure resources with these naming patterns are properly matched.

* docs: add comprehensive CLI documentation for deploy, env, and app commands

  Add three new documentation files following established patterns:
  - flash-deploy.md: Build and deploy workflow, environment resolution, post-deployment guidance, preview mode, and troubleshooting
  - flash-env.md: Environment management (list/create/get/delete), lifecycle, common workflows, and best practices
  - flash-app.md: App management (list/create/get/delete), hierarchy, organization strategies, and relationship to environments

  All docs include:
  - Usage syntax and options
  - Real command-line examples with output
  - Conceptual explanations
  - Troubleshooting sections
  - Cross-references to related commands

  Total: 47KB of new user-facing documentation

* docs: update CLI README with deploy, env, and app command sections

  Add comprehensive sections for previously undocumented commands:
  - flash deploy: Build and deploy in one step, with all options and examples
  - flash env: Environment management subcommands (list/create/get/delete)
  - flash app: App management subcommands (list/create/get/delete)

  Each section includes:
  - Usage syntax
  - Key options
  - Practical examples
  - Link to full documentation page

  Commands now follow logical workflow order: init → run → build → deploy → env → app → undeploy

* docs: enhance main README with comprehensive CLI Reference section

  Add prominent CLI Reference section before "Key concepts" with:
  - Overview of all main commands (init, run, build, deploy)
  - Management commands (env, app, undeploy)
  - Quick examples showing common workflows
  - Links to complete CLI documentation
  - Individual command reference links

  Also enhanced existing sections:
  - Added CLI documentation link in "Create Flash API endpoints"
  - Added flash run documentation link in "Step 5"
  - Updated Table of Contents to include CLI Reference

  Makes CLI capabilities more discoverable for new users while providing clear path to comprehensive documentation.

* docs: update cross-references between CLI and architectural docs

  flash-build.md:
  - Updated "Next Steps" with links to deploy and env commands
  - Enhanced "Related Commands" with bidirectional links to new docs

  Flash_Deploy_Guide.md:
  - Added prominent note at top distinguishing architectural guide from user-facing CLI documentation
  - Links to flash deploy, flash env, and complete CLI docs

  Ensures users can easily navigate between:
  - Architectural implementation details (Deploy Guide)
  - User-facing command references (CLI docs)
  - Related commands within CLI documentation

* docs: update CLAUDE.md with CLI documentation update context

  Document the completed work on this branch:
  - Purpose: Update CLI documentation for new commands
  - Status: All documentation created and cross-referenced
  - Files added: flash-deploy.md, flash-env.md, flash-app.md
  - Files updated: CLI README, main README, cross-references

  Provides context for future work and Claude Code assistance.
* docs: fix PR #195 review issues

  Addresses critical and important issues from PR review:

  Critical Fixes:
  - Remove CLAUDE.md template file from PR (worktree-specific, not for main)
  - Replace "comprehensive" with "complete" (CLAUDE.md style compliance)
  - Remove emoji from CLI documentation link

  Important Fixes:
  - Remove duplicated "Environment:" label in flash-env.md output examples
  - Remove redundant example block in flash-app.md (flash app list)
  - Remove "Examples with Real Output" sections (duplicated content)
  - Remove incorrect LiveServerless SDK example (no 'url' parameter)
  - Add note explaining why flash app delete requires --app flag

  These changes improve documentation quality, reduce maintenance burden, and ensure compliance with project style guidelines.

* RunPod -> Runpod

* Expand overviews for CLI commands

* Explain the difference between flash run and flash deploy

* docs: remove incorrect API key management docs

  - Delete API_Key_Management.md (described buggy implementation)
  - Clarify RUNPOD_API_KEY usage in LoadBalancer_Runtime_Architecture.md

  The deleted doc incorrectly stated only QB endpoints get API key injection. Per PRD.md, both LB and QB endpoints should receive RUNPOD_API_KEY env var when they make remote calls. Current implementation has a bug in serverless.py that needs fixing.

---------

Co-authored-by: Mo King <muhsinking@gmail.com>
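The resource-name normalization described in the `fix(serverless)` entry of this commit (stripping the `live-` prefix and `-fb` suffix before a manifest lookup) might look roughly like the following. The helper name `normalize_resource_name` and the manifest shape are assumptions made for illustration; the commit text does not show the actual code.

```python
def normalize_resource_name(name: str) -> str:
    """Strip a leading 'live-' and a trailing '-fb' (hypothetical helper)."""
    if name.startswith("live-"):
        name = name[len("live-"):]
    if name.endswith("-fb"):
        name = name[: -len("-fb")]
    return name


# Hypothetical manifest keyed by the plain resource name.
manifest = {"gpu-worker": {"workersMax": 1}}

# A decorated runtime name resolves to the same manifest entry.
config = manifest.get(normalize_resource_name("live-gpu-worker-fb"))
```

Names without either decoration pass through unchanged, so the lookup behaves the same for resources that never carried the prefix or suffix.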
* feat: add User-Agent header to all HTTP requests (AE-2106)

  Add User-Agent header to all HTTP transmissions for better API analytics, debugging, and compliance with HTTP best practices.

  Changes:
  - Add user_agent.py module with get_user_agent() function
  - Update centralized HTTP utilities (httpx and requests)
  - Update direct aiohttp usage in RunPod API clients
  - Update direct requests usage in app.py for uploads/downloads
  - Add comprehensive test coverage for User-Agent functionality

  User-Agent format: Runpod Flash/<version> (Python <py_version>; <OS>)
  Example: Runpod Flash/1.1.1 (Python 3.11.12; Darwin)

* feat: add OS version and CPU architecture to User-Agent

  Enhance User-Agent header to include:
  - OS version (e.g., Darwin 25.2.0)
  - CPU architecture (e.g., arm64, x86_64)

  New format: Runpod Flash/<version> (Python <py_version>; <OS> <OS_version>; <arch>)
  Example: Runpod Flash/1.1.1 (Python 3.11.12; Darwin 25.2.0; arm64)

  This provides better insights into deployment environments for debugging and analytics purposes.
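A string in the new User-Agent format could be assembled from the standard `platform` module, as sketched below. The commit names `get_user_agent()`, but the `version` parameter and its default here are assumptions; the real function presumably reads the installed package version rather than taking it as an argument.

```python
import platform


def get_user_agent(version: str = "1.1.1") -> str:
    """Build 'Runpod Flash/<version> (Python <py_version>; <OS> <OS_version>; <arch>)'.

    The version argument is a placeholder for illustration only.
    """
    return (
        f"Runpod Flash/{version} "
        f"(Python {platform.python_version()}; "
        f"{platform.system()} {platform.release()}; "
        f"{platform.machine()})"
    )


# Typical use: pass the value as a header on any outgoing HTTP request.
headers = {"User-Agent": get_user_agent()}
```

The exact output depends on the machine it runs on, which is the point of including the OS release and CPU architecture.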
* feat: cleanup flash deploy/undeploy/build command output format

* fix: cleanup
…b.com/runpod/flash into deanq/ae-2158-default-min-max-workers
* refactor: remove noisy @Remote decorator debug logs

  Remove debug logs from client.py that fire on every @Remote function/class:
  - RUNPOD_ENDPOINT_ID/FLASH_RESOURCE_NAME environment check logs
  - Local dev mode stub creation logs
  - is_local_function result logs
  - Original function return logs
  - Remote execution wrapper creation logs

  Also remove unused flash_resource_name variable that was only used in the removed debug log.

  These logs provide no actionable information during normal development and create substantial noise (5-10 lines per decorated item).

* refactor: remove class serialization debug logs from execute_class.py

  Remove 5 debug log lines that fire for every @Remote class:
  - Cached class data log (line 60)
  - Retrieved cached class data log (lines 84-86)
  - Successfully extracted class code log (line 125)
  - Generated cache key log (line 185)
  - Created remote class wrapper log (line 232)

  These logs fire 5-10 times per run and only matter when debugging class serialization issues, not during normal development.

* refactor: remove per-resource discovery debug logs

  Remove 2 debug log lines that fire for every decorated resource:
  - Entry point resource discovery log (lines 55-57)
  - Project directory resource discovery log (lines 408-410)

  These logs fire 10-20 times per run. The INFO-level summary logs already show total resource counts, making per-resource debug logs redundant.

* refactor: remove duplicate parse failure debug logs from scanner.py

  Remove 6 duplicate debug log lines across three scanning passes:
  - First pass (resource configs): lines 77, 81
  - Second pass (@Remote functions): lines 90, 94
  - Third pass (function calls): lines 118, 122

  These logs fire 3× per Python file during scanning (150+ logs for 50 files). Parse failures in dependencies are expected and not actionable. Keep SyntaxError warnings as they indicate actual issues.
* refactor: remove per-request load balancer debug logs

  Remove 3 debug log lines that fire on every request:
  - ROUND_ROBIN selection log (lines 88-91)
  - LEAST_CONNECTIONS selection log (lines 112-115)
  - RANDOM selection log (line 128)

  These logs fire on EVERY request (100+ times per second in production) and would flood production systems with no actionable value.

* refactor: comment out verbose API debug logs in runpod.py

  Comment out (not remove) 8 debug log lines in API methods:

  GraphQL (_execute_graphql):
  - GraphQL Query log
  - GraphQL Variables log
  - GraphQL Response Status log
  - GraphQL Response log

  REST (_execute_rest):
  - REST Request log
  - REST Data log
  - REST Response Status log
  - REST Response log

  These logs dump multi-KB JSON responses on every API call (10-50× per deploy operation). Commenting out preserves them for future debugging while silencing them during normal development.

  Add noqa comment to json import since it's only used in commented code.

* refactor: remove verbose debug/info logs from resource_manager.py

  Removed 6 noisy logs that fire per-resource operation:
  - get_or_deploy_resource called with config dump
  - DRIFT DEBUG with existing/new config fields
  - Resource found in cache (per-lookup)
  - exists, reusing (per-reuse)
  - Resource NOT found in cache (per-deployment)
  - Config drift detected (redundant with warning log)

* refactor: simplify DEBUG log format to remove logger name and file location

  Removed %(name)s | %(filename)s:%(lineno)d from DEBUG format. DEBUG and INFO now use the same clean format: timestamp | level | message

  Updated test to match new behavior.

* refactor: remove structural change debug logs from serverless.py

  Removed 3 noisy logs:
  - Version-triggering changes detected (INFO)
  - Structural change in field (DEBUG, 2 occurrences)

  These logs fire during endpoint updates and provide no actionable value.
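The logging changes in this commit series (one `timestamp | level | message` format for DEBUG and INFO, plus raising third-party loggers to WARNING) can be approximated with the standard `logging` module. The logger name `flash_sketch` and the function `configure_logging` are hypothetical; only the format fields and the silenced logger names (`httpcore`, `httpx`, `asyncio`) come from the commit text.

```python
import io
import logging

# Same format for every level: timestamp | level | message
LOG_FORMAT = "%(asctime)s | %(levelname)s | %(message)s"


def configure_logging(level: int = logging.DEBUG, stream=None) -> logging.Logger:
    """Configure a hypothetical application logger and quiet noisy libraries."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))

    logger = logging.getLogger("flash_sketch")
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False

    # Transport and event-loop chatter stays hidden even in DEBUG mode.
    for noisy in ("httpcore", "httpx", "asyncio"):
        logging.getLogger(noisy).setLevel(logging.WARNING)
    return logger


buffer = io.StringIO()
log = configure_logging(stream=buffer)
log.debug("deploying endpoint")
fields = buffer.getvalue().strip().split(" | ")
```

After the call, `fields` holds the timestamp, the level name, and the message, with no logger name or file location in between.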
* refactor: silence httpcore/httpx trace logs

  Set httpcore and httpx loggers to WARNING level to suppress verbose connection/request trace logs that appear in DEBUG mode:
  - connect_tcp.started/complete
  - start_tls.started/complete
  - send_request_headers/body
  - receive_response_headers

  These low-level HTTP transport logs provide no actionable value during normal development.

* fix: prevent false redaction of Job IDs and Template IDs

  Replaced overly broad TOKEN_PATTERN with PREFIXED_KEY_PATTERN that only redacts tokens with known sensitive prefixes (sk-, key_, api_). This fixes false positives where Job IDs, Worker IDs, and Template IDs were being redacted even though they're not sensitive.

  Updated test to use prefixed token instead of generic long token.

* refactor: remove verbose debug logs from build and API operations

  Removed repetitive and overly-verbose debug logs:
  - ignore.py: Remove per-file "Ignoring:" logs (pattern summary sufficient)
  - app.py: Remove "already hydrated" debug log
  - runpod.py: Remove logs that print full input_data/variables (finalizing upload, fetching environment, deploying environment)
  - runpod.py: Change template update logs from info to debug
  - serverless.py: Change template update log from info to debug

  These logs added noise without value. Pattern summaries and operation names provide sufficient context.

* refactor: silence file lock and asyncio debug logs

  Removed verbose file locking and resource persistence logs:
  - file_lock.py: Remove "File lock acquired" and "File lock released"
  - resource_manager.py: Remove "Saved resources in .runpod/resources.pkl"
  - logger.py: Silence asyncio logger (prevents "Using selector: KqueueSelector")

  These operational details added noise without debugging value.

* refactor: remove app hydration debug logs

  Removed:
  - runpod.py: "Fetching flash app by name for input:"
  - app.py: "Hydrating app"

  These operation-level logs add noise without debugging value.
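The redaction fix in this commit restricts matching to tokens with known sensitive prefixes. A minimal sketch of that idea follows; the regular expression, the minimum token length of 8, and the `redact` helper are all assumptions, since the commit names `PREFIXED_KEY_PATTERN` and the prefixes (`sk-`, `key_`, `api_`) but does not show the pattern itself.

```python
import re

# Assumed pattern: a sensitive prefix at a word boundary, then 8+ token characters.
PREFIXED_KEY_PATTERN = re.compile(r"\b(sk-|key_|api_)[A-Za-z0-9_\-]{8,}")


def redact(text: str) -> str:
    """Mask prefixed secrets while leaving ordinary identifiers untouched."""
    return PREFIXED_KEY_PATTERN.sub(lambda m: m.group(1) + "***REDACTED***", text)


secret_line = redact("Authorization: sk-abcdef1234567890")
job_line = redact("job 7f3a9c2e-1b4d-4e8a-9c3f-2d1e0f9a8b7c finished")
```

Here `secret_line` keeps only the prefix of the key, while `job_line` comes back unchanged because a long identifier without a sensitive prefix no longer matches.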
…b.com/runpod/flash into deanq/ae-2158-default-min-max-workers
Summary
Replace hardcoded `workersMin` and `workersMax` values with `DEFAULT_WORKERS_MIN` and `DEFAULT_WORKERS_MAX` constants throughout the codebase for consistency and maintainability.

Changes
- Set `DEFAULT_WORKERS_MIN=0` and `DEFAULT_WORKERS_MAX=1`
- Expose the constants via the `runpod_flash` package for cleaner imports

No Breaking Changes
User worker behavior preserved:
- `workersMin=0` enables scale-to-zero for cost optimization
- `workersMax=1` provides a conservative scaling limit

Mothership flexibility:
- Controlled via `DEFAULT_WORKERS_MIN` (currently 0)

API Improvement
Constants are now importable from the main package.

Design Rationale
Why `DEFAULT_WORKERS_MIN=0`:
- Override by setting `workersMin=1` explicitly

Why constants for mothership:

Test Results

Files Modified
- `src/runpod_flash/__init__.py` - Added constants to TYPE_CHECKING, `__getattr__`, and `__all__`
- `src/runpod_flash/core/resources/constants.py` - Set DEFAULT_WORKERS_MIN to 0
- `src/runpod_flash/core/resources/serverless.py` - Model defaults use constants
- `src/runpod_flash/cli/commands/build_utils/manifest.py` - Mothership configs use constants
- `src/runpod_flash/cli/utils/skeleton_template/mothership.py` - Template uses constants
- `src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py` - Template uses constants
- `src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py` - Template uses constants
- `scripts/test-image-constants.py` - Updated validation for new behavior
- `tests/unit/cli/commands/build_utils/test_manifest_mothership.py` - Tests use constants

Test Plan
- Quality checks (`make quality-check`)
- Constants validation (`python scripts/test-image-constants.py`)

Related