train day sunday->tuesday

Note: Reviews paused. This branch is under active development, so CodeRabbit automatically paused this review to avoid overwhelming you with comments from the influx of new commits.
Walkthrough

Adds a weekly automated training/deployment pipeline with Kaggle-based model training scripts, notebooks, and workflows. Implements DB → Parquet extraction, Kaggle dataset upload, notebook push with status polling, weight download, and server deployment over SSH/SCP.
Sequence Diagram(s)

sequenceDiagram
    participant GHA as "GitHub Actions"
    participant Pipeline as "weekly_routine.py"
    participant DB as "PostgreSQL (SSH tunnel)"
    participant Parquet as "Parquet store\n(AI/data/kaggle_data)"
    participant Kaggle as "Kaggle (Datasets/Kernels)"
    participant Weights as "Local weights\n(AI/data/weights)"
    participant Server as "Production server\n(SSH/SCP)"
    GHA->>Pipeline: trigger (cron / workflow_dispatch)
    Pipeline->>DB: run extract_to_parquet.py (open SSH tunnel)
    DB-->>Parquet: write Parquet files
    Pipeline->>Kaggle: run upload_to_kaggle.py (datasets version)
    Pipeline->>Kaggle: run trigger_training.py (kernels push + status polling)
    Kaggle-->>Weights: kernel run produces/outputs weights
    Pipeline->>Kaggle: run download_weights.py (kaggle kernels output)
    Kaggle-->>Weights: download weights locally
    Pipeline->>Server: run deploy_to_server.py (SSH connect, SCP transfer)
    Server-->>Server: refresh model files
    Pipeline-->>GHA: report completion status
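The orchestration in the diagram can be sketched as a thin subprocess chain. This is an illustrative sketch only: the `run_step` helper and `run_weekly_pipeline` wiring are assumptions, while the script paths come from the diagram.

```python
import subprocess
import sys

def run_step(script: str, *args: str) -> bool:
    """Run one pipeline stage as a child Python process; True iff exit code 0."""
    result = subprocess.run([sys.executable, script, *args])
    return result.returncode == 0

def run_weekly_pipeline() -> bool:
    """Run each stage in order; stop at the first failure."""
    steps = [
        "AI/scripts/extract_to_parquet.py",   # DB -> Parquet (via SSH tunnel)
        "AI/scripts/upload_to_kaggle.py",     # datasets version
        "AI/scripts/trigger_training.py",     # kernels push + status polling
        "AI/scripts/download_weights.py",     # kaggle kernels output
        "AI/scripts/deploy_to_server.py",     # SSH/SCP transfer
    ]
    for step in steps:
        if not run_step(step):
            print(f"[FAIL] {step}")
            return False
    return True
```

In practice each stage would also want its own timeout and logging, but the key property is that a failed stage short-circuits the rest.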
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 20

Note: Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🟡 Minor comments (2)
AI/kaggle_notebooks/TCN/notebook.ipynb (1)

60-62: ⚠️ Potential issue | 🟡 Minor — Package versions should be pinned.

Unpinned versions can cause reproducibility problems. In particular, the manifest in library_context requires numpy<2.0, but the current install command may pull the latest (2.0+) release.

🔧 Suggested version pin

-"!pip install -q scikit-learn pandas numpy torch"
+"!pip install -q scikit-learn pandas 'numpy<2.0' torch"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/kaggle_notebooks/TCN/notebook.ipynb` around lines 60 - 62, the pip install cell currently runs "!pip install -q scikit-learn pandas numpy torch" without pinned versions; update that cell (the pip install command string) to pin package versions to reproducible releases (e.g., ensure numpy is constrained to <2.0 as required by the library_context/manifest and pin compatible versions of scikit-learn, pandas, and torch) so the notebook installs deterministic, manifest-compatible packages.

AI/scripts/deploy_to_server.py (1)
76-77: ⚠️ Potential issue | 🟡 Minor — Only RSA keys are supported; Ed25519/ECDSA keys fail.

RSAKey.from_private_key() only handles RSA keys, so an Ed25519 or ECDSA key raises an exception.

♻️ Support multiple key types

-    # 환경변수로 받은 키 문자열을 파일처럼 사용
-    private_key = paramiko.RSAKey.from_private_key(io.StringIO(SSH_KEY_STR))
+    # 다양한 키 타입 지원
+    key_file = io.StringIO(SSH_KEY_STR)
+    private_key = None
+    for key_class in (paramiko.Ed25519Key, paramiko.ECDSAKey, paramiko.RSAKey):
+        try:
+            key_file.seek(0)
+            private_key = key_class.from_private_key(key_file)
+            break
+        except paramiko.SSHException:
+            continue
+    if private_key is None:
+        raise ValueError("지원되지 않는 SSH 키 형식입니다.")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/deploy_to_server.py` around lines 76 - 77, The current code only calls paramiko.RSAKey.from_private_key(io.StringIO(SSH_KEY_STR)), which fails for Ed25519/ECDSA keys; update the logic that creates private_key to try parsing SSH_KEY_STR with multiple Paramiko key classes (e.g., paramiko.RSAKey.from_private_key, paramiko.ECDSAKey.from_private_key, paramiko.Ed25519Key.from_private_key, and paramiko.DSSKey.from_private_key) using io.StringIO(SSH_KEY_STR), catching exceptions and falling through until one succeeds, and raise a clear error if none parse; reference the private_key variable and the from_private_key calls to locate where to implement this fallback parsing.
🧹 Nitpick comments (9)
AI/kaggle_notebooks/TCN/notebook.ipynb (1)
33-42: Recommend adding error handling for git clone failure.

The returncode of the subprocess.run() result is never checked, so a failed clone surfaces as an obscure error in later cells.

♻️ Suggested error handling

 "result = subprocess.run(\n",
 "    ['git', 'clone', '--depth', '1',\n",
 "     'https://github.com/SISC-IT/sisc-web.git',\n",
 "     '/kaggle/working/sisc-web'],\n",
 "    capture_output=True, text=True\n",
 ")\n",
 "print(result.stdout)\n",
-"print(result.stderr)"
+"print(result.stderr)\n",
+"if result.returncode != 0:\n",
+"    raise RuntimeError(f'Git clone 실패: {result.stderr}')"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/kaggle_notebooks/TCN/notebook.ipynb` around lines 33 - 42, the subprocess.run call that clones the repo doesn't check for errors; update the clone logic that uses subprocess.run and the result variable by either adding check=True to subprocess.run or explicitly inspecting result.returncode (and using result.stderr/result.stdout) after the call; if returncode != 0, log/print a clear error message including result.stderr and raise an exception (e.g., RuntimeError or re-raise CalledProcessError) so downstream cells don't proceed on a failed git clone.

.gitignore (1)

47-47: The global test_*.py ignore is far too broad.

Because of line 47, new test files anywhere in the repository stay untracked. Even pipeline/model tests you add may never reach CI, so it is safer to scope the pattern to the specific directories that need throwaway scripts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore at line 47, the global pattern "test_*.py" ignores test files across the entire repository, risking that tests never reach CI; either remove the "test_*.py" entry from .gitignore or narrow its scope to specific directories (e.g., "scripts/test_*.py" or "dev_tools/test_*.py"), verify the changed pattern only applies where intended, then commit. Reference: find and replace or delete the "test_*.py" pattern in .gitignore.

AI/modules/signal/models/PatchTST/wrapper.py (1)
191-193: The get_signals() signature should match the base contract.

It still gets called today, but diverging from the BaseSignalModel signature weakens override and type-checking guarantees. Keeping explicit ticker_id and sector_id parameters, as the other wrappers do, is safer for maintenance.

Example fix

-    def get_signals(self, df, **kwargs):
+    def get_signals(self, df: pd.DataFrame, ticker_id: int = 0, sector_id: int = 0) -> Dict[str, float]:
         """BaseSignalModel 추상 메서드 구현 - predict()로 위임"""
         return self.predict(df)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/PatchTST/wrapper.py` around lines 191 - 193, update the get_signals method to match the BaseSignalModel contract by adding the explicit ticker_id and sector_id parameters (e.g., def get_signals(self, df, ticker_id=None, sector_id=None, **kwargs)) while keeping the current behavior of delegating to predict; ensure the method accepts those parameters and forwards the dataframe to self.predict(df) (and any necessary kwargs) so overrides/type-checking remain consistent with other wrappers and BaseSignalModel.

AI/modules/signal/models/itransformer/train_kaggle.py (1)

326-327: Unnecessary int() call.

len() already returns an integer.

-    "n_train_samples": int(len(X_train)),
-    "n_val_samples" : int(len(X_val)),
+    "n_train_samples": len(X_train),
+    "n_val_samples" : len(X_val),

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 326 - 327, the dict entries for "n_train_samples" and "n_val_samples" wrap len(...) with an unnecessary int() call; remove the int() so use "n_train_samples": len(X_train) and "n_val_samples": len(X_val) (locate these keys in the training metadata construction in train_kaggle.py).

.github/workflows/train.yml (2)

58-63: Handle special characters when building the Kaggle credentials JSON.

If a secret value contains quotes or other special characters, the JSON can break or a shell injection becomes possible. Using jq is safer.

🛡️ Suggested improvement

 - name: Setup Kaggle credentials
   run: |
     mkdir -p ~/.kaggle
-    echo '{"username":"${{ secrets.KAGGLE_USERNAME }}","key":"${{ secrets.KAGGLE_KEY }}"}' \
-      > ~/.kaggle/kaggle.json
+    jq -n --arg user "${{ secrets.KAGGLE_USERNAME }}" --arg key "${{ secrets.KAGGLE_KEY }}" \
+      '{"username": $user, "key": $key}' > ~/.kaggle/kaggle.json
     chmod 600 ~/.kaggle/kaggle.json

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/train.yml around lines 58 - 63, Replace the fragile echo that constructs the Kaggle JSON with a safe JSON builder: use jq (or printf piped to jq -n --arg) to create the JSON object from the GitHub secrets and write it to ~/.kaggle/kaggle.json, then set chmod 600; specifically update the "Setup Kaggle credentials" step to pass ${{ secrets.KAGGLE_USERNAME }} and ${{ secrets.KAGGLE_KEY }} as jq arguments (instead of interpolating them into a raw string) to avoid JSON corruption or shell injection.
31-31: Recommend updating the GitHub Actions versions.

Static analysis warns that actions/checkout@v3 and actions/setup-python@v4 are outdated. Update to the latest versions.

♻️ Suggested fix

 - name: Checkout repository
-  uses: actions/checkout@v3
+  uses: actions/checkout@v4
 # ────────────────────────────────────────
 # 2. Python 환경 세팅
 # ────────────────────────────────────────
 - name: Setup Python
-  uses: actions/setup-python@v4
+  uses: actions/setup-python@v5

Also applies to: 37-37
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/train.yml at line 31, update the GitHub Actions steps that reference actions/checkout@v3 and actions/setup-python@v4 to their current stable releases: locate the `uses: actions/checkout@v3` and `uses: actions/setup-python@v4` lines and bump them to the latest recommended major versions (e.g., actions/checkout@v4 and actions/setup-python@v5 or the most recent stable tags), then commit the change and run a workflow lint/validation to ensure compatibility with any inputs you rely on.

AI/modules/signal/models/PatchTST/train_kaggle.py (2)

185-194: Improve exception handling: preserve the exception chain and use specific exception types.

Use raise ... from e to preserve the original exception; it helps identify the root cause when debugging.

♻️ Suggested improvement

     try:
         df_t = add_technical_indicators(df_t)
         df_t = add_multi_timeframe_features(df_t)
         processed.append(df_t)
-    except Exception as e:
+    except (ValueError, KeyError, TypeError) as e:
         fail_count += 1
         print(f"\n[경고] {ticker} 피처 계산 실패 ({fail_count}/{fail_limit}): {e}")
         if fail_count >= fail_limit:
-            raise RuntimeError(f"피처 계산 실패가 {fail_limit}개를 초과했습니다.")
+            raise RuntimeError(f"피처 계산 실패가 {fail_limit}개를 초과했습니다.") from e

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/PatchTST/train_kaggle.py` around lines 185 - 194, The except block in the try around add_technical_indicators and add_multi_timeframe_features swallows the original exception; update the handler in train_kaggle.py to re-raise a RuntimeError using "raise RuntimeError(... ) from e" so the exception chain is preserved and include the original exception message, and consider narrowing the except to specific exceptions thrown by add_technical_indicators/add_multi_timeframe_features (e.g., ValueError/TypeError) instead of a bare Exception; keep the existing fail_count/fail_limit logic and logging but ensure the re-raise uses "from e".
321-321: Unnecessary f-string prefix.

This f-string has no placeholders.

-    print(f"\n>> 완료")
+    print("\n>> 완료")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/PatchTST/train_kaggle.py` at line 321, replace the unnecessary f-string in the final print statement: locate the print call in train_kaggle.py that currently uses print(f"\n>> 완료") and change it to a normal string print (remove the leading "f") so it reads print("\n>> 완료"), leaving the rest of the function and surrounding code unchanged.

AI/pipelines/weekly_routine.py (1)

30-30: Unused import: traceback.

The traceback module is imported but never used.

♻️ Remove it

 import subprocess
-import traceback
 from datetime import datetime

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/pipelines/weekly_routine.py` at line 30, Remove the unused import of the traceback module: delete the "import traceback" line in AI/pipelines/weekly_routine.py (the unused symbol is traceback) so there are no unused imports left; run linters to confirm no other unused imports remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/kaggle_notebooks/itransformer/notebook.ipynb`:
- Around line 73-90: The current logic may load and run train_kaggle.py twice
and attempts a fallback using a non-existent path; replace the exec/open(...)
path with a single importlib-based loader: check that train_script exists and if
not raise/print and exit immediately, otherwise use
importlib.util.spec_from_file_location('train_kaggle', train_script),
module_from_spec(spec), spec.loader.exec_module(module) and call module.train();
remove the exec(open(...).read()) and the erroneous else branch that reuses the
missing path so training is loaded and invoked exactly once via module.train().
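The importlib-based loading the prompt describes can be sketched as a small helper. This is an illustrative sketch: `load_and_call` is a hypothetical name, and the `train()` entry point is the one the review assumes the script exports.

```python
import importlib.util
from pathlib import Path

def load_and_call(script_path: str, entry: str = "train"):
    """Load a script file as a module exactly once and invoke its entry point."""
    path = Path(script_path)
    if not path.exists():
        raise FileNotFoundError(f"training script not found: {path}")
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module's top-level code once
    if not hasattr(module, entry):
        raise AttributeError(f"{path} does not define {entry}()")
    return getattr(module, entry)()
```

Unlike `exec(open(path).read())`, this gives the loaded code a proper module object (with `__file__` set), leaves no file handle open, and makes it impossible to accidentally execute the script twice.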
In `@AI/kaggle_notebooks/patchtst/notebook.ipynb`:
- Around line 36-41: Remove the second and third notebook cells that copy/load
non-existent files and keep the first cell's clone + sys.path approach; instead
import PatchTST_Model and the feature functions directly from the cloned repo
(from AI.modules.signal.models.PatchTST.architecture import PatchTST_Model and
from AI.modules.features.legacy.technical_features import
add_technical_indicators, add_multi_timeframe_features), set any needed env vars
(e.g., PARQUET_DIR, WEIGHTS_DIR), and if you must run the training script
dynamically, load the repository's train_kaggle.py via
importlib.util.spec_from_file_location pointing to
/kaggle/working/sisc-web/AI/modules/signal/models/PatchTST/train_kaggle.py and
invoke its train() entry point instead of copying non-existent files; also
ensure kernel_sources in kernel-metadata.json is populated or left empty
consistently with this import strategy.
In `@AI/kaggle_notebooks/TCN/notebook.ipynb`:
- Around line 69-79: Replace the exec-based loading of train_script with an
importlib-based dynamic module import: when train_script exists, create a module
spec from that path (use importlib.util.spec_from_file_location and
importlib.util.module_from_spec), load the module via spec.loader.exec_module,
then call the exported train_model_kaggle function from the loaded module (guard
the call with hasattr/module attribute check and raise or log a clear error if
train_model_kaggle is missing), and remove the open(...).read() pattern so no
file descriptor is left open; keep the existing fallback print for the else
branch.
In `@AI/kaggle_notebooks/transformer/notebook.ipynb`:
- Around line 48-51: The code sets os.environ before importing or defining os
which causes a NameError; add an import os at the top of the cell (before any
use of os.environ) so the os module is available; locate the cell containing the
os.environ assignments (references to os.environ and the symbol os) and insert
import os as the first statement in that cell.
- Around line 36-39: To fix the NameError/ModuleNotFoundError raised in this
cell, first import shutil and import the technical-indicator module from the
correct path: instead of the current "from technical_features import
add_technical_indicators, add_multi_timeframe_features", import
add_technical_indicators and add_multi_timeframe_features from the
already-used AI.modules.features.legacy.technical_features (or reuse the
existing import of that module) so the module path is referenced correctly,
and make sure shutil is imported before the shutil.copy call.
In `@AI/modules/signal/core/artifact_paths.py`:
- Around line 104-110: The resolver is returning lowercase "patchtst" paths
while training/saving uses "PatchTST", causing mismatched directories on
case-sensitive filesystems; update the handling in
AI/modules/signal/core/artifact_paths.py so that when normalized_model ==
"patchtst" you resolve to the same cased directory and filenames used by the
trainer (use "PatchTST" for resolved_model_dir and the matching artifact
filenames), i.e. change the resolved_model_dir, model_path and scaler_path
construction in the ModelArtifactPaths return so they reference "PatchTST" and
the exact saved filenames produced by AI/modules/signal/models/PatchTST/train.py
(use the same casing and basename as the saver).
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Around line 160-171: The scaler is being refit per ticker inside the for
ticker, group in df.groupby("ticker") loop (using scaler.fit_transform when
fit_scaler=True), causing leakage and inconsistent statistics; instead, if
fit_scaler is True, call scaler.fit once on the full dataset (e.g., on
df[available_feats].values.astype(np.float32)) before entering the loop, then
inside the loop only call scaler.transform on feat_vals; if fit_scaler is False,
ensure scaler is already fitted and continue using scaler.transform for
feat_vals (reference symbols: df.groupby("ticker"), available_feats, feat_vals,
fit_scaler, scaler.fit_transform, scaler.transform).
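In fit/transform terms, the fix described above looks like the following toy sketch (assuming scikit-learn's StandardScaler; the DataFrame and column names are illustrative, not the project's real schema):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "ticker": ["A", "A", "B", "B"],
    "close":  [1.0, 2.0, 100.0, 200.0],
})
available_feats = ["close"]

scaler = StandardScaler()
# Fit ONCE on the full training frame so every ticker shares one set of statistics.
scaler.fit(df[available_feats].values.astype(np.float32))

for ticker, group in df.groupby("ticker"):
    # Inside the loop: transform only, never fit_transform.
    feat_vals = scaler.transform(group[available_feats].values.astype(np.float32))
    # ...sequence/window building on feat_vals...
```

Fitting per group would give ticker A and ticker B completely different normalization statistics, so the same raw price would map to different scaled values depending on the ticker.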
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py`:
- Around line 201-203: best_state is currently assigned a shallow reference to
model.state_dict(), so it gets mutated during later training and ends up
containing the final weights; replace assignments that save the best model
(where best_state = model.state_dict()) with a deep copy: import copy and set
best_state = copy.deepcopy(model.state_dict()) so best_state holds an
independent snapshot of parameters at the best validation loss.
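The shallow-reference trap is easy to reproduce without any framework; here dicts of lists stand in for the live tensors a real model.state_dict() returns:

```python
import copy

weights = {"w": [1.0, 2.0]}

def state_dict():
    """Stand-in for model.state_dict(): hands back references to live buffers."""
    return weights

best_state = state_dict()          # BUG: just a reference to the live buffers
weights["w"][0] = 9.0              # a later optimizer step mutates the weights
assert best_state["w"][0] == 9.0   # the "best" snapshot silently changed

weights["w"] = [1.0, 2.0]
best_state = copy.deepcopy(state_dict())  # independent snapshot
weights["w"][0] = 9.0
assert best_state["w"][0] == 1.0   # snapshot preserved
```

With PyTorch the same applies: `best_state = copy.deepcopy(model.state_dict())` (or cloning each tensor) is needed to freeze the best-epoch weights.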
In `@AI/modules/signal/models/TCN/wrapper.py`:
- Around line 217-233: The batch-prep loop uses get_standard_training_data()
directly which bypasses the same preprocessing used by predict(); update the
loop over ticker_data_map in the TCN wrapper to call
_prepare_dataframe(df.copy()) (or the same helper used by predict()) to perform
required column validation and inf/NaN cleaning before extracting
self.feature_columns, then proceed with seq_len checks, scaling
(self.scaler.transform) and appending to tensor_list and valid_tickers; ensure
you still handle exceptions and preserve the existing behavior around seq_len,
feature_columns, scaler, tensor_list and valid_tickers so batch and single-shot
prediction use identical preprocessing.
In `@AI/modules/signal/models/transformer/transformer_train_kaggle.py`:
- Around line 71-73: The saved artifact names in transformer_train_kaggle.py
('model_name' and 'scaler_name') use unsuffixed filenames which will not match
the resolver conventions in AI/modules/signal/core/artifact_paths.py (which
expects names like multi_horizon_model_prod.keras /
multi_horizon_scaler_prod.pkl or *_test for tests); update the 'model_name' and
'scaler_name' values to include the proper suffix (e.g., append _prod) so they
match the resolver lookup scheme and ensure the stored files can be found during
download/deploy.
- Around line 149-167: The current shuffle-based train_test_split after calling
loader.create_dataset causes temporal leakage; instead, perform a chronological
split on full_df before creating sequences so the scaler is fit only on training
data and validation windows are strictly later. Concretely: split full_df by a
date cutoff into train_df and val_df (no random shuffle), call
loader.create_dataset(train_df, ...) to produce X_ts_train, X_ticker_train,
X_sector_train, y_train and fit the scaler there, then call
loader.create_dataset(val_df, ..., scaler=trained_scaler or with a flag to only
transform) to produce X_ts_val, X_tick_val, X_sec_val, y_val; remove the later
train_test_split(...) shuffle=True usage and ensure horizons / n_outputs logic
remains unchanged.
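The chronological split described above can be sketched as a small helper (illustrative only; `chronological_split`, the column name, and the fraction are assumptions, not the project's API):

```python
import pandas as pd

def chronological_split(full_df: pd.DataFrame,
                        date_col: str = "date",
                        val_frac: float = 0.2):
    """Time-based split: every validation row is strictly later than training."""
    ordered = full_df.sort_values(date_col).reset_index(drop=True)
    cut = int(len(ordered) * (1.0 - val_frac))
    return ordered.iloc[:cut], ordered.iloc[cut:]
```

The scaler is then fit only on the returned training frame, and the validation frame is transformed with it, so no future statistics leak into training.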
In `@AI/pipelines/weekly_routine.py`:
- Around line 155-172: The script currently ignores run_weekly_pipeline()'s
result so the process always exits 0; capture its return or exception in the
__main__ block and propagate failure by exiting with a non-zero code: call
result = run_weekly_pipeline(...), then if result indicates failure (e.g., False
or a non-zero status) call sys.exit(1) or re-raise the caught exception;
alternatively wrap run_weekly_pipeline(...) in try/except and on exception log
the error and call sys.exit(1). Ensure you import sys if needed and reference
the existing run_weekly_pipeline call in the __main__ block.
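The exit-code propagation the prompt asks for follows a standard pattern (a sketch; the placeholder `run_weekly_pipeline` body here is not the real implementation):

```python
import sys

def run_weekly_pipeline() -> bool:
    """Placeholder: the real pipeline returns False when any stage fails."""
    return True

def main() -> int:
    """Map the pipeline result (or a crash) to a process exit code."""
    try:
        ok = run_weekly_pipeline()
    except Exception as exc:
        print(f"[ERROR] pipeline crashed: {exc}")
        return 1
    return 0 if ok else 1

# At the bottom of weekly_routine.py:
# if __name__ == "__main__":
#     sys.exit(main())
```

A non-zero exit code is what lets the GitHub Actions job (and any alerting on it) actually fail when the pipeline does.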
In `@AI/scripts/deploy_to_server.py`:
- Around line 89-112: The deploy_model function currently always returns True
even if files are missing or transfer errors occur; update deploy_model to track
failures by creating a failed list/local flag: for each fname in model['files']
check os.path.exists(local_path) and append fname to failed if missing, wrap
scp.put(local_path, remote_path) in a try/except to append fname on exception,
and at the end return False if failed is non-empty (and optionally log which
filenames failed) otherwise return True; refer to deploy_model, model['files'],
local_path, remote_path, os.path.exists, scp.put and the printed messages when
implementing.
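The failure-tracking shape described above can be sketched like this; the `transfer` callable is a hypothetical stand-in for `scp.put` so the bookkeeping is testable without an SSH connection:

```python
import os

def deploy_model(files, local_dir, transfer):
    """Transfer each file; return False if any file is missing or fails to send."""
    failed = []
    for fname in files:
        local_path = os.path.join(local_dir, fname)
        if not os.path.exists(local_path):
            failed.append(fname)
            continue
        try:
            transfer(local_path, fname)  # stand-in for scp.put(local, remote)
        except Exception:
            failed.append(fname)
    if failed:
        print(f"[deploy] failed files: {failed}")
    return not failed
```

The caller can then propagate `False` up to the pipeline's exit code instead of reporting an unconditional success.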
- Around line 73-74: Replace the unsafe AutoAddPolicy usage by enforcing host
key verification: instead of
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()), set
ssh.set_missing_host_key_policy(paramiko.RejectPolicy()) and load the known host
key from a secret (e.g. SSH_HOST_KEY) into the client before connecting; read
SSH_HOST_KEY from the environment/CI secret, parse it and add it with
ssh.get_host_keys().add(hostname, key_type, key_data) (or use
SSHClient.load_host_keys when writing the secret to a temp known_hosts file) so
the code around SSHClient() and set_missing_host_key_policy(...) performs strict
verification.
In `@AI/scripts/download_weights.py`:
- Around line 30-40: The download script currently can return success without
validating downloaded files, risking stale weights being reused; update the
logic in download_weights.py (referencing MODELS) to download each model into a
temporary directory first, verify presence and exact names of the expected files
(the same filenames deploy_to_server.py expects), only after validation
atomically replace/rsync into the model's dst_dir (empty/backup the dst_dir
before move), and fail/return False if any expected file is missing or names
differ; apply the same change to the code block around lines 61-80 that handles
additional models.
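The download-validate-swap flow above can be sketched as follows; `install_weights`, `download_fn` (a stand-in for `kaggle kernels output ... -p <dir>`), and the backup naming are all assumptions for illustration:

```python
import os
import shutil
import tempfile

def install_weights(download_fn, expected_files, dst_dir):
    """Download into a temp dir, validate filenames, then swap into place."""
    with tempfile.TemporaryDirectory() as tmp:
        download_fn(tmp)
        missing = [f for f in expected_files
                   if not os.path.exists(os.path.join(tmp, f))]
        if missing:
            print(f"[weights] missing after download, keeping old files: {missing}")
            return False  # the previous weights stay untouched
        backup = dst_dir.rstrip("/") + ".bak"
        if os.path.isdir(backup):
            shutil.rmtree(backup)
        if os.path.isdir(dst_dir):
            os.replace(dst_dir, backup)  # keep a backup of the previous weights
        shutil.copytree(tmp, dst_dir)
    return True
```

Because validation happens before anything touches `dst_dir`, a partial or failed Kaggle download can never silently replace working weights with stale or incomplete ones.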
In `@AI/scripts/extract_to_parquet.py`:
- Around line 178-184: The SELECT in read_sql_safe currently omits two features
used downstream (correlation_spike and surprise_cpi), causing a mismatch with
AI/modules/signal/models/itransformer/train_kaggle.py; fix by adding
correlation_spike and surprise_cpi to the SELECT projection from
macroeconomic_indicators in extract_to_parquet.py (ensure read_sql_safe and
resulting DataFrame include these columns so they persist into the output
parquet).
- Around line 123-132: The retry loop leaks DB connections because conn.close()
is only called on success; update the loop around get_conn() and
pd.read_sql(query, conn) (inside the retry in extract_to_parquet.py) to always
close the connection on each attempt — either by using a try/finally that calls
conn.close() if conn is not None or by acquiring the connection with a context
manager — so every opened connection from get_conn() is closed even when
pd.read_sql throws.
In `@AI/scripts/trigger_training.py`:
- Around line 34-44: NOTEBOOKS entries use project_root joined to
"kaggle_notebooks/..." but the repo layout places notebooks under
AI/kaggle_notebooks, so update the NOTEBOOKS constant (the dicts for "PatchTST"
and "Transformer") to include the missing "AI" path segment when building
notebook_dir (e.g., include "AI" as an extra path component in os.path.join for
each notebook_dir) so trigger_notebook() can locate the folders.
- Around line 94-117: In the status-polling loop (the code that calls
subprocess.run(["kaggle", "kernels", "status", slug], ...) inside the
trigger_training flow), check result.returncode immediately after subprocess.run
and treat any non-zero return code as a hard failure: log/print the error
details (result.stderr or result.stdout), return False, and stop polling instead
of sleeping and retrying; keep the existing checks for
"complete"/"error"/"running" only when returncode == 0 to avoid unnecessary
5-minute retries (follow the same returncode handling pattern used in
trigger_notebook()).
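The fail-fast polling loop can be sketched with an injected `check` callable standing in for `subprocess.run(["kaggle", "kernels", "status", slug], ...)`; the function name and the substring matching are illustrative assumptions:

```python
import time

def poll_kernel_status(check, timeout_s=3600, interval_s=5):
    """Poll `check()` -> (returncode, output); fail fast on CLI errors."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        returncode, output = check()
        if returncode != 0:
            print(f"[status] command failed: {output}")
            return False              # hard failure: stop polling immediately
        if "complete" in output:
            return True
        if "error" in output:
            return False
        time.sleep(interval_s)        # still running: wait before re-polling
    print("[status] polling timed out")
    return False
```

Checking `returncode` before inspecting the output is the whole fix: a broken CLI invocation (bad slug, expired credentials) aborts in seconds instead of burning the full retry budget.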
In `@AI/scripts/upload_to_kaggle.py`:
- Around line 34-39: The subprocess.run call in upload_to_kaggle.py references
an undefined variable KAGGLE_DATA_DIR causing a NameError; change that argument
to use the existing OUTPUT_DIR (or rename OUTPUT_DIR to KAGGLE_DATA_DIR
consistently) so the call in subprocess.run([... "-p", KAGGLE_DATA_DIR, ...])
uses the defined upload directory; update any other occurrences to keep the
variable name consistent and ensure the kaggle upload command runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 583ecab4-5362-4c58-9617-171cf32089f0
📒 Files selected for processing (23)
- .github/workflows/train.yml
- .gitignore
- AI/kaggle_notebooks/TCN/kernel-metadata.json
- AI/kaggle_notebooks/TCN/notebook.ipynb
- AI/kaggle_notebooks/itransformer/kernel-metadata.json
- AI/kaggle_notebooks/itransformer/notebook.ipynb
- AI/kaggle_notebooks/patchtst/kernel-metadata.json
- AI/kaggle_notebooks/patchtst/notebook.ipynb
- AI/kaggle_notebooks/transformer/kernel-metadata.json
- AI/kaggle_notebooks/transformer/notebook.ipynb
- AI/modules/signal/core/artifact_paths.py
- AI/modules/signal/models/PatchTST/train_kaggle.py
- AI/modules/signal/models/PatchTST/wrapper.py
- AI/modules/signal/models/TCN/tcn_train_kaggle.py
- AI/modules/signal/models/TCN/wrapper.py
- AI/modules/signal/models/itransformer/train_kaggle.py
- AI/modules/signal/models/transformer/transformer_train_kaggle.py
- AI/pipelines/weekly_routine.py
- AI/scripts/deploy_to_server.py
- AI/scripts/download_weights.py
- AI/scripts/extract_to_parquet.py
- AI/scripts/trigger_training.py
- AI/scripts/upload_to_kaggle.py
| "shutil.copy('/kaggle/working/transformer_architecture.py', '/kaggle/working/architecture.py')\n", | ||
| "\n", | ||
| "from technical_features import add_technical_indicators, add_multi_timeframe_features\n", | ||
| "print('>> import 완료')" |
There was a problem hiding this comment.
This cell will halt with a NameError/ModuleNotFoundError as written.

`shutil` is never imported, and `technical_features` on Line 38 is not a top-level module relative to the repo path added in the first cell. Reuse the already-imported `AI.modules.features.legacy.technical_features`, or import it here with its exact module path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/kaggle_notebooks/transformer/notebook.ipynb` around lines 36 - 39, to fix
the NameError/ModuleNotFoundError in this cell, first import shutil and import
the technical-indicator module via its correct path: instead of
`from technical_features import add_technical_indicators, add_multi_timeframe_features`,
import add_technical_indicators and add_multi_timeframe_features from the
already-used AI.modules.features.legacy.technical_features (or reuse the
existing import) so the module path resolves correctly, and make sure shutil is
imported before the shutil.copy call.
```python
for attempt in range(1, 4):
    try:
        conn = get_conn()
        df = pd.read_sql(query, conn)
        conn.close()
        return df
    except Exception as e:
        print(f"  [시도 {attempt}/3] 실패: {e}")
        time.sleep(3)
raise RuntimeError(f"'{desc}' 쿼리 3회 모두 실패")
```
The retry path leaks DB connections.

`conn.close()` on Line 127 runs only on the success path. If `pd.read_sql()` raises, the connection stays open, so retries or multi-table extraction can quickly exhaust the connection limit.
Suggested fix

```diff
 def read_sql_safe(query: str, desc: str = "") -> pd.DataFrame:
     """연결 끊김 시 최대 3회 재시도"""
     for attempt in range(1, 4):
+        conn = None
         try:
             conn = get_conn()
             df = pd.read_sql(query, conn)
-            conn.close()
             return df
         except Exception as e:
             print(f"  [시도 {attempt}/3] 실패: {e}")
             time.sleep(3)
+        finally:
+            if conn is not None:
+                conn.close()
     raise RuntimeError(f"'{desc}' 쿼리 3회 모두 실패")
```

🧰 Tools
🪛 Ruff (0.15.7)
[warning] 129-129: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/extract_to_parquet.py` around lines 123 - 132, The retry loop
leaks DB connections because conn.close() is only called on success; update the
loop around get_conn() and pd.read_sql(query, conn) (inside the retry in
extract_to_parquet.py) to always close the connection on each attempt — either
by using a try/finally that calls conn.close() if conn is not None or by
acquiring the connection with a context manager — so every opened connection
from get_conn() is closed even when pd.read_sql throws.
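The always-close shape the prompt describes can be sketched without a real database; `FakeConn` and `flaky` below are stand-ins invented for illustration (the real code would use psycopg2 and pandas):

```python
import time

class FakeConn:
    """Stand-in for a DB connection; counts close() calls."""
    closed = 0
    def close(self):
        FakeConn.closed += 1

def read_sql_safe(run_query, retries=3):
    """Retry a query, closing the connection on every attempt (success or failure)."""
    for attempt in range(1, retries + 1):
        conn = None
        try:
            conn = FakeConn()
            return run_query(conn)
        except Exception as e:
            print(f"attempt {attempt}/{retries} failed: {e}")
            time.sleep(0)  # no real backoff in this sketch
        finally:
            if conn is not None:
                conn.close()  # runs on both the return path and the exception path
    raise RuntimeError("query failed on all attempts")

calls = {"n": 0}
def flaky(conn):
    """Fails twice, then succeeds — exercises both paths of the finally block."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("connection reset")
    return "df"

result = read_sql_safe(flaky)
print(result, FakeConn.closed)  # → df 3: every attempt closed its connection
```

Because `finally` runs even on the `return` path, the connection count always matches the attempt count.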
```python
df = read_sql_safe("""
    SELECT date, cpi, gdp, interest_rate, unemployment_rate,
           us10y, us2y, yield_spread, vix_close, dxy_close,
           wti_price, gold_price, credit_spread_hy
    FROM macroeconomic_indicators
    ORDER BY date
""", "macroeconomic_indicators")
```
The macro-indicator extraction columns don't match the downstream training script.

AI/modules/signal/models/itransformer/train_kaggle.py:115-120 also treats correlation_spike and surprise_cpi as feature candidates. Omitting both here means Kaggle training silently runs without them, diverging from the existing pipeline's feature set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/extract_to_parquet.py` around lines 178 - 184, The SELECT in
read_sql_safe currently omits two features used downstream (correlation_spike
and surprise_cpi), causing a mismatch with
AI/modules/signal/models/itransformer/train_kaggle.py; fix by adding
correlation_spike and surprise_cpi to the SELECT projection from
macroeconomic_indicators in extract_to_parquet.py (ensure read_sql_safe and
resulting DataFrame include these columns so they persist into the output
parquet).
```python
result = subprocess.run(
    ["kaggle", "kernels", "status", slug],
    capture_output=True,
    text=True
)

output = result.stdout.lower()

if "complete" in output:
    print(f"  [{notebook['name']}] 학습 완료! ✅")
    return True
elif "error" in output or "failed" in output:
    print(f"  [{notebook['name']}] 학습 실패! ❌")
    print(result.stdout)
    return False
elif "running" in output:
    check_count += 1
    elapsed = check_count * 5
    print(f"  [{notebook['name']}] 학습 중... ({elapsed}분 경과)")
else:
    check_count += 1
    print(f"  [{notebook['name']}] 상태: {result.stdout.strip()}")

time.sleep(300)  # 5분 대기
```
🧩 Analysis chain
🏁 Script executed:

```shell
find . -name "trigger_training.py" -type f
```

Repository: SISC-IT/sisc-web
Length of output: 92

🏁 Script executed:

```shell
cat -n ./AI/scripts/trigger_training.py | head -150 | tail -70
```

Repository: SISC-IT/sisc-web
Length of output: 2361

🏁 Script executed:

```shell
rg "subprocess\.run" ./AI/scripts/trigger_training.py -A 5
```

Repository: SISC-IT/sisc-web
Length of output: 366

🏁 Script executed:

```shell
cat -n ./AI/scripts/trigger_training.py | head -80 | tail -40
```

Repository: SISC-IT/sisc-web
Length of output: 1624
CLI errors during status polling should fail immediately.

Even when the `kaggle kernels status` command returns non-zero due to auth, network, or slug problems, the current code inspects only stdout and retries every 5 minutes. In GitHub Actions that can mean up to 12 hours of pointless retries. Check `returncode` first and abort right away. (Note: `trigger_notebook()` in the same file already checks returncode correctly.)
Suggested fix

```diff
 result = subprocess.run(
     ["kaggle", "kernels", "status", slug],
     capture_output=True,
     text=True
 )
+if result.returncode != 0:
+    print(f"  [{notebook['name']}] 상태 조회 실패! ❌")
+    print(result.stderr.strip())
+    return False
 output = result.stdout.lower()
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
result = subprocess.run(
    ["kaggle", "kernels", "status", slug],
    capture_output=True,
    text=True
)
if result.returncode != 0:
    print(f"  [{notebook['name']}] 상태 조회 실패! ❌")
    print(result.stderr.strip())
    return False
output = result.stdout.lower()
if "complete" in output:
    print(f"  [{notebook['name']}] 학습 완료! ✅")
    return True
elif "error" in output or "failed" in output:
    print(f"  [{notebook['name']}] 학습 실패! ❌")
    print(result.stdout)
    return False
elif "running" in output:
    check_count += 1
    elapsed = check_count * 5
    print(f"  [{notebook['name']}] 학습 중... ({elapsed}분 경과)")
else:
    check_count += 1
    print(f"  [{notebook['name']}] 상태: {result.stdout.strip()}")
time.sleep(300)  # 5분 대기
```
🧰 Tools
🪛 Ruff (0.15.7)
[error] 94-94: subprocess call: check for execution of untrusted input
(S603)
[error] 95-95: Starting a process with a partial executable path
(S607)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/trigger_training.py` around lines 94 - 117, In the status-polling
loop (the code that calls subprocess.run(["kaggle", "kernels", "status", slug],
...) inside the trigger_training flow), check result.returncode immediately
after subprocess.run and treat any non-zero return code as a hard failure:
log/print the error details (result.stderr or result.stdout), return False, and
stop polling instead of sleeping and retrying; keep the existing checks for
"complete"/"error"/"running" only when returncode == 0 to avoid unnecessary
5-minute retries (follow the same returncode handling pattern used in
trigger_notebook()).
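The returncode-first step can be exercised without the Kaggle CLI by injecting a stub runner; `poll_once` and the tuple-returning lambdas below are illustrative only, not the script's actual API:

```python
def poll_once(run_status):
    """One polling step: fail hard on CLI errors, otherwise classify stdout.

    Returns True (complete), False (failed or CLI error), or None (keep polling).
    """
    returncode, stdout, stderr = run_status()
    if returncode != 0:
        # Auth/slug/network problems surface here; don't keep retrying for hours.
        print(f"status query failed: {stderr.strip()}")
        return False
    output = stdout.lower()
    if "complete" in output:
        return True
    if "error" in output or "failed" in output:
        return False
    return None  # still running (or unknown): caller sleeps and retries

print(poll_once(lambda: (1, "", "401 Unauthorized")))  # → False (CLI error, no retry)
print(poll_once(lambda: (0, "complete", "")))          # → True
print(poll_once(lambda: (0, "running", "")))           # → None
```

Separating the classification from the sleep/retry loop makes the non-zero-returncode path trivially testable.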
Actionable comments posted: 2
🧹 Nitpick comments (1)
.github/workflows/train.yml (1)
35-44: Pin dependency versions and enable caching.

Unpinned dependency versions can cause build-reproducibility problems. Enabling the `setup-python` action's caching would also speed up builds.

♻️ Suggested fix

Create a requirements-train.txt to pin versions, or at minimum specify major versions:

```diff
       - name: Setup Python
         uses: actions/setup-python@v5
         with:
           python-version: '3.10'
+          cache: 'pip'

       - name: Install dependencies
         run: |
           pip install \
-            kaggle \
-            pandas \
-            pyarrow \
-            psycopg2-binary \
-            paramiko \
-            sshtunnel \
-            scp
+            kaggle>=1.6.0 \
+            pandas>=2.0.0 \
+            pyarrow>=14.0.0 \
+            psycopg2-binary>=2.9.0 \
+            paramiko>=3.0.0 \
+            sshtunnel>=0.4.0 \
+            scp>=0.14.0
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/train.yml around lines 35 - 44, The Install dependencies step currently runs pip install with unpinned packages; create a requirements-train.txt listing the exact package versions (or minimum major versions) for kaggle, pandas, pyarrow, psycopg2-binary, paramiko, sshtunnel, scp and update the "Install dependencies" step to pip install -r requirements-train.txt; also enable caching by using the setup-python/cache or actions/cache keyed on the Python version and requirements-train.txt (refer to the existing setup-python action and the "Install dependencies" step) to speed builds and improve reproducibility.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/train.yml:
- Around line 27-33: Update the GitHub Actions steps that reference
actions/checkout@v3 and actions/setup-python@v4 to their current supported
releases (e.g., actions/checkout@v4 and actions/setup-python@v5 or the latest
stable major versions); locate the steps that use the strings
"actions/checkout@v3" and "actions/setup-python@v4" in the workflow and replace
them with the updated action versions, then run a workflow lint or dry-run to
verify compatibility.
- Around line 46-51: Replace the fragile echo JSON creation in the "Setup Kaggle
credentials" step with a jq-based safe generator: capture the secrets
($KAGGLE_USERNAME and $KAGGLE_KEY) into environment variables and run something
like jq -n --arg u "$KAGGLE_USERNAME" --arg k "$KAGGLE_KEY"
'{username:$u,key:$k}' > ~/.kaggle/kaggle.json, then chmod 600
~/.kaggle/kaggle.json; this ensures special characters in
KAGGLE_KEY/KAGGLE_USERNAME are properly escaped and produces valid JSON.
---
Nitpick comments:
In @.github/workflows/train.yml:
- Around line 35-44: The Install dependencies step currently runs pip install
with unpinned packages; create a requirements-train.txt listing the exact
package versions (or minimum major versions) for kaggle, pandas, pyarrow,
psycopg2-binary, paramiko, sshtunnel, scp and update the "Install dependencies"
step to pip install -r requirements-train.txt; also enable caching by using the
setup-python/cache or actions/cache keyed on the Python version and
requirements-train.txt (refer to the existing setup-python action and the
"Install dependencies" step) to speed builds and improve reproducibility.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6b1f2880-10f3-48f4-85bb-c96eb2f1744e
📒 Files selected for processing (1)
.github/workflows/train.yml
```yaml
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
```
GitHub Actions versions need updating.

actions/checkout@v3 and actions/setup-python@v4 may no longer be supported on current GitHub Actions runners. Update to the latest versions.
🔧 Suggested fix

```diff
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4

       - name: Setup Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: '3.10'
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
```
🧰 Tools
🪛 actionlint (1.7.11)
[error] 28-28: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
[error] 31-31: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/train.yml around lines 27 - 33, Update the GitHub Actions
steps that reference actions/checkout@v3 and actions/setup-python@v4 to their
current supported releases (e.g., actions/checkout@v4 and
actions/setup-python@v5 or the latest stable major versions); locate the steps
that use the strings "actions/checkout@v3" and "actions/setup-python@v4" in the
workflow and replace them with the updated action versions, then run a workflow
lint or dry-run to verify compatibility.
```yaml
      - name: Setup Kaggle credentials
        run: |
          mkdir -p ~/.kaggle
          echo '{"username":"${{ secrets.KAGGLE_USERNAME }}","key":"${{ secrets.KAGGLE_KEY }}"}' \
            > ~/.kaggle/kaggle.json
          chmod 600 ~/.kaggle/kaggle.json
```
The Kaggle credentials JSON generation needs hardening.

The current approach corrupts the JSON if KAGGLE_KEY contains special characters (quotes, backslashes, etc.). Use jq to generate the JSON safely.
🛡️ Suggested fix

```diff
       - name: Setup Kaggle credentials
         run: |
           mkdir -p ~/.kaggle
-          echo '{"username":"${{ secrets.KAGGLE_USERNAME }}","key":"${{ secrets.KAGGLE_KEY }}"}' \
-            > ~/.kaggle/kaggle.json
+          jq -n \
+            --arg user "${{ secrets.KAGGLE_USERNAME }}" \
+            --arg key "${{ secrets.KAGGLE_KEY }}" \
+            '{"username": $user, "key": $key}' > ~/.kaggle/kaggle.json
           chmod 600 ~/.kaggle/kaggle.json
```

jq is preinstalled on ubuntu-latest runners, so no extra installation is needed.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
      - name: Setup Kaggle credentials
        run: |
          mkdir -p ~/.kaggle
          jq -n \
            --arg user "${{ secrets.KAGGLE_USERNAME }}" \
            --arg key "${{ secrets.KAGGLE_KEY }}" \
            '{"username": $user, "key": $key}' > ~/.kaggle/kaggle.json
          chmod 600 ~/.kaggle/kaggle.json
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/train.yml around lines 46 - 51, Replace the fragile echo
JSON creation in the "Setup Kaggle credentials" step with a jq-based safe
generator: capture the secrets ($KAGGLE_USERNAME and $KAGGLE_KEY) into
environment variables and run something like jq -n --arg u "$KAGGLE_USERNAME"
--arg k "$KAGGLE_KEY" '{username:$u,key:$k}' > ~/.kaggle/kaggle.json, then chmod
600 ~/.kaggle/kaggle.json; this ensures special characters in
KAGGLE_KEY/KAGGLE_USERNAME are properly escaped and produces valid JSON.
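The same escaping hazard can be reproduced in Python: naive string interpolation breaks on special characters, while a JSON serializer escapes them (the key value below is invented for illustration):

```python
import json

username = "some-user"
key = 'se"cr\\et'  # hypothetical key containing a quote and a backslash

# Naive interpolation, like the echo '{"username":"..."}' pattern in the workflow:
naive = '{"username":"%s","key":"%s"}' % (username, key)
try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# Serializer-built JSON (the Python analogue of jq -n --arg on the shell side):
safe = json.dumps({"username": username, "key": key})
parsed = json.loads(safe)

print(naive_ok)                # → False: the raw quote corrupts the document
print(parsed["key"] == key)    # → True: the value round-trips intact
```

Anything that builds JSON by pasting raw strings into a template has this failure mode; a serializer (or jq) sidesteps it entirely.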
Actionable comments posted: 7
♻️ Duplicate comments (1)
AI/scripts/trigger_training.py (1)
92-98: ⚠️ Potential issue | 🟠 Major

Treat status-query errors as immediate failures.

Even when `kaggle kernels status` returns non-zero, the code currently looks only at stdout and retries every 5 minutes. For auth/slug/network problems that means waiting pointlessly for up to 12 hours, so it is safer to check `returncode` first and return `False` right away.

Suggested fix

```diff
 while check_count < max_checks:
     result = subprocess.run(
         ["kaggle", "kernels", "status", slug],
         capture_output=True,
         text=True
     )
+    if result.returncode != 0:
+        print(f"  [{notebook['name']}] 상태 조회 실패! ❌")
+        print((result.stderr or result.stdout).strip())
+        return False
     output = result.stdout.lower()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/trigger_training.py` around lines 92 - 98, The code currently inspects result.stdout regardless of process success; change the logic after the subprocess.run(...) call to first check result.returncode and immediately return False (and log an error including returncode/stderr) if it's non-zero, instead of continuing into the 5-minute retry loop; update the logic around the subprocess.run(...) invocation and the subsequent use of result/output to bail out on failed `kaggle kernels status` calls.
🧹 Nitpick comments (1)
AI/scripts/trigger_training.py (1)
65-73: Keep a single source of truth for the kernel ID.

`kernels push -p` resolves its target from each `notebook_dir`'s `kernel-metadata.json`, but the success URL and status polling rebuild the slug from `KAGGLE_USERNAME`. If the owner changes or the metadata and env diverge, the push succeeds while polling watches a different kernel, so it is safer to read the `id` from the metadata and reuse it as-is.

Also applies to: 81-88
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/trigger_training.py` around lines 65 - 73, The current success URL and later polling reconstruct the kernel identifier from KAGGLE_USERNAME and notebook['slug'], which can diverge from the pushed kernel; instead, after running subprocess.run(...) to push (the kernels push -p using notebook['notebook_dir']), open and parse the kernel-metadata.json in notebook['notebook_dir'] to read the canonical kernel "id" and use that id for both the success message in the block that prints notebook['name'] and for any subsequent polling logic (the later block around lines 81-88), replacing usage of KAGGLE_USERNAME/notebook['slug'] with the metadata id to ensure a single authoritative source for the kernel identifier.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Around line 130-132: The global df.ffill() after merging price_df and macro_df
causes feature leakage across tickers; change the fill strategy to forward-fill
macro columns only within each ticker group and then fill remaining NaNs with 0.
Locate the merged dataframe operations around df = pd.merge(price_df, macro_df,
on="date", how="left"), df = df.sort_values(["ticker",
"date"]).reset_index(drop=True) and replace the df.ffill().fillna(0) step with a
grouped forward-fill for macro columns (e.g., using
df.groupby("ticker")[macro_cols].ffill()) followed by df.fillna(0) to handle
other columns.
- Around line 49-51: The save filenames in train_kaggle.py ("model_name",
"scaler_name", "metadata_name") use the new itransformer_* names but downstream
code (artifact_paths.py, itransformer/wrapper.py, deploy_to_server.py) still
expects the old multi_horizon_* names (e.g., multi_horizon_model.keras,
multi_horizon_scaler.pkl); pick one convention and make it consistent: either
change the keys in train_kaggle.py back to
"model_name":"multi_horizon_model.keras" and
"scaler_name":"multi_horizon_scaler.pkl" (and "metadata_name" if applicable) or
update artifact_paths.py, itransformer/wrapper.py (around the loader at
wrapper.py:339) and deploy_to_server.py to look for itransformer_model.keras /
itransformer_scaler.pkl / itransformer_metadata.json; ensure all references to
model_name/scaler_name/metadata_name across these modules match exactly.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py`:
- Line 260: Remove the unnecessary f-string used in the print statement by
replacing the literal print(f"\n>> 완료") with a plain string print("\n>> 완료");
locate the exact statement print(f"\n>> 완료") in tcn_train_kaggle.py (inside the
TCN training/finishing block) and remove the leading f to satisfy Ruff F541
linting.
- Around line 269-270: parser.add_argument("--end-date", default="2023-12-31") is
currently hard-coded, so automated retraining never picks up recent data; change
the default of parser.add_argument("--end-date", ...) from the hard-coded date
to a dynamic value — today's date (datetime.date.today().isoformat()) or the max
date read from the parquet file (max_date) — and apply that dynamic value only
when --end-date is not supplied (e.g., compute max_date_from_parquet() and set
default_end = max(today, max_date), or after argparse do: if args.end_date is
None: args.end_date = default_end).
- Around line 119-141: The loop that calls add_technical_indicators silently
skips up to 19 failures and then later pd.concat(processed) can raise "No
objects to concatenate" without context; change the feature-calculation loop
(the for ticker in tqdm(...) block) to log each failing ticker and its exception
(include the ticker value and exception text) when add_technical_indicators
raises, keep fail_count but fail faster or at least include a list of
failed_tickers, and after the loop check if processed is empty (or number of
successful tickers is zero) and raise a clear RuntimeError that includes
start_date/end_date, total tickers attempted, fail_count, and the failed_tickers
list before attempting to create full_df; ensure references to processed,
fail_count, failed_tickers, add_technical_indicators, and full_df are updated
accordingly.
- Line 268: The default parquet path set in the
parser.add_argument("--parquet-dir", default=os.environ.get('PARQUET_DIR',
'/kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset')) is invalid for
Kaggle; update the logic so if PARQUET_DIR is not set the code either: 1) uses a
safe default root '/kaggle/input' (so users must provide the specific dataset
slug), or 2) dynamically detects the mounted dataset by scanning /kaggle/input
(e.g., os.walk('/kaggle/input') and picking the matching dataset folder) and
sets the --parquet-dir fallback accordingly; change the code around the
--parquet-dir argument handling to prefer os.environ['PARQUET_DIR'] if present,
otherwise perform the detection or fallback to '/kaggle/input'.
- Around line 155-160: The current split uses "split_date =
dates[split_date_idx]" and then train_df = raw_df[raw_df['date'] <= split_date]
which includes the split_date in the training set and can leave val_df empty;
change the split to make train use dates strictly before the split_date and val
use >= split_date (i.e., use raw_df[raw_df['date'] < split_date] for train_df
and raw_df[raw_df['date'] >= split_date] for val_df) so that the chosen
split_date becomes the first date in the validation set; update references to
dates, split_date_idx, split_date, train_df, and val_df accordingly.
---
Duplicate comments:
In `@AI/scripts/trigger_training.py`:
- Around line 92-98: The code currently inspects result.stdout regardless of
process success; change the logic after the subprocess.run(...) call to first
check result.returncode and immediately return False (and log an error including
returncode/stderr) if it's non-zero, instead of continuing into the 5-minute
retry loop; update the logic around the subprocess.run(...) invocation and the
subsequent use of result/output to bail out on failed `kaggle kernels status`
calls.
---
Nitpick comments:
In `@AI/scripts/trigger_training.py`:
- Around line 65-73: The current success URL and later polling reconstruct the
kernel identifier from KAGGLE_USERNAME and notebook['slug'], which can diverge
from the pushed kernel; instead, after running subprocess.run(...) to push (the
kernels push -p using notebook['notebook_dir']), open and parse the
kernel-metadata.json in notebook['notebook_dir'] to read the canonical kernel
"id" and use that id for both the success message in the block that prints
notebook['name'] and for any subsequent polling logic (the later block around
lines 81-88), replacing usage of KAGGLE_USERNAME/notebook['slug'] with the
metadata id to ensure a single authoritative source for the kernel identifier.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 88b44e3d-2441-4d9c-be85-f620b46bfdfa
📒 Files selected for processing (5)
- AI/modules/signal/core/artifact_paths.py
- AI/modules/signal/models/TCN/tcn_train_kaggle.py
- AI/modules/signal/models/itransformer/train_kaggle.py
- AI/scripts/trigger_training.py
- AI/scripts/upload_to_kaggle.py
🚧 Files skipped from review as they are similar to previous changes (1)
- AI/modules/signal/core/artifact_paths.py
```python
"model_name"    : "itransformer_model.keras",
"scaler_name"   : "itransformer_scaler.pkl",
"metadata_name" : "itransformer_metadata.json",
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Check whether the iTransformer artifact names are reflected in downstream consumer code.
rg -n -C2 'itransformer_model\.keras|itransformer_scaler\.pkl|itransformer_metadata\.json|multi_horizon_model\.keras|multi_horizon_scaler\.pkl' AI
```

Repository: SISC-IT/sisc-web
Length of output: 4064
Align the saved-filename contract with downstream.

The iTransformer training script (train_kaggle.py lines 49-51) saves itransformer_model.keras, itransformer_scaler.pkl, and itransformer_metadata.json, but downstream code (artifact_paths.py, itransformer/wrapper.py, deploy_to_server.py) still recognizes the old convention (multi_horizon_model.keras, multi_horizon_scaler.pkl). In particular, wrapper.py line 339 tries to load multi_horizon_scaler.pkl, so the auto-load/deploy step after training will fail to find the file.

Either unify on the old convention, or update all downstream code to the new iTransformer names.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 49 - 51,
The save filenames in train_kaggle.py ("model_name", "scaler_name",
"metadata_name") use the new itransformer_* names but downstream code
(artifact_paths.py, itransformer/wrapper.py, deploy_to_server.py) still expects
the old multi_horizon_* names (e.g., multi_horizon_model.keras,
multi_horizon_scaler.pkl); pick one convention and make it consistent: either
change the keys in train_kaggle.py back to
"model_name":"multi_horizon_model.keras" and
"scaler_name":"multi_horizon_scaler.pkl" (and "metadata_name" if applicable) or
update artifact_paths.py, itransformer/wrapper.py (around the loader at
wrapper.py:339) and deploy_to_server.py to look for itransformer_model.keras /
itransformer_scaler.pkl / itransformer_metadata.json; ensure all references to
model_name/scaler_name/metadata_name across these modules match exactly.
```python
df = pd.merge(price_df, macro_df, on="date", how="left")
df = df.sort_values(["ticker", "date"]).reset_index(drop=True)
df = df.ffill().fillna(0)
```
The global ffill() leaks features across tickers.

After sorting df by ["ticker", "date"], ffill() fills the next ticker's first row — its log_return/ret_1d and missing macro values — with the previous ticker's last values. Macro values can even bleed from a later date into an earlier one, contaminating the training data. It is safer to forward-fill only the macro columns per ticker and finish with fillna(0).
Suggested fix

```diff
 df = pd.merge(price_df, macro_df, on="date", how="left")
 df = df.sort_values(["ticker", "date"]).reset_index(drop=True)
-df = df.ffill().fillna(0)
+macro_fill_cols = [c for c in macro_df.columns if c != "date"]
+if macro_fill_cols:
+    df[macro_fill_cols] = df.groupby("ticker")[macro_fill_cols].ffill()
+df = df.fillna(0)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 130 -
132, The global df.ffill() after merging price_df and macro_df causes feature
leakage across tickers; change the fill strategy to forward-fill macro columns
only within each ticker group and then fill remaining NaNs with 0. Locate the
merged dataframe operations around df = pd.merge(price_df, macro_df, on="date",
how="left"), df = df.sort_values(["ticker", "date"]).reset_index(drop=True) and
replace the df.ffill().fillna(0) step with a grouped forward-fill for macro
columns (e.g., using df.groupby("ticker")[macro_cols].ffill()) followed by
df.fillna(0) to handle other columns.
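The leakage is reproducible without pandas; the two helpers below mimic a global `ffill()` versus a per-ticker `groupby(...).ffill()` on invented rows:

```python
# Rows sorted by (ticker, date); None marks a missing macro value.
rows = [
    {"ticker": "AAA", "date": 1, "vix": 15.0},
    {"ticker": "AAA", "date": 2, "vix": None},  # should inherit AAA's 15.0
    {"ticker": "BBB", "date": 1, "vix": None},  # must NOT inherit AAA's value
    {"ticker": "BBB", "date": 2, "vix": 20.0},
]

def ffill_global(rows, col):
    """Mimics df[col].ffill(): carries the last seen value across ticker boundaries."""
    out, last = [], None
    for r in rows:
        last = r[col] if r[col] is not None else last
        out.append(last)
    return out

def ffill_by_ticker(rows, col):
    """Mimics df.groupby('ticker')[col].ffill(): resets the carry per ticker."""
    out, last = [], {}
    for r in rows:
        key = r["ticker"]
        last[key] = r[col] if r[col] is not None else last.get(key)
        out.append(last[key])
    return out

print(ffill_global(rows, "vix"))     # → [15.0, 15.0, 15.0, 20.0]  (BBB leaked AAA's 15.0)
print(ffill_by_ticker(rows, "vix"))  # → [15.0, 15.0, None, 20.0]  (gap left for fillna(0))
```

The grouped version leaves the boundary gap as None, which the trailing `fillna(0)` then fills instead of another ticker's stale value.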
```python
raw_df = raw_df[
    (raw_df['date'] >= start_date) &
    (raw_df['date'] <= end_date)
].copy()

print(f">> 로드 완료: {len(raw_df):,}행, {raw_df['ticker'].nunique()}개 종목")

# 피처 계산
print(">> 피처 계산 중...")
processed = []
fail_count = 0

for ticker in tqdm(raw_df['ticker'].unique(), desc="피처 계산"):
    df_t = raw_df[raw_df['ticker'] == ticker].copy()
    try:
        df_t = add_technical_indicators(df_t)
        processed.append(df_t)
    except Exception as e:
        fail_count += 1
        if fail_count >= 20:
            raise RuntimeError("피처 계산 실패가 20개를 초과했습니다.")

full_df = pd.concat(processed).reset_index(drop=True)
```
Silently dropped preprocessing failures end in partial training or an opaque error.

The current implementation skips up to 19 add_technical_indicators() failures outright, so training can proceed on a subset of tickers with no way to tell bugs from data issues. Worse, if the date filter yields nothing or every ticker fails, Line 141 dies with only "No objects to concatenate", erasing the cause.
🛠️ Suggested fix

```diff
 raw_df = raw_df[
-    (raw_df['date'] >= start_date) &
-    (raw_df['date'] <= end_date)
+    (raw_df["date"] >= start_date) &
+    (raw_df["date"] <= end_date)
 ].copy()
+if raw_df.empty:
+    raise ValueError("지정한 날짜 범위에 해당하는 데이터가 없습니다.")
@@
-    except Exception as e:
+    except Exception as err:
         fail_count += 1
+        print(f">> 피처 계산 실패: {ticker} ({fail_count}개) - {err}")
         if fail_count >= 20:
-            raise RuntimeError("피처 계산 실패가 20개를 초과했습니다.")
+            raise RuntimeError("피처 계산 실패가 20개를 초과했습니다.") from err
-full_df = pd.concat(processed).reset_index(drop=True)
+if not processed:
+    raise ValueError("피처 계산에 성공한 종목이 없습니다.")
+full_df = pd.concat(processed, ignore_index=True)
```

🧰 Tools
🪛 Ruff (0.15.7)
[warning] 136-136: Do not catch blind exception: Exception
(BLE001)
[warning] 139-139: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` around lines 119 - 141, The
loop that calls add_technical_indicators silently skips up to 19 failures and
then later pd.concat(processed) can raise "No objects to concatenate" without
context; change the feature-calculation loop (the for ticker in tqdm(...) block)
to log each failing ticker and its exception (include the ticker value and
exception text) when add_technical_indicators raises, keep fail_count but fail
faster or at least include a list of failed_tickers, and after the loop check if
processed is empty (or number of successful tickers is zero) and raise a clear
RuntimeError that includes start_date/end_date, total tickers attempted,
fail_count, and the failed_tickers list before attempting to create full_df;
ensure references to processed, fail_count, failed_tickers,
add_technical_indicators, and full_df are updated accordingly.
```python
dates = raw_df['date'].sort_values().unique()
split_date_idx = int(len(dates) * 0.8)
split_date = dates[split_date_idx]

train_df = raw_df[raw_df['date'] <= split_date].copy()
val_df = raw_df[raw_df['date'] > split_date].copy()
```
The train/val split is off by one day and can leave the validation set empty.

Splitting with <= / > after split_date = dates[int(len(dates) * 0.8)] puts the split date on the train side. With 5 unique dates, for example, int(5 * 0.8) == 4, so the val set on Line 160 is immediately an empty DataFrame.
🛠️ Suggested fix

```diff
 dates = raw_df['date'].sort_values().unique()
 split_date_idx = int(len(dates) * 0.8)
-split_date = dates[split_date_idx]
+if split_date_idx <= 0 or split_date_idx >= len(dates):
+    raise ValueError("Train/Val 분할을 위해 최소 2개 이상의 거래일이 필요합니다.")
+train_dates = dates[:split_date_idx]
+val_dates = dates[split_date_idx:]

-train_df = raw_df[raw_df['date'] <= split_date].copy()
-val_df = raw_df[raw_df['date'] > split_date].copy()
-print(f">> Train: ~{split_date}, Val: {split_date}~")
+train_df = raw_df[raw_df['date'].isin(train_dates)].copy()
+val_df = raw_df[raw_df['date'].isin(val_dates)].copy()
+print(f">> Train: {train_dates[0]}~{train_dates[-1]}, Val: {val_dates[0]}~{val_dates[-1]}")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` around lines 155 - 160, The
current split uses "split_date = dates[split_date_idx]" and then train_df =
raw_df[raw_df['date'] <= split_date] which includes the split_date in the
training set and can leave val_df empty; change the split to make train use
dates strictly before the split_date and val use >= split_date (i.e., use
raw_df[raw_df['date'] < split_date] for train_df and raw_df[raw_df['date'] >=
split_date] for val_df) so that the chosen split_date becomes the first date in
the validation set; update references to dates, split_date_idx, split_date,
train_df, and val_df accordingly.
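A minimal sketch of the date-based split this prompt asks for, with the boundary date assigned to the validation side (the `time_split` helper name is illustrative):

```python
import pandas as pd

def time_split(df: pd.DataFrame, ratio: float = 0.8):
    """Split rows by unique date; the boundary date goes to validation,
    so the validation set can never come out empty."""
    dates = df['date'].sort_values().unique()
    idx = int(len(dates) * ratio)
    if idx <= 0 or idx >= len(dates):
        raise ValueError("At least 2 distinct trading days are required.")
    split_date = dates[idx]
    # '<' / '>=' (rather than '<=' / '>') keeps split_date in validation
    train = df[df['date'] < split_date].copy()
    val = df[df['date'] >= split_date].copy()
    return train, val
```

With 5 unique dates, `int(5 * 0.8) == 4`, so the `<=` / `>` variant would place all 5 dates in train; this version keeps the last date in validation.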
```python
with open(metadata_path, "w", encoding="utf-8") as f:
    json.dump(metadata, f, ensure_ascii=False, indent=2)

print(f"\n>> 완료")
```
The unnecessary f prefix triggers Ruff F541.
The string contains no placeholders, so the f serves no purpose. If a lint gate is in place, CI will fail here.
🛠️ Suggested fix

```diff
- print(f"\n>> 완료")
+ print("\n>> 완료")
```
🧰 Tools
🪛 Ruff (0.15.7)
[error] 260-260: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` at line 260, Remove the
unnecessary f-string used in the print statement by replacing the literal
print(f"\n>> 완료") with a plain string print("\n>> 완료"); locate the exact
statement print(f"\n>> 완료") in tcn_train_kaggle.py (inside the TCN
training/finishing block) and remove the leading f to satisfy Ruff F541 linting.
```python
def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Train TCN signal model (Kaggle/Actions 버전)")
    parser.add_argument("--parquet-dir", default=os.environ.get('PARQUET_DIR', '/kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset'))
```
🧩 Analysis chain
🌐 Web query:
What is the default mount path when a dataset is attached in a Kaggle Notebook? Is /kaggle/input/datasets// also a valid runtime path, or should the default be /kaggle/input/?
💡 Result:
When a dataset is attached in a Kaggle Notebook, the default mount path is /kaggle/input/. Each attached dataset is mounted as a directory under /kaggle/input/ named after the dataset's slug (or its name in owner/slug form). For example, the "aviation-accident-database-synopses" dataset appears at /kaggle/input/aviation-accident-database-synopses/AviationData.csv. The /kaggle/input/datasets// form is not standard and is not valid. The default is the /kaggle/input/ form, and the safe approach is to confirm the actual path via os.walk('/kaggle/input') or the notebook's Data tab. Relying on /kaggle/input/ alone is not recommended because of possible name collisions.
Citations:
- 1: https://www.kaggle.com/docs/notebooks
- 2: https://www.kaggle.com/code/dansbecker/finding-your-files-in-kaggle-kernels
- 3: https://www.kaggle.com/discussions/getting-started/25930
- 4: https://www.kaggle.com/discussions/product-feedback/31738
The Kaggle default input path needs fixing
The current default /kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset is not Kaggle's standard mount path. In a Kaggle Notebook, attached datasets are mounted only as /kaggle/input/<dataset-slug> or /kaggle/input/<owner-dataset-slug>. Since the /kaggle/input/datasets/ path is invalid, the script will fail whenever the PARQUET_DIR environment variable is unset.
Verify the dataset's exact slug and correct the default accordingly, or detect the mounted path at runtime with os.walk('/kaggle/input').
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` at line 268, The default
parquet path set in the parser.add_argument("--parquet-dir",
default=os.environ.get('PARQUET_DIR',
'/kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset')) is invalid for
Kaggle; update the logic so if PARQUET_DIR is not set the code either: 1) uses a
safe default root '/kaggle/input' (so users must provide the specific dataset
slug), or 2) dynamically detects the mounted dataset by scanning /kaggle/input
(e.g., os.walk('/kaggle/input') and picking the matching dataset folder) and
sets the --parquet-dir fallback accordingly; change the code around the
--parquet-dir argument handling to prefer os.environ['PARQUET_DIR'] if present,
otherwise perform the detection or fallback to '/kaggle/input'.
```python
parser.add_argument("--start-date", default="2015-01-01")
parser.add_argument("--end-date", default="2023-12-31")
```
The default end-date is hard-coded to 2023-12-31, so automated retraining never sees recent data.
With today being 2026-03-27, the current default silently excludes all data after 2024-01-01 in every run that does not override it. For a weekly pipeline, defaulting to the run date or the parquet's maximum date is safer.
🛠️ Suggested fix

```diff
  parser.add_argument("--start-date", default="2015-01-01")
- parser.add_argument("--end-date", default="2023-12-31")
+ parser.add_argument(
+     "--end-date",
+     default=os.environ.get("END_DATE", pd.Timestamp.today().strftime("%Y-%m-%d")),
+ )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` around lines 269 - 270, the default is currently fixed as parser.add_argument("--end-date", default="2023-12-31"), so automated retraining cannot reflect the latest data; change the default of parser.add_argument("--end-date", ...) from the hard-coded date to a dynamic value, using today's date (datetime.date.today().isoformat()) or the maximum date read from the parquet files (max_date), and implement the argparse handling so this dynamic value is used only when --end-date is not given (e.g. set default_end = max(today, max_date) after computing max_date_from_parquet(), or after argparse do: if args.end_date is None: args.end_date = default_end).
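The dynamic default this prompt describes could look like the following (a sketch; the `END_DATE` environment variable mirrors the diff above, and falling back to today's date rather than the parquet max date is an assumption):

```python
import argparse
import datetime as dt
import os

def parse_args(argv=None) -> argparse.Namespace:
    """Default --end-date to $END_DATE when set, else today's date,
    so weekly retraining always covers recent data."""
    default_end = os.environ.get("END_DATE", dt.date.today().isoformat())
    parser = argparse.ArgumentParser()
    parser.add_argument("--start-date", default="2015-01-01")
    parser.add_argument("--end-date", default=default_end)
    return parser.parse_args(argv)
```

An explicit `--end-date` on the command line still takes precedence over both the environment variable and the computed default.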
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/modules/signal/models/PatchTST/train.py`:
- Around line 217-219: The current split forces n_val = max(1, int(len(tickers)
* 0.2)) which makes train_tickers empty when len(tickers) == 1 and leads to
divide-by-zero later; change the logic around tickers, n_val and val_tickers so
n_val is at most len(tickers)-1 (e.g. n_val = min(max(1, int(len(tickers) *
0.2)), max(0, len(tickers) - 1))), and if len(tickers) < 2 either raise a clear
error or handle the single-ticker case explicitly (ensure train_tickers is
non-empty before proceeding and adjust downstream averaging logic in the
training loop that uses train_tickers/val_tickers).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3914cdca-d1bf-467b-adcb-47a133d02b6a
📒 Files selected for processing (1)
AI/modules/signal/models/PatchTST/train.py
```python
tickers = full_df['ticker'].unique()
n_val = max(1, int(len(tickers) * 0.2))
val_tickers = tickers[-n_val:]  # 마지막 20% 티커를 val로 (시간 순서 보존)
```
The validation split logic can leave the training set empty and fail immediately.
Line 218 forces n_val to at least 1, so with a single ticker, train_tickers ends up empty. Proceeding in that state raises a division-by-zero in the average-loss computation at Lines 291/301. It is safer to validate a minimum ticker count before splitting and cap the split ratio as well.
Suggested fix

```diff
- tickers = full_df['ticker'].unique()
- n_val = max(1, int(len(tickers) * 0.2))
- val_tickers = tickers[-n_val:]  # 마지막 20% 티커를 val로 (시간 순서 보존)
+ tickers = full_df['ticker'].unique()
+ if len(tickers) < 2:
+     raise ValueError("train/val 분할을 위해 최소 2개 이상의 ticker가 필요합니다.")
+ n_val = min(max(1, int(len(tickers) * 0.2)), len(tickers) - 1)
+ val_tickers = tickers[-n_val:]  # 마지막 20% 티커를 val로
  train_tickers = tickers[:-n_val]
@@
  X_train, y_train = build_sequences(train_df, scaler, fit_scaler=True)
  X_val, y_val = build_sequences(val_df, scaler, fit_scaler=False)
+ if len(X_train) == 0 or len(X_val) == 0:
+     raise ValueError("시퀀스 생성 결과가 비어 있습니다. 분할/입력 데이터를 확인하세요.")
```

🤖 Prompt for AI Agents
+ raise ValueError("시퀀스 생성 결과가 비어 있습니다. 분할/입력 데이터를 확인하세요.")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/PatchTST/train.py` around lines 217 - 219, The
current split forces n_val = max(1, int(len(tickers) * 0.2)) which makes
train_tickers empty when len(tickers) == 1 and leads to divide-by-zero later;
change the logic around tickers, n_val and val_tickers so n_val is at most
len(tickers)-1 (e.g. n_val = min(max(1, int(len(tickers) * 0.2)), max(0,
len(tickers) - 1))), and if len(tickers) < 2 either raise a clear error or
handle the single-ticker case explicitly (ensure train_tickers is non-empty
before proceeding and adjust downstream averaging logic in the training loop
that uses train_tickers/val_tickers).
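The clamped split the prompt describes can be sketched as follows (the `split_tickers` helper name is illustrative):

```python
def split_tickers(tickers, val_ratio: float = 0.2):
    """Cap the validation count at len(tickers) - 1 so the training set
    is never empty, and refuse to split fewer than 2 tickers."""
    if len(tickers) < 2:
        raise ValueError("At least 2 tickers are required for a train/val split.")
    n_val = min(max(1, int(len(tickers) * val_ratio)), len(tickers) - 1)
    return list(tickers[:-n_val]), list(tickers[-n_val:])
```

With two tickers the clamp yields one for each side; with one ticker it fails loudly instead of producing an empty training set.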
Overview
Built a pipeline that automates deep-learning model training on Kaggle GPUs, which was previously impossible due to the production server's 2 GB memory limit.
Closes #343
Changes
Added
Modified
Summary by CodeRabbit
New features
Other