Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough

This PR introduces GPT-5 model support with conditional UI logic, refactors API key handling in evaluations to use authentication state, adds CSV sanitization utilities, and centralizes model configuration. Changes span model management, multiple component pages, and API integration across the application.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
🧹 Nitpick comments (4)
app/lib/utils.ts (1)
4-10: Consider centralizing timestamp normalization to avoid logic drift.
`timeAgo` now duplicates normalization logic that also exists in `app/lib/useConfigs.ts:456-471` (`formatRelativeTime`). This duplication has already diverged (negative offset handling). A shared helper would prevent future regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/lib/utils.ts` around lines 4-10: the time normalization logic in `timeAgo` duplicates code from `formatRelativeTime`, causing drift. Extract a single helper (e.g. `normalizeTimestamp` or `parseTimestampWithUTC`) and replace the normalization in both `timeAgo` and `formatRelativeTime` with a call to it. The helper should accept a date string, apply the same "append Z when no timezone" rule and the negative-offset handling currently used in `formatRelativeTime`, and return a `Date` object so existing callers can use it without duplicating logic.

app/components/prompt-editor/CurrentConfigTab.tsx (1)
228-230: Consider using dynamic provider key for future extensibility.

The model options are currently hardcoded to `MODEL_OPTIONS.openai`. When additional providers are enabled, this will need to be updated to use the `provider` prop dynamically.

♻️ Proposed fix for future provider support

```diff
- {MODEL_OPTIONS.openai.map((m) => (
+ {MODEL_OPTIONS[provider as keyof typeof MODEL_OPTIONS]?.map((m) => (
    <option key={m.value} value={m.value}>{m.label}</option>
  ))}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/components/prompt-editor/CurrentConfigTab.tsx` around lines 228-230: the options list is hardcoded to `MODEL_OPTIONS.openai`, which prevents switching providers. Update the mapping to use the component's `provider` prop dynamically (e.g. `MODEL_OPTIONS[provider]` or a safe getter like `getModelOptions(provider)`), and guard against missing providers by falling back to a default provider or an empty array so the `<option>` mapping works for future providers.

app/configurations/prompt-editor/types.ts (1)
26-30: Consider making `max_num_results` optional in the `Tool` interface for consistency.

The `Tool` interface defines `max_num_results` as required, but GPT-5 models don't support this parameter. When loading a saved GPT-5 config that omits `max_num_results`, the tools array reconstruction could fail type validation or produce unexpected behavior since the field is still expected.

♻️ Proposed fix

```diff
 export interface Tool {
   type: 'file_search';
   knowledge_base_ids: string[];
-  max_num_results: number;
+  max_num_results?: number;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/configurations/prompt-editor/types.ts` around lines 26-30: the `Tool` interface currently requires `max_num_results`, which breaks loading GPT-5 configs that omit it. Make `max_num_results` optional (`max_num_results?: number`) and ensure any code constructing or validating `Tool` instances (the tools array reconstruction logic and any validators/serializers) handles the undefined case safely by falling back to a sensible default or skipping use when undefined.

app/components/ConfigDrawer.tsx (1)
809-810: Consider handling undefined temperature in the Saved tab display.

Line 810 displays `T:{version.temperature}` for all saved configs. For GPT-5 configs where temperature is omitted, this could display `T:undefined`. Consider conditionally showing this only when temperature is defined.

♻️ Proposed fix

```diff
 <div className="text-xs mb-2" style={{ color: colors.text.secondary }}>
-  {formatTimestamp(version.timestamp)} • {version.provider}/{version.modelName} • T:{version.temperature}
+  {formatTimestamp(version.timestamp)} • {version.provider}/{version.modelName}{version.temperature !== undefined ? ` • T:${version.temperature}` : ''}
 </div>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/components/ConfigDrawer.tsx` around lines 809-810: the Saved tab renders `T:{version.temperature}`, which can show `T:undefined` when `version.temperature` is absent. Include the temperature segment only when `version.temperature` is not null/undefined (e.g. with a ternary or `&&`), adjusting the bullet separators so none dangle when temperature is omitted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/lib/utils.ts`:
- Around line 5-8: The timezone detection around the date creation in timeAgo
incorrectly only checks for "Z" or "+" on dateStr; update the condition that
builds date (the dateStr check used before new Date(dateStr) in timeAgo) to also
detect negative offsets by matching a timezone offset pattern (e.g. Z or +HH:MM
or -HH:MM) using a regex like /Z|[+\-]\d{2}(:\d{2})?$/. Replace the simple
includes(...) check with this regex test so valid strings like
"2026-03-24T09:00:49-05:00" are treated as timezone-aware and not suffixed with
"Z", preventing the Invalid time value error when constructing the Date.
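The regex-based fix this prompt describes can be sketched as follows. This is a minimal sketch only: `parseTimestampWithUTC` is one of the helper names suggested above, and the anchored form of the regex is an assumption about the final implementation.

```typescript
// Sketch: treat a timestamp as timezone-aware when it ends in "Z" or an
// explicit +HH:MM / -HH:MM offset; otherwise assume UTC and append "Z".
// Anchoring the whole alternation avoids matching a stray "Z" elsewhere.
const TZ_SUFFIX = /(Z|[+-]\d{2}(:\d{2})?)$/;

function parseTimestampWithUTC(dateStr: string): Date {
  // "2026-03-24T09:00:49-05:00" matches and is parsed as-is;
  // "2026-03-24T09:00:49" does not match, so it is normalized to UTC.
  return new Date(TZ_SUFFIX.test(dateStr) ? dateStr : `${dateStr}Z`);
}
```

Note the pattern assumes a time component is present; a bare date such as `2026-03-24` would falsely match on `-24` and would need extra handling.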
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ea765e1a-3fe8-47d2-82a0-46a69693c567
📒 Files selected for processing (7)
- app/components/ConfigDrawer.tsx
- app/components/prompt-editor/ConfigEditorPane.tsx
- app/components/prompt-editor/CurrentConfigTab.tsx
- app/configurations/prompt-editor/page.tsx
- app/configurations/prompt-editor/types.ts
- app/lib/models.ts
- app/lib/utils.ts
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/components/evaluations/EvaluationsTab.tsx (1)
66-125: ⚠️ Potential issue | 🟠 Major

Scope evaluation state to the current API key.

`evalJobs` and `assistantConfigs` survive API-key changes, so this component can render runs/configs fetched under the previous credential until the next requests finish. Because the config cache is keyed only by `assistant_id`, `assistantConfigs.has(job.assistant_id)` can also suppress a refetch and keep showing a config loaded under the wrong key indefinitely. Clear both states when `activeKey.key` changes and ignore or abort older requests before they commit state.

Possible direction

```diff
+  useEffect(() => {
+    setEvalJobs([]);
+    setAssistantConfigs(new Map());
+    setError(null);
+  }, [activeKey?.key]);
```

Then make both fetch helpers ignore aborted/stale responses so an older key cannot repopulate the cleared state.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/components/evaluations/EvaluationsTab.tsx` around lines 66 - 125, When the API key changes, clear evalJobs and assistantConfigs and ensure in-flight fetches cannot repopulate cleared state: update the activeKey effect to reset setEvalJobs([]) and setAssistantConfigs(new Map()) whenever activeKey?.key changes, and modify fetchEvaluations and fetchAssistantConfig to accept/use an AbortController or capture the current activeKey value so that responses are ignored if the request is aborted or the key has changed; specifically, add abort handling in fetchEvaluations and fetchAssistantConfig (or check that activeKey.key matches the key used to start the fetch) and bail out before calling setEvalJobs or setAssistantConfigs to avoid stale-data commits.
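One framework-agnostic way to express the stale-response guard this prompt asks for is a latest-request-wins wrapper. The `makeLatestOnly` helper below is hypothetical illustration, not existing code in this PR:

```typescript
// Latest-request-wins guard: each call bumps a token, and a response only
// commits if its token is still the newest, so an older in-flight request
// (e.g. one started under a previous API key) can never overwrite fresher state.
function makeLatestOnly<T>(commit: (value: T) => void) {
  let latest = 0;
  return async (request: () => Promise<T>): Promise<void> => {
    const token = ++latest;
    const value = await request();
    if (token === latest) commit(value); // stale responses are dropped
  };
}
```

In the component, `commit` would be `setEvalJobs`/`setAssistantConfigs`, with the token bumped (or the wrapper recreated) whenever `activeKey.key` changes; an `AbortController` achieves the same effect while also cancelling the network request itself.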
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/(routes)/configurations/prompt-editor/page.tsx`:
- Around line 297-302: The params payload may include temperature: undefined
when switching a GPT-5-loaded config back to a non-GPT-5 model; update the
branch that builds params (the object containing model, instructions:
currentContent, ...(!gpt5 && { temperature, })) to normalize temperature by
using currentConfigBlob.completion.params.temperature ?? 0.7 so it always sends
a concrete default for non-GPT-5 models; locate the params construction in
page.tsx (referencing variables model, currentContent, gpt5, currentConfigBlob)
and replace the temperature expression with the nullish-coalescing default.
- Around line 89-95: The UI is using stableVersionItemsMap for versionItemsMap
which hides live cache misses; change the binding so the live
hookVersionItemsMap is used (or takes precedence) for cache-miss detection while
still falling back to stableVersionItemsMap for stability. Replace the current
const versionItemsMap = stableVersionItemsMap with logic that uses
hookVersionItemsMap when it has entries (e.g. use hookVersionItemsMap if
Object.keys(hookVersionItemsMap).length > 0, or merge like {
...stableVersionItemsMap, ...hookVersionItemsMap } with hookVersionItemsMap
taking precedence) so the refill effect that reads configState.versionItemsCache
sees live misses and refetches correctly.
In `@app/(routes)/evaluations/page.tsx`:
- Around line 75-96: When activeKey changes you should immediately clear dataset
state and protect loadStoredDatasets against stale responses: in the effect that
watches activeKey (and before calling loadStoredDatasets) call
setStoredDatasets([]) and setSelectedDatasetId(undefined) to remove previous key
data, and inside loadStoredDatasets implement a request-id or AbortController
guard (e.g., create a local token/requestId captured by the async call and
verify it before calling setStoredDatasets) so an older in-flight response
cannot overwrite the fresh results for the current activeKey; keep references to
activeKey.key, loadStoredDatasets, storedDatasets, and selectedDatasetId to
locate where to add these changes.
In `@app/components/evaluations/EvaluationsTab.tsx`:
- Around line 123-125: The current useEffect only calls fetchEvaluations once
per key; change it to poll every 10 seconds while a key is active: inside the
useEffect that references activeKey and fetchEvaluations, call fetchEvaluations
immediately if activeKey?.key exists, then start a setInterval that calls
fetchEvaluations every 10000ms; return a cleanup function that clears the
interval to avoid leaks. Ensure the effect depends on activeKey.key (or
activeKey) and fetchEvaluations so polling restarts when the key changes.
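The merge-with-precedence binding suggested for `versionItemsMap` earlier in this list can be sketched as below; the map shape is an assumption for illustration, not verified against the page's actual types:

```typescript
type VersionItems = Record<string, unknown[]>;

// Live entries win, so cache misses in the hook stay visible to the refill
// effect, while stable entries back-fill keys the hook has dropped.
function resolveVersionItemsMap(stable: VersionItems, live: VersionItems): VersionItems {
  return { ...stable, ...live };
}
```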
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9930a1dd-bec6-496d-949b-b225ee17d108
📒 Files selected for processing (13)
- app/(routes)/configurations/prompt-editor/page.tsx
- app/(routes)/configurations/prompt-editor/types.ts
- app/(routes)/evaluations/page.tsx
- app/components/ConfigDrawer.tsx
- app/components/ConfigSelector.tsx
- app/components/evaluations/DatasetsTab.tsx
- app/components/evaluations/EvaluationsTab.tsx
- app/components/prompt-editor/ConfigEditorPane.tsx
- app/components/prompt-editor/CurrentConfigTab.tsx
- app/hooks/useConfigs.ts
- app/lib/constants.ts
- app/lib/models.ts
- app/lib/utils.ts
✅ Files skipped from review due to trivial changes (4)
- app/(routes)/configurations/prompt-editor/types.ts
- app/hooks/useConfigs.ts
- app/lib/models.ts
- app/components/prompt-editor/CurrentConfigTab.tsx
🚧 Files skipped from review as they are similar to previous changes (4)
- app/lib/utils.ts
- app/components/prompt-editor/ConfigEditorPane.tsx
- app/components/ConfigDrawer.tsx
- app/components/ConfigSelector.tsx
…ntend into feat/support-new-models-configuration
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
app/components/evaluations/EvaluationsTab.tsx (1)
66-88: ⚠️ Potential issue | 🟠 Major

Clear and guard key-scoped evaluation state on API-key changes.

Both fetch paths still update component state unconditionally, so switching `activeKey` can briefly show the previous key's runs/configs, and an older in-flight response can overwrite the new key's state. Reset `evalJobs`/`assistantConfigs` when the key changes and ignore stale responses before calling `setEvalJobs` or `setAssistantConfigs`.

Also applies to: 91-113
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/components/evaluations/EvaluationsTab.tsx` around lines 66 - 88, Reset the key-scoped state and guard against stale in-flight responses: when activeKey changes, immediately clear evalJobs and assistantConfigs (call setEvalJobs([]) and setAssistantConfigs([])) and use a per-request token (e.g., incrementing requestId local to fetchEvaluations and the similar fetch in the 91-113 block or an AbortController) captured in the async closure; before calling setEvalJobs or setAssistantConfigs inside fetchEvaluations (and the other fetch function), verify the token matches the latest activeKey token (or that the request wasn’t aborted) to ignore stale responses and prevent older results from overwriting state.app/(main)/evaluations/page.tsx (1)
205-221: ⚠️ Potential issue | 🟠 Major

Return `false` from the validation branches to match the prop contract.

`EvaluationsTab` expects `handleRunEvaluation: () => Promise<boolean>`, but the bare `return;` statements at lines 208, 212, 216, and 220 widen this to `Promise<boolean | undefined>`. This violates the contract and leaves the caller with an undeclared third state.

Fix

```diff
 const handleRunEvaluation = async () => {
   if (!activeKey?.key) {
     toast.error("Please select an API key first");
-    return;
+    return false;
   }
   if (!selectedDatasetId) {
     toast.error("Please select a dataset first");
-    return;
+    return false;
   }
   if (!experimentName.trim()) {
     toast.error("Please enter an evaluation name");
-    return;
+    return false;
   }
   if (!selectedConfigId || !selectedConfigVersion) {
     toast.error("Please select a configuration before running evaluation");
-    return;
+    return false;
   }
```
Verify each finding against the current code and only fix it if needed. In `@app/(main)/evaluations/page.tsx` around lines 205-221: the validation branches inside the async function `handleRunEvaluation` currently use bare returns, which produce `Promise<void | boolean>`. Update each early-exit validation (the checks for `activeKey`, `selectedDatasetId`, `experimentName`, and `selectedConfigId`/`selectedConfigVersion`) to explicitly return `false` so the signature matches `EvaluationsTab`'s `handleRunEvaluation: () => Promise<boolean>`, and ensure the successful path returns `true` at the end so callers always receive a boolean.
♻️ Duplicate comments (3)
app/(main)/configurations/prompt-editor/page.tsx (1)
288-290: ⚠️ Potential issue | 🟠 Major

Default non-GPT-5 temperature before save.

On line 289, the non-GPT-5 branch can still emit `temperature: undefined` if a GPT-5-loaded config is switched to a non-GPT-5 model and saved without touching the slider. Please normalize with `?? 0.7`.

Suggested fix

```diff
 ...(!gpt5 && {
-  temperature: currentConfigBlob.completion.params.temperature,
+  temperature: currentConfigBlob.completion.params.temperature ?? 0.7,
 }),
```
Verify each finding against the current code and only fix it if needed. In `@app/(main)/configurations/prompt-editor/page.tsx` around lines 288-290: the non-GPT-5 branch can emit `temperature: undefined` when switching from a GPT-5 config. Update the spread that adds temperature for non-GPT-5 models (the `...(!gpt5 && { temperature: currentConfigBlob.completion.params.temperature, })` spot) to use a normalized default, e.g. `currentConfigBlob.completion.params.temperature ?? 0.7`, so saved configs always have a concrete temperature.

app/components/evaluations/EvaluationsTab.tsx (1)
123-125: ⚠️ Potential issue | 🟠 Major

Restore the 10-second status polling.

This effect only fetches once per key, so runs that move from `pending`/`processing` to `completed` stay stale until the user manually refreshes. Based on learnings: implement polling every 10 seconds for job status updates in evaluation workflows instead of using WebSockets or server-sent events.

Possible direction

```diff
 useEffect(() => {
-  if (activeKey?.key) fetchEvaluations();
+  if (!activeKey?.key) return;
+
+  fetchEvaluations();
+  const intervalId = window.setInterval(fetchEvaluations, 10_000);
+
+  return () => {
+    window.clearInterval(intervalId);
+  };
 }, [activeKey, fetchEvaluations]);
```
Verify each finding against the current code and only fix it if needed. In `@app/components/evaluations/EvaluationsTab.tsx` around lines 123-125: the current `useEffect` with dependency `[activeKey, fetchEvaluations]` only calls `fetchEvaluations` once per key. Change it to start a 10-second polling loop when `activeKey.key` exists: call `fetchEvaluations` immediately, then `setInterval(fetchEvaluations, 10_000)`, and clear the interval in the effect's cleanup (including when `activeKey` changes) to avoid duplicate timers.

app/(main)/evaluations/page.tsx (1)
76-97: ⚠️ Potential issue | 🟠 Major

Reset dataset state before refetching for a new API key.

This still leaves the previous key's datasets and selection visible until the new request resolves, and an older response can overwrite the current list entirely. Clear `storedDatasets`/`selectedDatasetId` on key changes and guard `loadStoredDatasets` against stale responses before calling `setStoredDatasets`.
Verify each finding against the current code and only fix it if needed. In `@app/(main)/evaluations/page.tsx` around lines 76-97: when the API key changes, clear the previous dataset UI state and prevent stale responses from overwriting it. In the `useEffect` that watches `activeKey`, call `setStoredDatasets([])` and `setSelectedDatasetId(undefined)` immediately before invoking `loadStoredDatasets`, and modify `loadStoredDatasets` to track request identity (e.g. a local request id or `AbortController`) tied to the current `activeKey` so that only the latest response calls `setStoredDatasets`; also ensure `loadStoredDatasets` checks that `activeKey` still matches before updating state and still toggles `setIsDatasetsLoading` correctly.
🧹 Nitpick comments (1)
app/(main)/evaluations/[id]/page.tsx (1)
169-171: Consider using `sanitizeCSVCell` consistently for all user-controlled content.

While `sanitizeCSVCell` is now imported and used for some fields, several user-controlled values still use inline escaping (LLM answers at line 170, and in `exportRowCSV` at lines 229-231). For consistency and to benefit from formula injection prevention on untrusted content, consider migrating these as well.

Example for line 170

```diff
- row.push(
-   `"${(group.llm_answers[i] || "").replace(/"/g, '""').replace(/\n/g, " ")}"`,
- );
+ row.push(sanitizeCSVCell(group.llm_answers[i] || "", true));
```
Verify each finding against the current code and only fix it if needed. In `@app/(main)/evaluations/[id]/page.tsx` around lines 169-171: replace inline CSV escaping of user-controlled strings with the shared `sanitizeCSVCell` utility. Where `group.llm_answers[i]` is pushed (currently using `.replace` chains to escape quotes/newlines) and in the `exportRowCSV` logic, call `sanitizeCSVCell(group.llm_answers[i] || "")` and use `sanitizeCSVCell` for the other user-controlled fields, passing the raw value so the helper handles quote/newline escaping and formula-injection protection consistently across the codebase.
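For context, a typical `sanitizeCSVCell` along the lines this comment assumes might look like the following. This is a sketch of the general technique only; the real helper in `app/lib/utils.ts` may differ in signature and behavior:

```typescript
// Quote-escape a CSV cell, optionally flattening newlines, and defuse
// spreadsheet formula injection by prefixing cells that start with
// =, +, - or @ with a single quote before wrapping in double quotes.
function sanitizeCSVCell(value: string, flattenNewlines = false): string {
  let cell = flattenNewlines ? value.replace(/\r?\n/g, " ") : value;
  if (/^[=+\-@]/.test(cell)) cell = `'${cell}`;
  return `"${cell.replace(/"/g, '""')}"`;
}
```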
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/(main)/evaluations/[id]/page.tsx`:
- Around line 384-387: Define a shared constant TERMINAL_JOB_STATUSES =
["completed","success","failed","error"] in app/lib/constants.ts, then replace
the ad-hoc progress checks (the isJobInProgress boolean expressions) in
page.tsx, EvalRunCard.tsx and the other pages with: const isJobInProgress =
!TERMINAL_JOB_STATUSES.includes(job.status.toLowerCase()); also update any other
places relying on getStatusColor() to use the same TERMINAL_JOB_STATUSES to keep
terminal-state logic consistent across getStatusColor, isJobInProgress, and
warning displays.
In `@app/(main)/evaluations/page.tsx`:
- Around line 41-42: The page currently reads activeKey from useAuth() and
renders the "API key required" empty state before the auth context hydrates;
update the component to also read isHydrated from useAuth() and short-circuit to
a loader/placeholder (or keep existing mounted/loading state) while isHydrated
is false so the empty-key UI only appears after hydration completes; apply this
change wherever activeKey is checked (e.g., the checks around lines referencing
activeKey at the top and the other branches you noted) so that checks use both
isHydrated and activeKey (render loader if !isHydrated, render missing-key UI
only if isHydrated && !activeKey).
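The shared terminal-status constant proposed in the first inline comment above could be sketched as follows; the status strings are the ones named in that comment, so adjust them to the API's actual status union if it differs:

```typescript
const TERMINAL_JOB_STATUSES = ["completed", "success", "failed", "error"] as const;

// A job is in progress unless its (case-insensitive) status is terminal.
function isJobInProgress(status: string): boolean {
  return !(TERMINAL_JOB_STATUSES as readonly string[]).includes(status.toLowerCase());
}
```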
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9b40d878-b450-4db7-b213-5466538d34f9
📒 Files selected for processing (10)
- app/(main)/configurations/prompt-editor/page.tsx
- app/(main)/evaluations/[id]/page.tsx
- app/(main)/evaluations/page.tsx
- app/components/ConfigDrawer.tsx
- app/components/evaluations/DatasetsTab.tsx
- app/components/evaluations/EvaluationsTab.tsx
- app/components/prompt-editor/ConfigEditorPane.tsx
- app/components/prompt-editor/CurrentConfigTab.tsx
- app/lib/constants.ts
- app/lib/utils.ts
✅ Files skipped from review due to trivial changes (1)
- app/components/prompt-editor/CurrentConfigTab.tsx
🚧 Files skipped from review as they are similar to previous changes (4)
- app/lib/constants.ts
- app/components/ConfigDrawer.tsx
- app/lib/utils.ts
- app/components/evaluations/DatasetsTab.tsx
Issue: #88
Summary:
- Added GPT-5 and GPT-4.1 to the model selection dropdown
- Omitted `temperature` and `max_num_results` from the API payload for GPT-5 models
- Centralized `MODEL_OPTIONS` into a single shared file (was duplicated in 3 files)

Summary by CodeRabbit
New Features
Bug Fixes
Improvements