
fix: guard against IndexError when LLM API returns empty choices list #1876

Open

qizwiz wants to merge 1 commit into microsoft:main from qizwiz:fix/llm-response-empty-choices-crash

Conversation

@qizwiz

@qizwiz qizwiz commented May 14, 2026

Problem

Three places in markitdown call `response.choices[0].message.content` immediately after `client.chat.completions.create(...)` without checking whether `choices` is non-empty:

  • `packages/markitdown/src/markitdown/converters/_image_converter.py:138`
  • `packages/markitdown/src/markitdown/converters/_llm_caption.py:50`
  • `packages/markitdown-ocr/src/markitdown_ocr/_ocr_service.py:102`

The OpenAI API (and OpenAI-compatible providers) can return an empty choices list in three documented scenarios:

  1. Content filtering — when the image triggers a policy violation, the API returns `finish_reason: "content_filter"` with an empty choices list
  2. Streaming edge cases — SSE stream closed before any choices are emitted
  3. OpenAI-compatible providers — local LLMs, proxies, and alternative providers may return non-standard response shapes

In all three cases, `choices[0]` raises `IndexError: list index out of range`. This crash is silent in development (dev images don't hit content filters) and surfaces in production on real user content.
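The failure mode can be reproduced without any API call by indexing an empty list the way the call sites do (the `SimpleNamespace` response shape here is an illustrative stand-in, not markitdown's actual client object):

```python
# Minimal reproduction of the crash path: a response whose choices list is
# empty, accessed the way the three call sites access it.
from types import SimpleNamespace

response = SimpleNamespace(choices=[])  # e.g. a content-filtered response

try:
    content = response.choices[0].message.content
except IndexError as exc:
    print(exc)  # list index out of range
```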

Formal verification

This was found via `pact` static analysis and formally verified with the Z3 SMT solver:

Bug model (SAT): `content_filtered=True → choices_len=0`, `access_index=0 → 0 ≥ 0` → IndexError
Fix model (UNSAT): with the `if not response.choices` guard, `access_attempted ∧ choices_len=0` is a contradiction — IndexError is unreachable on all trigger paths.
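The two models above can be sanity-checked with a stdlib-only sketch that brute-forces the same state space (this is an illustration of the SAT/UNSAT argument, not the actual `pact`/Z3 run):

```python
# Enumerate small states of the two models: is the out-of-bounds access
# (access_index >= choices_len) reachable with and without the guard?
from itertools import product

def out_of_bounds_reachable(guarded: bool) -> bool:
    """Return True if some state reaches choices[0] on an empty list."""
    for content_filtered, choices_len in product([False, True], range(3)):
        if content_filtered and choices_len != 0:
            continue  # content filtering forces an empty choices list
        if guarded and choices_len == 0:
            continue  # `if not response.choices` skips the access
        access_index = 0
        if access_index >= choices_len:  # the IndexError condition
            return True
    return False

print(out_of_bounds_reachable(guarded=False))  # bug model: True  (SAT)
print(out_of_bounds_reachable(guarded=True))   # fix model: False (UNSAT)
```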

Fix

```python
# _image_converter.py and _llm_caption.py
response = client.chat.completions.create(model=model, messages=messages)
if not response.choices:
    return None
return response.choices[0].message.content
```

```python
# _ocr_service.py (inline — consistent with existing `text or ""` guard below)
text = response.choices[0].message.content if response.choices else None
```

The `_ocr_service.py` path already has a bare `except Exception` that returns `OCRResult(text="")` on failure, so the `None` propagates safely through the existing `text.strip() if text else ""` guard on the next line.
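The guard's behavior can be exercised with a small test double (the `extract_content` helper and `SimpleNamespace` response shapes below are hypothetical, written only to mirror the patched code path):

```python
# Hypothetical unit test for the guard; response objects are stand-ins
# for the OpenAI client's response type.
from types import SimpleNamespace

def extract_content(response):
    # Same guard as in the patched converters.
    if not response.choices:
        return None
    return response.choices[0].message.content

empty = SimpleNamespace(choices=[])
ok = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="a caption"))]
)

assert extract_content(empty) is None        # previously raised IndexError
assert extract_content(ok) == "a caption"    # normal path unchanged
```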

Prior art

This exact crash pattern appears in multiple open issues across the LLM ecosystem: plastic-labs/honcho#676, aden-hive/hive#4767, TheR1D/shell_gpt#741, langchain-community#475.

@qizwiz
Author

qizwiz commented May 14, 2026

@microsoft-github-policy-service agree

The OpenAI API can return an empty choices list when:
- Content filtering blocks the image response
- A streaming edge case closes before choices are emitted
- An OpenAI-compatible provider returns a non-standard response shape

In all three cases, `response.choices[0]` raises IndexError. This is a
silent crash in production — content filters fire on real user images,
not on dev test images, so the bug is invisible in local testing.

Three affected paths:
- _image_converter.py: return None when no choices (caller handles None)
- _llm_caption.py: return None when no choices (caller handles None)
- _ocr_service.py: inline ternary, consistent with existing `text or ""`
  guard already on the next line

Formally verified: Z3 SMT solver proves IndexError is satisfiable under
content_filtered=True → choices_len=0 (SAT), and proves the guard makes
it UNSAT — no assignment produces IndexError after the check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
qizwiz force-pushed the fix/llm-response-empty-choices-crash branch from 5719e76 to 9f80bf3 on May 14, 2026 at 14:06
