From df6ab9be0ca04c84c47168785d62440c1998bf36 Mon Sep 17 00:00:00 2001
From: sjhddh <jhao.sun@gmail.com>
Date: Fri, 24 Apr 2026 10:31:30 +0200
Subject: [PATCH] docs: correct image converter description in README (EXIF +
 LLM, not OCR)

The supported-formats list claims "Images (EXIF metadata and OCR)",
but the built-in `ImageConverter` does not perform OCR. Per the
converter's own docstring, it extracts EXIF metadata (when exiftool
is available) and generates a description via a multimodal LLM when
an `llm_client` is configured. OCR is only available through the
separate Azure Document Intelligence converter (`[az-doc-intel]`
optional dependency), which is documented elsewhere in the README.

This mislabeling has caused recurring user confusion, visible in
issues #1601, #1344, #1170, and #255, where users expected OCR
to work out of the box on images and scanned PDFs.

This one-line change brings the README in line with the actual
behavior of `ImageConverter`.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 652afc057..8cb930891 100644
--- a/README.md
+++ b/README.md
@@ -21,7 +21,7 @@ MarkItDown currently supports the conversion from:
 - PowerPoint
 - Word
 - Excel
-- Images (EXIF metadata and OCR)
+- Images (EXIF metadata and LLM-based description)
 - Audio (EXIF metadata and speech transcription)
 - HTML
 - Text-based formats (CSV, JSON, XML)