feat: audio transcription API in LLM gateway #44017
base: master
Conversation
No issues found across 3 files
3 files reviewed, 2 comments
```python
"file": file_tuple,
# Use JSON to collect input/output tokens for billing
# Other formats are not supported yet ("text", "srt", "verbose_json", "vtt")
"response_format": "json",
```
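For context, `file_tuple` here is presumably the (filename, payload, content type) triple that multipart encoders accept, e.g. `requests`' `files=` argument or the OpenAI SDK's `file=` parameter. A minimal sketch (the helper name is hypothetical, not PR code):

```python
def build_file_tuple(filename: str, payload: bytes, content_type: str = "audio/mpeg"):
    """Assemble the (name, bytes, MIME type) triple used for multipart uploads.

    Hypothetical helper: the PR's actual construction of file_tuple may differ.
    """
    return (filename, payload, content_type)
```

The triple is then passed as the `file` entry of the multipart form alongside `response_format`.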
style: hardcoded response format to json limits flexibility. the openai transcription api supports multiple formats (text, srt, verbose_json, vtt) that users might need
```suggestion
"response_format": data.get("response_format", "json"),
```
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
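If caller-supplied formats are accepted, the gateway would still need to gate them, since the code comment notes that billing relies on token usage that only the JSON-style formats report. A hedged sketch of such a guard (function and set names are assumptions, not PR code):

```python
# Formats the gateway can bill today, per the PR's code comment.
# OpenAI's "verbose_json" also carries usage data, but the PR marks it
# unsupported for now, so it is excluded here.
BILLABLE_FORMATS = {"json"}

def resolve_response_format(data: dict) -> str:
    """Return the requested response_format, defaulting to and enforcing "json"."""
    requested = data.get("response_format", "json")
    if requested not in BILLABLE_FORMATS:
        raise ValueError(f"unsupported response_format: {requested!r}")
    return requested
```

This keeps the suggested `data.get("response_format", "json")` flexibility while preserving the billing constraint.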
Prompt To Fix With AI
This is a comment left during a code review.
Path: posthog/api/llm_gateway/http.py
Line: 410:410
How can I resolve this? If you propose a fix, please make it concise.

```python
class TranscriptionRequestSerializer(serializers.Serializer):
    model = serializers.ChoiceField(
        choices=["gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1"],
        default="gpt-4o-transcribe",
        help_text="Transcription model",
    )
    prompt = serializers.CharField(
        required=False,
        help_text="Optional text prompt to guide the style, vocabulary or continue a previous audio segment",
    )
    language = serializers.CharField(
        required=False,
        help_text="Language of the input audio in ISO-639-1 format. See https://github.com/openai/whisper#available-models-and-languages",
    )
    temperature = serializers.FloatField(
        required=False,
        min_value=0.0,
        max_value=1.0,
        help_text="Optional temperature between 0 and 1",
    )
```
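For readers without DRF at hand, the validation this serializer performs can be mirrored in plain Python. This is a behavioral sketch under that assumption, not the actual implementation:

```python
ALLOWED_MODELS = {"gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1"}

def validate_transcription_request(data: dict) -> dict:
    """Mirror the serializer's rules: model choices with a default,
    optional prompt/language, and temperature bounded to [0, 1]."""
    validated = {"model": data.get("model", "gpt-4o-transcribe")}
    if validated["model"] not in ALLOWED_MODELS:
        raise ValueError(f"invalid model: {validated['model']!r}")
    for optional in ("prompt", "language"):
        if optional in data:
            validated[optional] = data[optional]
    if "temperature" in data:
        temperature = float(data["temperature"])
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0 and 1")
        validated["temperature"] = temperature
    return validated
```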
style: missing file field validation in serializer. the audio_file validation happens in the view (line 392-397 in http.py), but the serializer should validate file uploads
```suggestion
class TranscriptionRequestSerializer(serializers.Serializer):
    file = serializers.FileField(
        required=True,
        help_text="Audio file to transcribe (flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm)",
    )
    model = serializers.ChoiceField(
        choices=["gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1"],
        default="gpt-4o-transcribe",
        help_text="Transcription model",
    )
    prompt = serializers.CharField(
        required=False,
        help_text="Optional text prompt to guide the style, vocabulary or continue a previous audio segment",
    )
    language = serializers.CharField(
        required=False,
        help_text="Language of the input audio in ISO-639-1 format. See https://github.com/openai/whisper#available-models-and-languages",
    )
    temperature = serializers.FloatField(
        required=False,
        min_value=0.0,
        max_value=1.0,
        help_text="Optional temperature between 0 and 1",
    )
```
does the MultiPartParser already handle file validation before the serializer, making this suggestion unnecessary?
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
This is a comment left during a code review.
Path: posthog/api/llm_gateway/serializers.py
Line: 167:186
How can I resolve this? If you propose a fix, please make it concise.
🦔 Preview instance
✅ Preview deployment ready
URL: https://do-ci-hobby-pr-44017.posthog.cc
Mode: 🔄 Preview (persistent)
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="posthog/api/llm_gateway/serializers.py">
<violation number="1" location="posthog/api/llm_gateway/serializers.py:168">
P2: Consider adding a file size validator to prevent large file uploads that could cause memory issues. OpenAI's transcription API limits files to 25MB, so enforcing this at the serializer level is recommended.</violation>
</file>
Reply to cubic to teach it or ask questions. Tag @cubic-dev-ai to re-run a review.
```python
class TranscriptionRequestSerializer(serializers.Serializer):
    file = serializers.FileField(
```
P2: Consider adding a file size validator to prevent large file uploads that could cause memory issues. OpenAI's transcription API limits files to 25MB, so enforcing this at the serializer level is recommended.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At posthog/api/llm_gateway/serializers.py, line 168:
<comment>Consider adding a file size validator to prevent large file uploads that could cause memory issues. OpenAI's transcription API limits files to 25MB, so enforcing this at the serializer level is recommended.</comment>
<file context>
@@ -165,6 +165,10 @@ class ErrorResponseSerializer(serializers.Serializer):
class TranscriptionRequestSerializer(serializers.Serializer):
+ file = serializers.FileField(
+ required=True,
+ help_text="Audio file to transcribe (flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm)",
</file context>
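A minimal sketch of the suggested size guard, written as a standalone function; in DRF it would presumably live in a `validate_file(self, value)` method checking `value.size`, and the 25 MB figure is OpenAI's documented limit:

```python
# OpenAI's documented 25 MB cap for transcription uploads
MAX_AUDIO_BYTES = 25 * 1024 * 1024

def check_audio_size(size_bytes: int) -> int:
    """Raise ValueError if the upload exceeds the transcription size limit."""
    if size_bytes > MAX_AUDIO_BYTES:
        raise ValueError(
            f"audio file is {size_bytes} bytes; the transcription API accepts at most {MAX_AUDIO_BYTES}"
        )
    return size_bytes
```

Failing fast here avoids buffering an upload the upstream API would reject anyway.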
Problem
To deploy the PostHog mobile app to TestFlight (PR: PostHog/Array#224), audio transcription needs to be moved from the client to the backend.
For context: example of the OpenAI Transcription API response:
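For the default `json` format, a representative response shape per the linked docs (illustrative only; exact fields, particularly the `usage` breakdown, vary by model):

```json
{
  "text": "Hello, this is a transcribed sentence.",
  "usage": {
    "type": "tokens",
    "input_tokens": 14,
    "output_tokens": 9,
    "total_tokens": 23
  }
}
```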
Docs: https://platform.openai.com/docs/guides/speech-to-text
Changes
- Added `/v1/audio/transcriptions` endpoint to LLM Gateway
- Billing via the `posthog` callback
- Supported models (`gpt-4o-transcribe`, `whisper-1`)

How did you test this code?
Manually tested via mobile app voice recording
Screen.Recording.2025-12-26.at.20.29.41.mov