This repository contains sample projects demonstrating how to work with Zoom's Realtime Media Streams (RTMS) in JavaScript and Python, as well as implementations built on the RTMS SDK.
Zoom Realtime Media Streams (RTMS) lets developers access real-time media data from Zoom meetings, including:
- Audio streams - Raw PCM audio (L16, 16kHz/24kHz)
- Video streams - H.264 encoded video
- Transcripts - Real-time speech-to-text
- Screen shares - JPEG/PNG/H.264 frames
- Chat messages - In-meeting chat
```javascript
import { RTMSManager } from './library/javascript/rtmsManager/RTMSManager.js';
import WebhookManager from './library/javascript/webhookManager/WebhookManager.js';
import express from 'express';

const app = express();

// 1. Configure RTMS
await RTMSManager.init({
  credentials: {
    meeting: {
      clientId: process.env.ZOOM_CLIENT_ID,
      clientSecret: process.env.ZOOM_CLIENT_SECRET,
      zoomSecretToken: process.env.ZOOM_SECRET_TOKEN,
    }
  },
  mediaParams: {
    audio: { codec: 'L16', sampleRate: 16000 },
    transcript: { language: 'en' }
  }
});

// 2. Setup webhook to receive Zoom events
const webhookManager = new WebhookManager({
  config: { webhookPath: '/', zoomSecretToken: process.env.ZOOM_SECRET_TOKEN },
  app
});
webhookManager.on('event', (event, payload) => RTMSManager.handleEvent(event, payload));
webhookManager.setup();

// 3. Handle real-time media
RTMSManager.on('audio', ({ buffer, userId, userName, timestamp }) => {
  console.log(`Audio from ${userName}: ${buffer.length} bytes`);
});
RTMSManager.on('transcript', ({ text, userName }) => {
  console.log(`${userName}: ${text}`);
});

// 4. Start
await RTMSManager.start();
app.listen(3000);
```

`zoom_apps/ai_industry_specific_notetaker_js/`
Build a meeting assistant that extracts entities, detects action items, classifies topics, and generates summaries in real time.
Zoom Meeting → RTMS Webhook → WebSocket → Transcript → NLP Pipeline → Frontend
```javascript
// transcriptHistory and summary are module-level state in the sample
RTMSManager.on('transcript', async ({ text, userName }) => {
  // Named Entity Recognition
  const entities = await detectEntities(text);

  // Action item detection (regex-based)
  const actions = detectActionItems(text);

  // Topic classification via LLM
  const topic = await classifyTopic(text);

  // Periodic summarization
  if (transcriptHistory.length % 5 === 0) {
    summary = await summarize(transcriptHistory.join(' '));
  }

  // Broadcast to connected frontends
  frontendWss.broadcast({ text, entities, actions, topic, summary, user: userName });
});
```

Features:
- Real-time entity extraction (people, organizations, dates)
- Action item detection ("we need to", "let's", "follow up"; see the sketch after this list)
- Topic classification (Finance, Legal, Tech, HR)
- Rolling meeting summaries
- WebSocket broadcast to frontend dashboard
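The pipeline above delegates to helpers defined inside the sample (`detectEntities`, `classifyTopic`, `summarize`, `detectActionItems`). As a rough illustration of the regex-based detector, a minimal sketch, not the sample's actual code:

```javascript
// Illustrative sketch of regex-based action item detection; the trigger
// phrases mirror the feature list above, but the sample's real helper may differ.
const ACTION_PATTERNS = [
  /\bwe need to\b[^.?!]*/i,
  /\blet'?s\b[^.?!]*/i,
  /\bfollow(?:ing)? up\b[^.?!]*/i,
];

function detectActionItems(text) {
  const items = [];
  for (const pattern of ACTION_PATTERNS) {
    const match = text.match(pattern);
    if (match) items.push(match[0].trim()); // Keep the matched clause as the action item
  }
  return items;
}
```

The other helpers would similarly wrap whatever NER and LLM services the sample is configured with.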
| Sample | Description |
|---|---|
| ai_industry_specific_notetaker_js | NLP pipeline: NER, action items, topics, summaries |
| ai_transcript_analysis_js | Real-time transcript analysis |
| ai_rag_customer_support_js | Customer service AI with RAG |
| ai_chat_with_audio_playback_js | LLM chatbot with neural audio playback |
| ai_dnd_game_js | D&D game powered by transcripts |
| Sample | Description |
|---|---|
| send_audio_to_deepgram_transcribe_service_js | Deepgram real-time transcription |
| send_audio_to_assemblyai_transcribe_service_js | AssemblyAI transcription |
| send_audio_to_aws_transcribe_service_js | AWS Transcribe integration |
| send_audio_to_azure_speech_to_text_service_js | Azure Speech-to-Text |
| Sample | Description |
|---|---|
| save_audio_and_video_to_aws_s3_storage_js | Save recordings to AWS S3 |
| save_audio_and_video_to_azure_blob_storage_js | Save recordings to Azure Blob |
| save_audio_and_video_to_local_storage_js | Save recordings locally |
| Sample | Strategy | Description |
|---|---|---|
| stream_to_aws_ivs_gap_filler_js | Gap Filler | Stream to AWS IVS with mute detection |
| stream_to_aws_ivs_jitter_buffer_js | Jitter Buffer | Stream to AWS IVS with packet reordering |
| stream_audio_and_video_to_youtube_greedy_gap_filler_js | Greedy Gap Filler | Stream to YouTube Live |
| stream_audio_and_video_to_custom_frontend_passthru_js | Passthru | Stream to custom HLS frontend |
```
.
├── audio/                    # Audio processing & transcription samples
│   ├── send_audio_to_assemblyai_transcribe_service_js/
│   ├── send_audio_to_assemblyai_transcribe_service_sdk/
│   ├── send_audio_to_aws_transcribe_service_js/
│   ├── send_audio_to_aws_transcribe_service_sdk/
│   ├── send_audio_to_azure_speech_to_text_service_js/
│   ├── send_audio_to_azure_speech_to_text_service_sdk/
│   ├── send_audio_to_deepgram_transcribe_service_js/
│   └── send_audio_to_deepgram_transcribe_service_sdk/
├── boilerplate/              # Starter templates for various languages
│   ├── working_cplusplus_wss/
│   ├── working_dotnetcore/
│   ├── working_go/
│   ├── working_js/
│   ├── working_python/
│   ├── working_python_wss/
│   └── working_sdk/
├── library/                  # Shared JavaScript library (RTMSManager)
│   └── javascript/
│       ├── rtmsManager/      # Core RTMS connection management
│       ├── webhookManager/   # Zoom webhook handling
│       ├── webSocketManager/ # Zoom WebSocket event handling
│       └── commonHelpers/    # Audio/video processing utilities
├── rtms_api/                 # Manual RTMS start/stop control
│   ├── manual_start_stop_using_js/
│   └── manual_start_stop_using_python/
├── rtms_mcp_client/          # Model Context Protocol integration
│   └── zoom-rtms-mcp-client/
├── screen_share/             # Screen share capture samples
│   ├── save_screen_share_js/
│   └── save_screen_share_pdf_js/
├── storage/                  # Recording & cloud storage samples
│   ├── save_audio_and_video_to_aws_s3_storage_js/
│   ├── save_audio_and_video_to_aws_s3_storage_sdk/
│   ├── save_audio_and_video_to_azure_blob_storage_js/
│   ├── save_audio_and_video_to_azure_blob_storage_sdk/
│   ├── save_audio_and_video_to_local_storage_js/
│   └── save_audio_and_video_to_local_storage_sdk/
├── streaming/                # Live streaming samples
│   ├── stream_audio_and_video_to_custom_frontend_passthru_js/
│   ├── stream_audio_and_video_to_youtube_greedy_gap_filler_js/
│   ├── stream_to_aws_ivs_gap_filler_js/
│   ├── stream_to_aws_ivs_jitter_buffer_js/
│   └── stream_to_aws_kinesis_passthru_js/
├── transcript/               # Transcript processing samples
│   ├── save_transcript_js/
│   ├── save_transcript_sdk/
│   ├── send_transcript_to_claude_js/
│   ├── send_transcript_to_openai_js/
│   └── send_transcript_to_openrouter_js/
├── video/                    # Video analysis samples
│   ├── detect_emotion_using_amazon_rekognition_js/
│   └── detect_object_using_tensorflow_js/
└── zoom_apps/                # Complete Zoom App examples
    ├── ai_chat_with_audio_playback_js/
    ├── ai_dnd_game_js/
    ├── ai_industry_specific_notetaker_js/
    ├── ai_rag_customer_support_js/
    ├── ai_transcript_analysis_js/
    ├── prompt_for_user_consent_js/
    └── start_stop_rtms_control_js/
```
```
         ┌─────────────────────────────────────────────────┐
         │           Load Balancer (nginx/ALB)             │
         └─────────────────────────────────────────────────┘
                                  │
         ┌────────────────────────┼────────────────────────┐
         ▼                        ▼                        ▼
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  RTMS Worker 1  │      │  RTMS Worker 2  │      │  RTMS Worker N  │
│  (RTMSManager)  │      │  (RTMSManager)  │      │  (RTMSManager)  │
└────────┬────────┘      └────────┬────────┘      └────────┬────────┘
         │                        │                        │
         └────────────────────────┼────────────────────────┘
                                  ▼
                 ┌─────────────────────────────────┐
                 │    Message Queue (Redis/SQS)    │
                 │   - Meeting assignments         │
                 │   - Transcription jobs          │
                 │   - Processing results          │
                 └─────────────────────────────────┘
                                  │
         ┌────────────────────────┼────────────────────────┐
         ▼                        ▼                        ▼
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  Transcription  │      │  Transcription  │      │  Transcription  │
│  Service Pool   │      │  Service Pool   │      │  Service Pool   │
│   (Deepgram)    │      │  (AssemblyAI)   │      │ (AWS/Fallback)  │
└─────────────────┘      └─────────────────┘      └─────────────────┘
```
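The queue tier can be as simple as a Redis list. Below is a minimal sketch of the meeting-assignment flow, assuming the `ioredis` client; the queue name and payload shape are illustrative:

```javascript
// Hypothetical sketch: distributing meeting assignments via a Redis list.
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);
const QUEUE = 'rtms:meeting-assignments'; // Illustrative queue name

// Producer: the webhook receiver enqueues the meeting for any available worker.
export async function enqueueMeeting(payload) {
  await redis.lpush(QUEUE, JSON.stringify({
    meetingUuid: payload.meeting_uuid,
    streamId: payload.rtms_stream_id,
    serverUrls: payload.server_urls,
  }));
}

// Consumer: each RTMS worker blocks until an assignment arrives.
export async function consumeMeetings(handleAssignment) {
  for (;;) {
    const [, raw] = await redis.brpop(QUEUE, 0); // Blocks until an item is available
    await handleAssignment(JSON.parse(raw));
  }
}
```

`BRPOP` blocks each worker until an assignment arrives, which distributes meetings across workers without a dedicated scheduler.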
```javascript
// Cascading fallback pattern for transcription services.
// deepgramClient, assemblyClient, and awsTranscribeClient are pre-configured vendor clients.
const transcriptionProviders = [
  { name: 'deepgram', client: deepgramClient, priority: 1, rateLimit: 100 },
  { name: 'assemblyai', client: assemblyClient, priority: 2, rateLimit: 50 },
  { name: 'aws', client: awsTranscribeClient, priority: 3, rateLimit: 200 },
];

class TranscriptionManager {
  constructor(providers) {
    this.providers = providers.sort((a, b) => a.priority - b.priority);
    this.circuitBreakers = new Map();
    // Initialize circuit breakers for each provider
    for (const provider of providers) {
      this.circuitBreakers.set(provider.name, new CircuitBreaker({
        failureThreshold: 5,
        resetTimeout: 30000,
      }));
    }
  }

  async transcribe(audioBuffer, meetingId) {
    for (const provider of this.providers) {
      const breaker = this.circuitBreakers.get(provider.name);
      if (breaker.isOpen()) {
        console.log(`[${provider.name}] Circuit open, skipping`);
        continue;
      }
      try {
        const result = await breaker.call(() =>
          provider.client.transcribe(audioBuffer)
        );
        return { provider: provider.name, transcript: result };
      } catch (error) {
        console.error(`[${provider.name}] Failed: ${error.message}`);
        // Continue to next provider
      }
    }
    throw new Error('All transcription providers failed');
  }
}

// Usage with RTMSManager (deadLetterQueue is an assumed retry queue)
const transcriptionManager = new TranscriptionManager(transcriptionProviders);
RTMSManager.on('audio', async ({ buffer, meetingId }) => {
  try {
    const result = await transcriptionManager.transcribe(buffer, meetingId);
    console.log(`Transcribed via ${result.provider}: ${result.transcript}`);
  } catch (error) {
    // All providers failed - queue for retry or alert
    await deadLetterQueue.push({ buffer, meetingId, error: error.message });
  }
});
```

The `CircuitBreaker` used above is a small three-state machine:

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetTimeout = 30000 }) {
    this.failureThreshold = failureThreshold;
    this.resetTimeout = resetTimeout;
    this.failures = 0;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.lastFailureTime = null;
  }

  isOpen() {
    if (this.state === 'OPEN') {
      // Check if we should try again
      if (Date.now() - this.lastFailureTime >= this.resetTimeout) {
        this.state = 'HALF_OPEN';
        return false;
      }
      return true;
    }
    return false;
  }

  async call(fn) {
    if (this.isOpen()) {
      throw new Error('Circuit breaker is open');
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failures++;
    this.lastFailureTime = Date.now();
    if (this.failures >= this.failureThreshold) {
      this.state = 'OPEN';
    }
  }
}
```

To cap load per worker, a meeting pool manager admits meetings up to a concurrency limit and queues the rest:

```javascript
class MeetingPoolManager {
  constructor({ maxConcurrentMeetings = 100, queueTimeout = 30000 }) {
    this.maxConcurrent = maxConcurrentMeetings;
    this.queueTimeout = queueTimeout; // Timeout for queued meetings (ms), used by acquireSlot
    this.activeMeetings = new Map();
    this.waitingQueue = [];
  }

  async acquireSlot(meetingId) {
    if (this.activeMeetings.size < this.maxConcurrent) {
      this.activeMeetings.set(meetingId, { startTime: Date.now() });
      return true;
    }
    // Queue the meeting
    return new Promise((resolve, reject) => {
      const timeout = setTimeout(() => {
        this.waitingQueue = this.waitingQueue.filter(w => w.meetingId !== meetingId);
        reject(new Error(`Meeting ${meetingId} queue timeout`));
      }, this.queueTimeout);
      this.waitingQueue.push({ meetingId, resolve, reject, timeout });
    });
  }

  releaseSlot(meetingId) {
    this.activeMeetings.delete(meetingId);
    // Process waiting queue
    if (this.waitingQueue.length > 0) {
      const next = this.waitingQueue.shift();
      clearTimeout(next.timeout);
      this.activeMeetings.set(next.meetingId, { startTime: Date.now() });
      next.resolve(true);
    }
  }

  getStats() {
    return {
      active: this.activeMeetings.size,
      queued: this.waitingQueue.length,
      maxConcurrent: this.maxConcurrent,
    };
  }
}

// Integration with RTMSManager
const meetingPool = new MeetingPoolManager({ maxConcurrentMeetings: 100 });

RTMSManager.on('meeting.rtms_started', async (payload) => {
  try {
    await meetingPool.acquireSlot(payload.meeting_uuid);
    console.log(`Meeting ${payload.meeting_uuid} started. Pool: ${JSON.stringify(meetingPool.getStats())}`);
  } catch (error) {
    console.error(`Meeting ${payload.meeting_uuid} rejected: ${error.message}`);
    // Optionally notify or handle overflow
  }
});

RTMSManager.on('meeting.rtms_stopped', (payload) => {
  meetingPool.releaseSlot(payload.meeting_uuid);
});
```

Classifying errors lets each failure mode get an appropriate response:

```javascript
// Error classification for appropriate handling
class RTMSErrorHandler {
  static classify(error) {
    const errorPatterns = {
      RETRYABLE: [
        /ECONNRESET/,
        /ETIMEDOUT/,
        /socket hang up/,
        /503/,
        /429/, // Rate limited
      ],
      FATAL: [
        /401/, // Auth failed
        /403/, // Forbidden
        /Invalid signature/,
      ],
      RECOVERABLE: [
        /ENOTFOUND/,
        /WebSocket closed/,
      ],
    };
    for (const [type, patterns] of Object.entries(errorPatterns)) {
      if (patterns.some(p => p.test(error.message))) {
        return type;
      }
    }
    return 'UNKNOWN';
  }

  static async handle(error, context) {
    const type = this.classify(error);
    switch (type) {
      case 'RETRYABLE':
        // Exponential backoff retry
        await this.retryWithBackoff(context.retry, context.maxRetries || 3);
        break;
      case 'FATAL':
        // Log, alert, don't retry (`alerting` is an assumed pre-configured alerting client)
        console.error(`Fatal error for meeting ${context.meetingId}: ${error.message}`);
        await alerting.critical('RTMS Fatal Error', { error, context });
        break;
      case 'RECOVERABLE':
        // Attempt reconnection
        console.warn(`Recoverable error, reconnecting: ${error.message}`);
        await RTMSManager.reconnect(context.meetingId);
        break;
      default:
        console.error(`Unknown error: ${error.message}`);
        await alerting.warning('RTMS Unknown Error', { error, context });
    }
  }

  static async retryWithBackoff(fn, maxRetries, baseDelay = 1000) {
    for (let i = 0; i < maxRetries; i++) {
      try {
        return await fn();
      } catch (error) {
        if (i === maxRetries - 1) throw error;
        const delay = baseDelay * Math.pow(2, i) + Math.random() * 1000; // Exponential backoff with jitter
        await new Promise(r => setTimeout(r, delay));
      }
    }
  }
}
```

Health and metrics endpoints let load balancers and monitors observe each worker:

```javascript
// Health check endpoint for load balancers
app.get('/health', (req, res) => {
  const health = {
    status: 'ok',
    timestamp: new Date().toISOString(),
    uptime: process.uptime(),
    meetings: {
      active: RTMSManager.getActiveStreams().length,
      poolStats: meetingPool.getStats(),
    },
    memory: process.memoryUsage(),
    transcription: {
      circuitBreakers: Object.fromEntries(
        transcriptionManager.providers.map(p => [
          p.name,
          transcriptionManager.circuitBreakers.get(p.name).state
        ])
      ),
    },
  };
  const isHealthy = health.meetings.active < meetingPool.maxConcurrent * 0.9;
  res.status(isHealthy ? 200 : 503).json(health);
});

// Prometheus-style metrics endpoint
app.get('/metrics', (req, res) => {
  const metrics = [
    `rtms_active_meetings ${RTMSManager.getActiveStreams().length}`,
    `rtms_queued_meetings ${meetingPool.getStats().queued}`,
    `rtms_memory_heap_used ${process.memoryUsage().heapUsed}`,
    `rtms_uptime_seconds ${process.uptime()}`,
  ];
  res.set('Content-Type', 'text/plain');
  res.send(metrics.join('\n'));
});
```

On shutdown signals, drain active meetings before exiting:

```javascript
async function gracefulShutdown(signal) {
  console.log(`Received ${signal}. Starting graceful shutdown...`);

  // 1. Stop accepting new meetings (`server` is the http.Server returned by app.listen)
  server.close();

  // 2. Wait for active meetings to complete (with timeout)
  const shutdownTimeout = 30000;
  const activeStreams = RTMSManager.getActiveStreams();
  if (activeStreams.length > 0) {
    console.log(`Waiting for ${activeStreams.length} active meetings...`);
    await Promise.race([
      Promise.all(activeStreams.map(s => RTMSManager.stopStream(s.streamId))),
      new Promise(r => setTimeout(r, shutdownTimeout)),
    ]);
  }

  // 3. Cleanup resources
  await RTMSManager.stop();
  console.log('Graceful shutdown complete');
  process.exit(0);
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));
```

The end-to-end connection flow for a single meeting:

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Zoom Meeting   │────▶│  Webhook Event   │────▶│   Your Server   │
│                 │     │  meeting.rtms_   │     │                 │
│                 │     │  started         │     │                 │
└─────────────────┘     └──────────────────┘     └────────┬────────┘
                                                          │
                        ┌──────────────────┐              │
                        │  Signaling WSS   │◀─────────────┘
                        │   (Handshake)    │
                        └────────┬─────────┘
                                 │
                        ┌────────▼─────────┐
                        │    Media WSS     │
                        │  (Audio/Video/   │
                        │   Transcript)    │
                        └────────┬─────────┘
                                 │
                        ┌────────▼─────────┐
                        │ Your Processing  │
                        │  (NLP, Storage,  │
                        │   Streaming)     │
                        └──────────────────┘
```
The RTMSManager library handles connection management, reconnection, and event routing automatically:
```javascript
import { RTMSManager } from './library/javascript/rtmsManager/RTMSManager.js';

await RTMSManager.init(config);
RTMSManager.on('audio', handleAudio);
RTMSManager.on('video', handleVideo);
RTMSManager.on('transcript', handleTranscript);
await RTMSManager.start();
```

The RTMS SDK provides a simplified interface with built-in error handling (a usage sketch follows this list):
- Automatic connection management
- Built-in reconnection logic
- Cross-platform compatibility
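A minimal sketch modeled on the `@zoom/rtms` package's published quick start; treat the exact callback names as indicative, since they can vary between SDK releases:

```javascript
import rtms from '@zoom/rtms';

// Let the SDK parse incoming webhooks, then join the stream for each meeting.
rtms.onWebhookEvent(({ event, payload }) => {
  if (event !== 'meeting.rtms_started') return;

  const client = new rtms.Client();
  client.onAudioData((data, timestamp, metadata) => {
    console.log(`Audio: ${data.length} bytes from ${metadata.userName}`);
  });
  client.join(payload); // Handshake and reconnection handled internally
});
```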
For maximum control, implement the WebSocket connections directly (a handshake sketch follows this list):
- Manual handshake and authentication
- Custom reconnection strategies
- Direct binary data processing
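A minimal sketch of the signaling handshake under stated assumptions: the message fields and the HMAC-SHA256 signature over `clientId,meetingUuid,streamId` follow the pattern used in Zoom's RTMS samples, and the message-type enum value should be verified against the official schema:

```javascript
import crypto from 'crypto';
import WebSocket from 'ws';

// Signature pattern used by Zoom's RTMS samples: HMAC-SHA256 of
// "clientId,meetingUuid,streamId" keyed with the client secret (hex digest).
function buildSignature(meetingUuid, streamId) {
  return crypto
    .createHmac('sha256', process.env.ZOOM_CLIENT_SECRET)
    .update(`${process.env.ZOOM_CLIENT_ID},${meetingUuid},${streamId}`)
    .digest('hex');
}

// Connect to a signaling URL taken from the meeting.rtms_started webhook payload.
function connectSignaling({ serverUrl, meetingUuid, streamId }) {
  const ws = new WebSocket(serverUrl);
  ws.on('open', () => {
    ws.send(JSON.stringify({
      msg_type: 1, // SIGNALING_HAND_SHAKE_REQ; verify the enum value against the docs
      protocol_version: 1,
      meeting_uuid: meetingUuid,
      rtms_stream_id: streamId,
      signature: buildSignature(meetingUuid, streamId),
    }));
  });
  ws.on('message', (raw) => {
    // A successful handshake response carries the media server URL(s);
    // connect there next and repeat the handshake for the media channel.
    console.log('Signaling message:', JSON.parse(raw));
  });
  return ws;
}
```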
1. Sign in: Go to https://marketplace.zoom.us/ with your RTMS-enabled account
2. Create App: Develop → Build App → General App → User-Managed
3. Configure Event Subscriptions:
   - Features → Access → Enable Event Subscription
   - Add Events → Search "rtms" → Select RTMS endpoints
4. Configure Scopes:
   - Scopes → Add Scopes → Search "rtms"
   - Add scopes for both "Meetings" and "Rtms"
5. Get Credentials (the Secret Token validates your webhook endpoint, as shown below):
   - Client ID
   - Client Secret
   - Webhook verification token (Secret Token)
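WebhookManager handles endpoint validation automatically, but for a hand-rolled endpoint, Zoom's `endpoint.url_validation` challenge is answered by hashing the `plainToken` with the Secret Token:

```javascript
import crypto from 'crypto';
import express from 'express';

const app = express();
app.use(express.json());

// Zoom verifies the endpoint by sending an endpoint.url_validation event;
// the response returns plainToken plus its HMAC-SHA256 hash keyed by the Secret Token.
app.post('/', (req, res) => {
  if (req.body.event === 'endpoint.url_validation') {
    const { plainToken } = req.body.payload;
    const encryptedToken = crypto
      .createHmac('sha256', process.env.ZOOM_SECRET_TOKEN)
      .update(plainToken)
      .digest('hex');
    return res.json({ plainToken, encryptedToken });
  }
  // ...route meeting.rtms_started / meeting.rtms_stopped to your handler...
  res.sendStatus(200);
});
```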
| Parameter | Options |
|---|---|
| Sample Rate | 8kHz, 16kHz, 24kHz, 32kHz, 48kHz |
| Codec | L16 (PCM), OPUS |
| Channels | Mono, Stereo |
| Data Option | Mixed stream, Individual streams |
| Parameter | Options |
|---|---|
| Codec | H.264, VP8 |
| Resolution | SD (640x360), HD (1280x720), FHD (1920x1080) |
| FPS | 1-30 |
| Data Option | Single active speaker, All participants |
| Parameter | Options |
|---|---|
| Language | English, Spanish, French, German, etc. |
| Content Type | Text |
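For reference, a hypothetical `RTMSManager.init` call combining options from the tables above. The audio and transcript keys mirror the quick start; the video keys are illustrative assumptions, not confirmed parameter names:

```javascript
await RTMSManager.init({
  credentials: { /* ... */ },
  mediaParams: {
    audio: { codec: 'L16', sampleRate: 16000, channels: 1 }, // mono PCM
    video: { codec: 'H264', resolution: 'HD', fps: 15 },     // hypothetical keys; HD = 1280x720
    transcript: { language: 'en' },
  },
});
```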
- Verify ngrok/tunnel is running and accessible
- Check Zoom OAuth credentials in `.env`
- Ensure webhook URL is correctly configured in Zoom Marketplace
- Verify RTMS is enabled for your app (Zoom web settings)
- Check that your app has the correct RTMS scopes
- Ensure you're handling the `meeting.rtms_started` webhook event
- RTMS audio: L16 PCM at 16kHz/24kHz, mono
- FFmpeg params: `-f s16le -ar 16000 -ac 1` (see the example after this list)
- Ensure FFmpeg is installed and in PATH
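For example, raw captured audio can be wrapped into a playable WAV with FFmpeg (file names are illustrative):

```bash
# L16 PCM, 16 kHz, mono -> WAV
ffmpeg -f s16le -ar 16000 -ac 1 -i meeting_audio.raw meeting_audio.wav
```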
- SDK install (`npm install github:zoom/rtms`): ensure you have the correct token for fetching prebuilt binaries.
MIT License - Copyright (c) 2025 Zoom Video Communications, Inc.
See LICENSE for full text.