feat: add diagnostic logging, deadlock prevention, and stdout buffering for concurrent reads by devin-ai-integration[bot] · Pull Request #925 · airbytehq/airbyte-python-cdk

devin-ai-integration · 2026-02-27T09:36:59Z

Summary

Adds diagnostic logging, deadlock prevention, and a stdout writer thread to the CDK's concurrent read pipeline. This PR addresses a confirmed deadlock in Airbyte Cloud where syncs hang silently for 24+ hours when stdout pipe backpressure stalls the main thread, causing all worker threads to block on queue.put() against the bounded shared queue (maxsize=10,000).

Changes

Stdout writer thread (entrypoint.py): Replaces the synchronous print() loop in launch() with a dedicated background writer thread. The main thread puts serialised messages into an in-memory queue (maxsize=50,000) and the writer thread performs the blocking print() calls. This decouples the generator from OS-level stdout pipe backpressure so the CDK's internal record queue keeps draining even when the platform is slow to read.
Watchdog thread (concurrent_source.py): A daemon thread monitors main-thread progress via a shared monotonic timestamp. If no queue item is consumed for 10 minutes, it calls os._exit(1) — a raw syscall that terminates the process regardless of I/O state. This is a last-resort safety net for cases where both stdout and stderr are blocked.
Concurrency configuration logging (concurrent_declarative_source.py): Logs the resolved concurrency level, partition count, and source (manifest vs default) at startup via stderr.
Queue heartbeat logging (concurrent_source.py): Replaces the indefinite queue.get() with a 60-second timeout. Logs a heartbeat when no items arrive (idle) or periodically while processing (active), including queue size and thread status.
Partition reader diagnostic logging (partition_reader.py): Logs STARTED / PROGRESS (every 30s) / COMPLETED / FAILED for each partition read with record counts, timing, and slice info.
Partition enqueuer diagnostic logging (partition_enqueuer.py): Logs STARTED / enqueuing / COMPLETED / FAILED for partition generation with counts and timing.
Deadlock prevention (partition_reader.py, partition_enqueuer.py): Adds a 5-minute timeout (_QUEUE_PUT_TIMEOUT = 300s) to all queue.put() calls, including exception handlers. If a worker thread is blocked for 5 minutes trying to put an item on the full queue, it raises a RuntimeError with a clear deadlock message instead of hanging indefinitely.

Deadlock root cause (context)

In the Google Ads connector on Airbyte Cloud, multiple large responses (7–17 MB) complete simultaneously. Each partition reader thread calls queue.put(record) for thousands of records. The bounded queue fills to 10,000, and the main thread gets stuck at yield because print() blocks on the full stdout pipe (64KB buffer). With the main thread unable to drain the queue, all 8 worker threads also block — a complete deadlock with no logs for 24 hours.

Updates since last revision

Added stdout writer thread in launch() to decouple the main thread from stdout pipe backpressure. The main thread now puts messages into a 50,000-item buffer; a background thread handles the blocking I/O.
Added watchdog thread in ConcurrentSource.read() to force-terminate the process after 10 minutes of no main-thread progress (safety net for stdout/stderr both blocked).
Fixed exception handler deadlock risk: Exception handlers in PartitionReader and PartitionEnqueuer now use _put_with_timeout() instead of raw queue.put(), preventing deadlock when the queue is full.
Removed set_expected_logs from test scenarios to fix test ordering issues caused by the new diagnostic logging.

Review & Testing Checklist for Human

Stdout writer thread error propagation: If the writer thread crashes mid-stream (e.g., print() raises an exception), the main thread continues putting messages into the buffer until it fills (50,000 items), then blocks on buffer.put() indefinitely. The watchdog would eventually catch this, but it's not clean. Consider whether the writer should signal the main thread on error.
Writer thread join timeout: writer_thread.join(timeout=300) means if stdout is truly blocked, the join times out after 5 minutes and the main thread exits normally without re-raising writer_error[0] (which may not be populated yet). Verify this is acceptable behavior.
Buffer memory usage: The stdout buffer holds up to 50,000 serialised messages. For connectors with large records (e.g., Google Ads with 1KB/record), this could be ~50MB. Verify this is acceptable.
Watchdog os._exit(1) behavior: This is a hard kill that bypasses all cleanup (no finally blocks, no flush). Verify this is acceptable for Cloud deployments where the platform handles retries.
Test plan: Deploy a CDK prerelease with these changes to the Google Ads connector in the Airbyte Cloud sandbox workspace. Run a sync on the Kortx account (which previously deadlocked for 24+ hours). Verify the sync completes successfully without hanging.

Notes

Requested by Anatolii Yatsuk (gl_anatolii.yatsuk) to debug Google Ads connector deadlocks in Airbyte Cloud
Devin session

Co-Authored-By: unknown <>

devin-ai-integration · 2026-02-27T09:37:03Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

github-actions · 2026-02-27T09:37:14Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1772184954-log-concurrency-level#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1772184954-log-concurrency-level

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

/autofix - Fixes most formatting and linting issues
/poetry-lock - Updates poetry.lock file
/test - Runs connector tests with the updated CDK
/prerelease - Triggers a prerelease publish with default arguments
/poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
/poe <command> - Runs any poe command in the CDK environment

📚 Show Repo Guidance

Helpful Resources

CDK API Reference

📝 Edit this welcome message.

github-actions · 2026-02-27T09:41:17Z

PyTest Results (Fast)

3 918 tests +49 3 906 ✅ +49 6m 50s ⏱️ -1s
1 suites ± 0 12 💤 ± 0
1 files ± 0 0 ❌ ± 0

Results for commit e4ce817. ± Comparison against base commit 7f41401.

♻️ This comment has been updated with latest results.

github-actions · 2026-02-27T09:49:26Z

PyTest Results (Full)

3 921 tests 3 909 ✅ 11m 14s ⏱️
1 suites 12 💤
1 files 0 ❌

Results for commit e4ce817.

♻️ This comment has been updated with latest results.

Anatolii Yatsuk (tolik0) · 2026-02-27T14:07:21Z

/prerelease

Prerelease Job Info

This job triggers the publish workflow with default arguments to create a prerelease.

Prerelease job started... Check job output.

✅ Prerelease workflow triggered successfully.

View the publish workflow run: https://github.com/airbytehq/airbyte-python-cdk/actions/runs/22489445491

…values Co-Authored-By: unknown <>

Co-Authored-By: unknown <>

Anatolii Yatsuk (tolik0) · 2026-02-27T16:32:04Z

/prerelease

Prerelease Job Info

This job triggers the publish workflow with default arguments to create a prerelease.

Prerelease job started... Check job output.

✅ Prerelease workflow triggered successfully.

View the publish workflow run: https://github.com/airbytehq/airbyte-python-cdk/actions/runs/22494697606

- Add heartbeat logging to _consume_from_queue (logs every 60s when no items received) - Add STARTED/PROGRESS/COMPLETED/FAILED logging to PartitionReader.process_partition - Add STARTED/COMPLETED/FAILED logging to PartitionEnqueuer.generate_partitions - Use queue.get(timeout=60) instead of blocking queue.get() to enable heartbeat Co-Authored-By: unknown <>

…sageSerializer mock Co-Authored-By: unknown <>

…rence Co-Authored-By: unknown <>

Anatolii Yatsuk (tolik0) · 2026-03-05T15:37:10Z

/prerelease

Prerelease Job Info

This job triggers the publish workflow with default arguments to create a prerelease.

Prerelease job started... Check job output.

✅ Prerelease workflow triggered successfully.

View the publish workflow run: https://github.com/airbytehq/airbyte-python-cdk/actions/runs/22725428290

When all worker threads block on queue.put() and the main thread cannot drain the bounded queue (maxsize=10,000), the sync hangs silently for hours. This adds a 5-minute timeout to queue.put() calls in both PartitionReader and PartitionEnqueuer. If the timeout fires, a RuntimeError is raised with a clear deadlock message instead of hanging indefinitely. Co-Authored-By: unknown <>

Anatolii Yatsuk (tolik0) · 2026-03-10T15:44:41Z

/prerelease

Prerelease Job Info

This job triggers the publish workflow with default arguments to create a prerelease.

Prerelease job started... Check job output.

✅ Prerelease workflow triggered successfully.

View the publish workflow run: https://github.com/airbytehq/airbyte-python-cdk/actions/runs/22911023302

Co-Authored-By: unknown <>

… logging Co-Authored-By: unknown <>

…nistic ordering) Co-Authored-By: unknown <>

When the platform stops reading from the source container's stdout/stderr pipes (e.g. destination backpressure), all threads block on I/O and no in-process timeout can fire. The watchdog monitors main-thread progress and calls os._exit(1) after 10 minutes of no activity. Also fixes exception handlers in PartitionReader and PartitionEnqueuer that used blocking queue.put() without timeout, which would deadlock when the queue is full. Co-Authored-By: unknown <>

…hput syncs Co-Authored-By: unknown <>

Anatolii Yatsuk (tolik0) · 2026-03-11T15:48:27Z

/prerelease

Prerelease Job Info

This job triggers the publish workflow with default arguments to create a prerelease.

Prerelease job started... Check job output.

✅ Prerelease workflow triggered successfully.

View the publish workflow run: https://github.com/airbytehq/airbyte-python-cdk/actions/runs/22961350734

PrintBuffer.flush() writes directly to sys.__stdout__ while holding an RLock. When the platform pauses reading stdout, flush() blocks on the pipe and the lock is never released. Every other thread that calls logger.info() then blocks waiting for the same RLock, causing a process-wide deadlock. Removing the 'with PRINT_BUFFER:' wrapper means print() and logging go directly to stdout/stderr without the shared lock, allowing the stdout writer thread to be the only bottleneck on the pipe. Co-Authored-By: unknown <>

Anatolii Yatsuk (tolik0) · 2026-03-11T19:11:25Z

/prerelease

Prerelease Job Info

This job triggers the publish workflow with default arguments to create a prerelease.

Prerelease job started... Check job output.

✅ Prerelease workflow triggered successfully.

View the publish workflow run: https://github.com/airbytehq/airbyte-python-cdk/actions/runs/22969963641

The logging handler's stream is PrintBuffer, which writes to sys.__stdout__ while holding an RLock. When the platform pauses reading from the stdout pipe, PrintBuffer.flush() blocks on the pipe while holding the lock. Any thread that tries to log (including the main thread's heartbeat) also blocks waiting for the RLock. This commit redirects all logging handler streams to a _QueueStream that puts messages into the same unbounded buffer used by the stdout writer thread. The writer thread is now the ONLY thing that touches sys.__stdout__, so no other thread can block on the pipe. Co-Authored-By: unknown <>

…d handlers Co-Authored-By: unknown <>

…eadlock Previous fix only redirected logging handler streams to the non-blocking buffer queue. This was insufficient because other code paths (print() calls, direct sys.stdout.write()) still wrote to the real pipe and could block when the platform paused reading. This commit adds two additional layers of protection: 1. _StdoutProxy: A proxy object that intercepts write() and flush() calls, routing them through the unbounded buffer queue. All other attribute access (encoding, fileno, buffer, etc.) is delegated to the original stream object. 2. sys.stdout and sys.stderr are replaced with _StdoutProxy instances inside _buffered_write_to_stdout(). This ensures that ANY write from ANY thread goes through the non-blocking buffer. 3. Diagnostic logging via os.write(2, ...) confirms the buffered writer is active and reports how many handlers were redirected and the original stdout/stderr types. The three layers (handler redirection, stdout/stderr replacement, and the writer thread) together ensure no thread ever blocks on the stdout pipe except the dedicated writer thread. Co-Authored-By: unknown <>

…oint Adds periodic diagnostic messages written directly to stderr fd 2 (bypassing all Python buffering) at key points: 1. _consume_from_queue: every 10s logs total_items, total_yields, qsize, and last_get_wait. Also logs SLOW yields (>5s). 2. _buffered_write_to_stdout: every 10s logs msg_count, buffer_qsize, and writer_alive status. These diagnostics will be visible in platform logs even when the stdout pipe is blocked, allowing us to pinpoint exactly where the main thread gets stuck. Co-Authored-By: unknown <>

Co-Authored-By: unknown <>

airbyte_cdk/sources/concurrent_source/concurrent_source.py

Co-Authored-By: unknown <>

…pe backpressure Co-Authored-By: unknown <>

feat: log concurrency configuration at startup for debugging

a4b6051

Co-Authored-By: unknown <>

devin-ai-integration bot mentioned this pull request Feb 27, 2026

chore(source-google-ads): diagnostic logging + application-level timeout fix for Cloud deadlock airbytehq/airbyte#74083

Draft

6 tasks

octavia-bot bot temporarily deployed to DockerHub February 27, 2026 14:08 Inactive

octavia-bot bot temporarily deployed to PyPi February 27, 2026 14:08 Inactive

devin-ai-integration bot added 2 commits February 27, 2026 16:28

feat: enhance concurrency logging to differentiate config vs default …

69fb898

…values Co-Authored-By: unknown <>

style: fix ruff formatting

4ea2fe0

Co-Authored-By: unknown <>

octavia-bot bot temporarily deployed to DockerHub February 27, 2026 16:33 Inactive

octavia-bot bot temporarily deployed to PyPi February 27, 2026 16:33 Inactive

devin-ai-integration bot added 3 commits March 5, 2026 15:14

fix: use sys.stdout.write for concurrency logging to avoid AirbyteMes…

365a339

…sageSerializer mock Co-Authored-By: unknown <>

fix: write concurrency log to stderr to avoid stdout protocol interfe…

d17c822

…rence Co-Authored-By: unknown <>

octavia-bot bot temporarily deployed to PyPi March 5, 2026 15:38 Inactive

octavia-bot bot temporarily deployed to DockerHub March 5, 2026 15:38 Inactive

devin-ai-integration bot changed the title ~~feat: log concurrency configuration at startup for debugging~~ feat: add diagnostic logging and deadlock prevention for concurrent reads Mar 10, 2026

octavia-bot bot temporarily deployed to PyPi March 10, 2026 15:45 Inactive

octavia-bot bot temporarily deployed to DockerHub March 10, 2026 15:45 Inactive

devin-ai-integration bot added 5 commits March 10, 2026 15:58

test: update expected logs to include diagnostic partition logging

240ba4e

Co-Authored-By: unknown <>

style: fix ruff formatting in test scenarios

5f740e4

Co-Authored-By: unknown <>

test: update stream_facade_single_stream expected logs for diagnostic…

2c94cfd

… logging Co-Authored-By: unknown <>

test: remove set_expected_logs from concurrent scenarios (non-determi…

d8065e5

…nistic ordering) Co-Authored-By: unknown <>

octavia-bot bot temporarily deployed to PyPi March 11, 2026 14:44 Inactive

fix: use unbounded stdout buffer to prevent deadlock with high-throug…

059fd36

…hput syncs Co-Authored-By: unknown <>

octavia-bot bot temporarily deployed to DockerHub March 11, 2026 15:49 Inactive

octavia-bot bot temporarily deployed to PyPi March 11, 2026 15:49 Inactive

octavia-bot bot temporarily deployed to DockerHub March 11, 2026 19:12 Inactive

octavia-bot bot temporarily deployed to PyPi March 11, 2026 19:12 Inactive

octavia-bot bot temporarily deployed to DockerHub March 11, 2026 19:12 Inactive

octavia-bot bot temporarily deployed to PyPi March 11, 2026 20:43 Inactive

octavia-bot bot temporarily deployed to DockerHub March 11, 2026 20:43 Inactive

octavia-bot bot temporarily deployed to PyPi March 12, 2026 12:02 Inactive

octavia-bot bot temporarily deployed to DockerHub March 12, 2026 12:02 Inactive

devin-ai-integration bot added 2 commits March 12, 2026 12:16

fix: skip buffered writer in pytest and only redirect stdout-connecte…

05796a5

…d handlers Co-Authored-By: unknown <>

octavia-bot bot temporarily deployed to DockerHub March 12, 2026 13:02 Inactive

octavia-bot bot temporarily deployed to PyPi March 12, 2026 13:02 Inactive

devin-ai-integration bot added 2 commits March 12, 2026 14:32

style: fix ruff format

fd6674d

Co-Authored-By: unknown <>

github-code-quality bot found potential problems Mar 12, 2026

View reviewed changes

airbyte_cdk/sources/concurrent_source/concurrent_source.py Fixed Show fixed Hide fixed

octavia-bot bot temporarily deployed to PyPi March 12, 2026 14:36 Inactive

octavia-bot bot temporarily deployed to DockerHub March 12, 2026 14:36 Inactive

devin-ai-integration bot added 2 commits March 12, 2026 14:37

style: add explanatory comment to empty except in _diag

4c6b4b1

Co-Authored-By: unknown <>

fix: set stderr fd 2 to non-blocking mode to prevent deadlock from pi…

e4ce817

…pe backpressure Co-Authored-By: unknown <>

octavia-bot bot temporarily deployed to PyPi March 12, 2026 15:33 Inactive

octavia-bot bot temporarily deployed to DockerHub March 12, 2026 15:33 Inactive

devin-ai-integration bot mentioned this pull request Mar 12, 2026

feat: make concurrent CDK record queue unlimited to prevent deadlock #951

Draft

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add diagnostic logging, deadlock prevention, and stdout buffering for concurrent reads#925

feat: add diagnostic logging, deadlock prevention, and stdout buffering for concurrent reads#925
devin-ai-integration[bot] wants to merge 24 commits intomainfrom
devin/1772184954-log-concurrency-level

devin-ai-integration bot commented Feb 27, 2026 •

edited

Loading

Uh oh!

devin-ai-integration bot commented Feb 27, 2026

Uh oh!

github-actions bot commented Feb 27, 2026

Testing This CDK Version

PR Slash Commands

Helpful Resources

Uh oh!

github-actions bot commented Feb 27, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Feb 27, 2026 •

edited

Loading

Uh oh!

Anatolii Yatsuk (tolik0) commented Feb 27, 2026 •

edited by github-actions bot

Loading

Uh oh!

Anatolii Yatsuk (tolik0) commented Feb 27, 2026 •

edited by github-actions bot

Loading

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 5, 2026 •

edited by github-actions bot

Loading

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 10, 2026 •

edited by github-actions bot

Loading

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 11, 2026 •

edited by github-actions bot

Loading

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 11, 2026 •

edited by github-actions bot

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

devin-ai-integration bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Deadlock root cause (context)

Updates since last revision

Review & Testing Checklist for Human

Notes

Uh oh!

devin-ai-integration bot commented Feb 27, 2026

🤖 Devin AI Engineer

Uh oh!

github-actions bot commented Feb 27, 2026

👋 Greetings, Airbyte Team Member!

Testing This CDK Version

PR Slash Commands

Helpful Resources

Uh oh!

github-actions bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PyTest Results (Fast)

Uh oh!

github-actions bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PyTest Results (Full)

Uh oh!

Anatolii Yatsuk (tolik0) commented Feb 27, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Anatolii Yatsuk (tolik0) commented Feb 27, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 5, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 10, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 11, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Anatolii Yatsuk (tolik0) commented Mar 11, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

devin-ai-integration bot commented Feb 27, 2026 •

edited

Loading

github-actions bot commented Feb 27, 2026 •

edited

Loading

github-actions bot commented Feb 27, 2026 •

edited

Loading

Anatolii Yatsuk (tolik0) commented Feb 27, 2026 •

edited by github-actions bot

Loading

Anatolii Yatsuk (tolik0) commented Feb 27, 2026 •

edited by github-actions bot

Loading

Anatolii Yatsuk (tolik0) commented Mar 5, 2026 •

edited by github-actions bot

Loading

Anatolii Yatsuk (tolik0) commented Mar 10, 2026 •

edited by github-actions bot

Loading

Anatolii Yatsuk (tolik0) commented Mar 11, 2026 •

edited by github-actions bot

Loading

Anatolii Yatsuk (tolik0) commented Mar 11, 2026 •

edited by github-actions bot

Loading