fix: handle non-JSON-serializable types in serialization fallback (AI-Triage PR) #954

Draft
devin-ai-integration[bot] wants to merge 2 commits into main from
devin/1773523110-fix-serialization-fallback-complex-types

devin-ai-integration bot commented Mar 14, 2026

Summary

Adds a second fallback in airbyte_message_to_string() to prevent unhandled exceptions (and consequent deadlocks in the concurrent source pipeline) when record data contains types that neither orjson nor stdlib json can serialize.

Problem: The existing fallback chain is orjson.dumps() → json.dumps(). But json.dumps() can also raise TypeError for types such as Python complex numbers. When this unhandled exception occurs in a worker thread of the concurrent source pipeline, the worker dies silently and the main thread deadlocks waiting on queue.get().
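
The failure mode can be reproduced in miniature. The sketch below is not the CDK's concurrent pipeline: `worker` and `run_demo` are hypothetical names, and a timeout on `queue.get()` stands in for the indefinite wait so that the demo terminates. It only illustrates that json.dumps() raises TypeError on a complex value and that a consumer never receives anything once the producing thread has died.

```python
import json
import queue
import threading


def worker(records, out_queue):
    # Hypothetical producer: serializes each record and enqueues the result.
    for record in records:
        out_queue.put(json.dumps(record))  # raises TypeError for complex values
    out_queue.put(None)  # end-of-stream sentinel, never reached on failure


def run_demo():
    errors = []
    previous_hook = threading.excepthook
    # Record the uncaught exception type instead of printing a traceback.
    threading.excepthook = lambda args: errors.append(args.exc_type.__name__)
    try:
        out_queue = queue.Queue()
        thread = threading.Thread(
            target=worker, args=([{"ctr": complex(1, 2)}], out_queue)
        )
        thread.start()
        thread.join()
        try:
            # Without the timeout this call would block indefinitely.
            out_queue.get(timeout=0.1)
            outcome = "received"
        except queue.Empty:
            outcome = "nothing arrived"
    finally:
        threading.excepthook = previous_hook
    return errors, outcome
```

Calling `run_demo()` returns the captured exception type together with the consumer's outcome; a blocking `get()` with no timeout would hang at the same point.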

Fix: Wrap the json.dumps() fallback in its own try/except, falling back to json.dumps(serialized_message, default=str) which converts any remaining non-serializable values to their string representation.
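
A minimal sketch of the resulting three-stage chain, assuming the structure described above. `to_json_string` is a hypothetical stand-in and not the actual body of airbyte_message_to_string(); the warning text is invented for illustration, and orjson is treated as optional so the sketch runs with the standard library alone.

```python
import json
import logging

try:
    import orjson  # third-party; optional in this sketch
except ImportError:
    orjson = None

logger = logging.getLogger(__name__)


def to_json_string(serialized_message: dict) -> str:
    """Hypothetical stand-in for the fallback chain (not the CDK function)."""
    try:
        if orjson is None:
            # Assumption for this sketch only: treat a missing orjson as a failure.
            raise TypeError("orjson is not installed")
        return orjson.dumps(serialized_message).decode()
    except Exception as exc:
        logger.warning("orjson serialization failed (%s); using json", exc)
        try:
            # First fallback: stdlib json, which still rejects e.g. complex.
            return json.dumps(serialized_message)
        except Exception:
            # Last resort: stringify whatever json cannot encode.
            return json.dumps(serialized_message, default=str)
```

With this shape, a record such as `{"ctr": complex(1, 2)}` serializes to `{"ctr": "(1+2j)"}` instead of raising.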

This is a last-resort path — it only activates when both orjson and stdlib json fail on the same message.

Resolves https://github.com/airbytehq/oncall/issues/11654

Related: airbytehq/airbyte#74883

Review & Testing Checklist for Human

  • Silent data coercion is acceptable: The default=str fallback silently converts non-serializable values to their string representations (e.g., complex(1,2) → "(1+2j)"). No additional warning is logged when this second fallback fires; only the original orjson warning is emitted. Decide whether this is acceptable or whether an extra log line should be added.
  • Broad except Exception is appropriate here: The inner catch is intentionally broad since this is a last-resort to prevent deadlocks. Confirm this doesn't mask errors that should propagate.
  • Verify the fix addresses the reported deadlock scenario: The original issue reports deadlocks in source-google-search-console with complex type values in ctr/position fields. The test covers this path but an integration-level verification with the actual connector would provide higher confidence.
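
If reviewers prefer an extra log line for the first item, one possible shape is a `default` callback that warns before stringifying. This is only a sketch of the option raised in the checklist, not part of this PR; `stringify_with_warning`, `dumps_last_resort`, and the logger name are hypothetical.

```python
import json
import logging

logger = logging.getLogger("serialization_sketch")  # hypothetical logger name


def stringify_with_warning(value):
    # Called by json.dumps only for values it cannot encode natively.
    logger.warning(
        "Coercing %s value to str during serialization", type(value).__name__
    )
    return str(value)


def dumps_last_resort(payload):
    # Same effect as default=str, plus one warning per coerced value.
    return json.dumps(payload, default=stringify_with_warning)
```

A per-value warning could be noisy on large syncs, so a real implementation might log only once per process.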

Suggested test plan: Run source-google-search-console with the search_analytics_by_query stream against a real account. Confirm the sync completes without deadlock and that ctr/position values are present (even if stringified as a fallback).

Notes

Link to Devin session: https://app.devin.ai/sessions/d8317c1f4ce64f70b5e807425b72aca2

The json.dumps() fallback in airbyte_message_to_string() could also fail
for types that neither orjson nor stdlib json can serialize (e.g. complex
numbers), causing an unhandled exception that leads to deadlocks in the
concurrent source pipeline.

Add a second fallback using json.dumps(default=str) to ensure
serialization never raises an unhandled exception.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1773523110-fix-serialization-fallback-complex-types#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1773523110-fix-serialization-fallback-complex-types

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerates git-committed build artifacts, such as the pydantic models generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

github-actions bot commented Mar 14, 2026

PyTest Results (Fast)

3 935 tests (+1): 3 923 passed (+1), 12 skipped (±0), 0 failed (±0)
1 suite (±0), 1 file (±0); duration 6m 42s (-27s)

Results for commit 8320500. ± Comparison against base commit 0e57414.

♻️ This comment has been updated with latest results.

github-actions bot commented Mar 14, 2026

PyTest Results (Full)

3 938 tests (+1): 3 926 passed (+1), 12 skipped (±0), 0 failed (±0)
1 suite (±0), 1 file (±0); duration 11m 37s (+23s)

Results for commit 8320500. ± Comparison against base commit 0e57414.

♻️ This comment has been updated with latest results.
