feat(copaw): propagate traces to child agents and suppress duplicate entry spans#164
feat(copaw): propagate traces to child agents and suppress duplicate entry spans#164Cirilla-zmh wants to merge 2 commits into
Conversation
Change-Id: Iadc503014273de1c9455ff6f7518474083dbf1a8 Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Ibfc5a4ad46670bd1bb18bcefcad95f876ebf389d Co-developed-by: Cursor <noreply@cursor.com>
There was a problem hiding this comment.
Pull request overview
This PR enhances the CoPaw OpenTelemetry instrumentation to keep a single trace across parent/child CoPaw agent processes invoked via AgentScope shell tooling, while avoiding duplicate “entry” spans in child processes.
Changes:
- Injects current trace context into
execute_shell_commandsubprocess environments forcopaw agents chat(or via opt-in env flag). - Suppresses
enter_ai_application_systementry span creation in child CoPaw processes and attaches extracted parent context instead. - Adds supporting modules, tests validating propagation/suppression behavior, and updates documentation + changelog.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/src/opentelemetry/instrumentation/copaw/_shell_patch.py | Wraps AgentScope shell execution to inject trace env and mark child processes. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/src/opentelemetry/instrumentation/copaw/patch.py | Updates query_handler wrapper to suppress entry in child mode and attach extracted context. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/src/opentelemetry/instrumentation/copaw/_env_carrier.py | Adds env getter/setter carrier for propagator injection/extraction. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/src/opentelemetry/instrumentation/copaw/_constants.py | Centralizes env var names and child-process detection helper. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/src/opentelemetry/instrumentation/copaw/init.py | Instruments/uninstruments AgentScope execute_shell_command in addition to query_handler. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/tests/test_shell_propagate.py | Tests shell command matching and env injection of TRACEPARENT + child marker. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/tests/test_child_entry_suppression.py | Tests child mode suppresses entry span emission. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/README.md | Documents sub-agent CLI trace continuity and the opt-in forced injection flag. |
| instrumentation-loongsuite/loongsuite-instrumentation-copaw/CHANGELOG.md | Notes the multi-agent trace propagation and child entry suppression feature. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| await asyncio.wait_for(proc.wait(), timeout=timeout) | ||
| stdout, stderr = await proc.communicate() |
| env = _build_subprocess_env() | ||
| try: | ||
| return await _run_shell_command_with_env(command, timeout, env) | ||
| except Exception: | ||
| logger.debug( | ||
| "%s.%s inject path failed; falling back to original", | ||
| _MODULE_SHELL, | ||
| _PATCH_TARGET, | ||
| exc_info=True, | ||
| ) | ||
| return await wrapped(*args, **kwargs) |
| try: | ||
| proc.terminate() | ||
| stdout, stderr = await proc.communicate() | ||
| stdout_str = stdout.decode("utf-8") | ||
| stderr_str = stderr.decode("utf-8") | ||
| if stderr_str: | ||
| stderr_str += f"\n{stderr_suffix}" | ||
| else: | ||
| stderr_str = stderr_suffix | ||
| except ProcessLookupError: | ||
| stdout_str = "" | ||
| stderr_str = stderr_suffix |
| ) | ||
| except Exception: | ||
| logger.debug("Failed to inject trace into env", exc_info=True) | ||
| return merged |
|
This PR has been automatically marked as stale because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 days of this comment. |
|
We need to progress on this so that it won't be closed. |
|
This PR has been automatically marked as stale because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 days of this comment. |
ralf0131
left a comment
There was a problem hiding this comment.
Review by github-manager-bot
Summary
Propagates W3C trace context across CoPaw parent→child processes by wrapping AgentScope's execute_shell_command (injects TRACEPARENT/TRACESTATE + COPAW_OTEL_CHILD_AGENT=1) and, in the child, suppresses the duplicate enter_ai_application_system entry span while attaching the parent context.
Findings
- [Warning]
_shell_patch.py:_run_shell_command_with_env— when the trace-inject path is taken, onlycommandandtimeoutare forwarded to the re-implemented subprocess call. Any additional**kwargsthe upstreamexecute_shell_commandaccepts (e.g.cwd,env,shell) are silently dropped on the inject path (theexceptfallback calls the originalwrapped(*args, **kwargs), but the success path does not). If upstream grows new parameters, the instrumented invocation will quietly lose them. Consider forwarding**kwargs(or at least documenting the supported subset) so the wrapper does not diverge. - [Warning]
_shell_patch.py—_run_shell_command_with_envduplicates the upstream subprocess behavior. This is a maintenance risk: if upstreamexecute_shell_commandchanges its return contract, error handling, or working-directory logic, the duplicate won't track it. Worth a comment linking to the upstream source version it mirrors. - [Info]
_shell_patch.py:_build_subprocess_env— the fail-safe behavior on inject error is good (returnsmergedwithoutCOPAW_OTEL_CHILD_AGENT, so the child neither suppresses entry nor gets partial context). - [Info]
patch.py— child mode correctly gates bothstart_entryand theGeneratorExitentry-cleanup, and thefinallydetaches the context token. Solid.
Suggestions
For the kwargs concern, a minimal safe change:
async def _run_shell_command_with_env(command, env, timeout, **_ignored):
...…or, better, pass through the remaining kwargs to avoid silently dropping them.
Cross-repo Note
None — instrumentation is self-contained. (Related: alibaba/loongsuite-pilot does cross-process collection but is unaffected.)
Automated review by github-manager-bot
Description
What changed
Shell subprocess: trace propagation (
multi_agent_collaboration)When AgentScope’s
execute_shell_commandruns a command that looks like a CoPaw sub-agent chat (copaw+agents+chatin the command string), the instrumentation merges the current trace context into the subprocessenvusing W3CTRACEPARENT/TRACESTATE(and configured propagators), aligned with OpenTelemetry’s environment carrier semantics. It setsCOPAW_OTEL_CHILD_AGENT=1so the child recognizes the role. An opt-inCOPAW_OTEL_INJECT_SHELL_TRACEforces injection for every shell invocation (documented as advanced / risky for non-CoPaw children).Entry span behavior in child processes
AgentRunner.query_handleris updated so that whenCOPAW_OTEL_CHILD_AGENTindicates a child agent process, it does not create a newenter_ai_application_systemspan; it attaches context extracted from the environment so AgentScope and other spans continue in the same trace as the parent.Supporting modules
_env_carrier.py— local environment getter/setter for propagators (mirrors upstream_envcarrierwhere the published wheel may not expose it)._constants.py— names and helpers forCOPAW_OTEL_CHILD_AGENT, inject flags, and child-process detection._shell_patch.py— wrapsexecute_shell_commandto merge env and delegate to the same async shell behavior with explicitenv=.Tests
instrumentation-loongsuite/loongsuite-instrumentation-copaw/tests/test_shell_propagate.py— asserts trace env injection andCOPAW_OTEL_CHILD_AGENTfor matching shell commands (and related behavior).instrumentation-loongsuite/loongsuite-instrumentation-copaw/tests/test_child_entry_suppression.py— asserts child mode suppresses entry creation while attaching propagated context.Documentation
instrumentation-loongsuite/loongsuite-instrumentation-copaw/README.md— new section Sub-agent CLI and trace continuity (multi_agent_collaboration), including baggage /OTEL_PROPAGATORSnotes and advancedCOPAW_OTEL_INJECT_SHELL_TRACEwarning.Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.