-
Notifications
You must be signed in to change notification settings - Fork 659
Fix multi-byte UTF-8 decoding in Executable stream handling #5510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: dmichon-msft <[email protected]>
Co-authored-by: dmichon-msft <[email protected]>
dmichon-msft
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot This change needs a patch changefile for node-core-library.
|
@copilot - Generate a changefile documenting this change. It should be a patch bump. |
Co-authored-by: iclanton <[email protected]>
| for (let i = 0; i < encoded.length; i++) { | ||
| process.stdout.write(encoded.subarray(i, i + 1)); | ||
| // Small delay to ensure each byte is in a separate chunk | ||
| await setTimeout(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to remove this timeout by reducing the chunk size to just one byte?
Fix multi-byte character decoding in Executable.waitForExitAsync
waitForExitAsyncto set encoding on stdout/stderr streamsgetProcessInfoByIdAsyncto set encoding on stdout streamSummary
This PR fixes an issue where
Executable.waitForExitAsyncandgetProcessInfoByIdAsyncincorrectly decoded multi-byte UTF-8 characters when they were split across chunk boundaries from stdout and stderr streams.Root Cause
Calling
chunk.toString(encoding)on individual buffer chunks produces invalid characters when multi-byte UTF-8 sequences are split across chunk boundaries.Solution
Set the stream encoding explicitly using
stream.setEncoding(encoding)before attaching data handlers. When encoding is set on a stream, Node.js automatically uses aStringDecoderinternally which maintains state across chunks and properly handles character boundaries.Changes Made
waitForExitAsync: Set encoding on stdout/stderr streams before attaching handlers (when encoding is not 'buffer')getProcessInfoByIdAsync: Set encoding to 'utf8' on stdout streamTesting
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.