-
-
Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-143378: Fix UAF when BytesIO is concurrently mutated during write operations
#143408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…er access PyObject_GetBuffer() can execute user code (e.g. via __buffer__), which may close or otherwise mutate a BytesIO object while write() or writelines() is in progress. This could invalidate the internal buffer and lead to a use-after-free. Temporarily bump the exports counter while acquiring the input buffer to block re-entrant mutation, and add regression tests to ensure such cases raise BufferError instead of crashing.
picnixz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check if we also have an exception for the pure Python implementation?
Misc/NEWS.d/next/Library/2026-01-03-19-41-36.gh-issue-143378.29AvE7.rst
Outdated
Show resolved
Hide resolved
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
BytesIO is concurrently mutated during write operations
Hi I can confirm currently we don't have an exception catching the concurrent mutation during a write in Python implementation. Should we keep this behavior consistent between C and Python version? I'd be happy to add this logic check and unit tests as well. A potential patch for python version would be something like: diff --git a/Lib/_pyio.py b/Lib/_pyio.py
index 69a088df8f..5b684c8f46 100644
--- a/Lib/_pyio.py
+++ b/Lib/_pyio.py
@@ -955,6 +955,8 @@ def write(self, b):
raise TypeError("can't write str to binary stream")
with memoryview(b) as view:
n = view.nbytes # Size of any bytes-like object
+ if self.closed:
+ raise ValueError("write to closed file")
if n == 0:
return 0 |
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
|
I have made the requested changes; please review again. |
|
Thanks for making the requested changes! @picnixz: please review the changes made to this pull request. |
I think but let's ask @cmaloney and @serhiy-storchaka about this. Sometimes it's not a good idea to force an exception just for consistency and it may not be useful either (and we apply GIGO paradigm in this case) |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it would be better to call check_closed() and check_exports() after we get a buffer.
Correspondingly, in the Python code we should perform these checks inside the with memoryview(...) block.
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the stream is closed concurrently with writing, BufferError is not an expected exception. You should get a usual exception that you get when trying to write to already closed stream.
Please add also a test for writelines().
It would also be interesting to add a test when the stream buffer is exported concurrently with writing -- in that case BufferError can be an expected exception.
…ython implementation
Sounds great! I've updated the code following the suggestion. It indeed improves consistency and matches the expected behavior. Regarding the unit tests, I've added coverage for concurrent export and close for both |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 👍
You can either merge tests for write() and writelines() since they share the same class (it only needs copying the last two lines). Or leave it as is. Either is fine.
Please add a NEWS entry.
There is other bug in the Python implementation. It is easy to fix, but you can prefer to open a separate issue for it, because the test will be more complicated (__buffer__() needs to return different values).
| n = view.nbytes # Size of any bytes-like object | ||
| if self.closed: | ||
| raise ValueError("write to closed file") | ||
| if n == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest of the code should be moved in the with block.
Otherwise writing to BytesIO after seeking can not just overwrite existing bytes, but move bytes after the current position.
This is a separate bug, so you can open a separate issue and fix it in a separate PR.
| # Prevent crashes when memio.write() concurrently exports 'memio'. | ||
| # See: https://github.com/python/cpython/issues/143378. | ||
| class B: | ||
| buf = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed if you never read it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should just have memio.getbuffer()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to save a reference to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, so _ = memio.getbuffer()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to keep a reference after returning from __buffer__().
Initially I though about a local variable in the outer scope (an initialization would be needed in that case), but an attribute of the written object works as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right. Sorry. Yes an attribute would be just the best :')
PyObject_GetBuffer() can execute user code (e.g. via buffer), which may close or otherwise mutate a BytesIO object while write() or writelines() is in progress. This could invalidate the internal buffer and lead to a use-after-free.
Temporarily bump the exports counter while acquiring the input buffer to block re-entrant mutation, and add regression tests to ensure such cases raise BufferError instead of crashing.
_io.BytesIO.writelinesvia re-entrant__buffer__close #143378