Optimise new encoding functions #1306
base: master
Conversation
The decode methods call into other encodings. For example, Utf8::decode calls Utf16::getCharCount and Utf16::encode in a loop. Placing them in the same file makes it easier for the compiler to inline them, which improves performance.
The existing Utf8::encode function takes a caller-supplied buffer, but since we don't know what size is required we have to iterate through the string before writing to make sure the buffer is big enough. If the caller already ran getByteCount, this duplicates their work just to guard against the case where they did not size the buffer properly. The new method allocates its own buffer that is guaranteed to be the right size, which avoids the unnecessary checks.
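A sketch of the measure-once, allocate-exactly approach being described. The names (utf8ByteCount, utf8Encode) are illustrative rather than the real hxcpp API; the idea is that because the function owns its buffer and sizes it from its own byte count, no defensive re-checking of a caller-supplied buffer is needed:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical counterpart of getByteCount: UTF-8 bytes needed for a
// sequence of code points.
static size_t utf8ByteCount(const std::vector<uint32_t>& codepoints) {
    size_t n = 0;
    for (uint32_t cp : codepoints)
        n += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    return n;
}

// Append one code point as UTF-8.
static void utf8EncodeOne(uint32_t cp, std::string& out) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// New-style encode: measure once, allocate exactly once, then write.
std::string utf8Encode(const std::vector<uint32_t>& codepoints) {
    std::string out;
    out.reserve(utf8ByteCount(codepoints)); // exact size, single allocation
    for (uint32_t cp : codepoints)
        utf8EncodeOne(cp, out);
    return out;
}
```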
Just did a quick bit of benchmarking with a "lorem ipsum"-like file full of Chinese characters, running 10k iterations with these changes. This was my benchmark code for reference:

```haxe
import sys.io.File;
import haxe.Timer;
import haxe.io.Bytes;

class Main {
    static function main() {
        // final bytes = File.getBytes("C:\\Users\\AidanLee\\Desktop\\test\\lorem.txt");
        final string = File.getContent("C:\\Users\\AidanLee\\Desktop\\test\\lorem.txt");

        Timer.measure(() -> {
            for (_ in 0...10000) {
                final _ = Bytes.ofString(string);
                // final _ = bytes.toString();
            }
        });
    }
}
```
On length estimation, the one downside is strings that are actually just under the "large object" size threshold of 2k bytes but whose estimated size is above it. Those strings would go through the large-object allocator rather than the thread-local one.
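A minimal sketch of that tradeoff, with an assumed 2k threshold constant and illustrative names (the real allocator logic lives in hxcpp's GC, not here). A worst-case estimate of 3 UTF-8 bytes per UTF-16 code unit is always safe, but can push a string that would really fit under the threshold into the large-object path:

```cpp
#include <cassert>
#include <cstddef>

// Worst-case UTF-8 size for a UTF-16 string: each code unit expands to at
// most 3 bytes (a surrogate pair is 2 units -> 4 bytes, still <= 3/unit).
static size_t utf8WorstCase(size_t utf16Units) {
    return utf16Units * 3;
}

// Assumed threshold from the discussion: allocations of 2k bytes or more
// bypass the thread-local allocator.
static const size_t LARGE_OBJECT_THRESHOLD = 2048;

static bool wouldUseLargeAllocator(size_t requestedBytes) {
    return requestedBytes >= LARGE_OBJECT_THRESHOLD;
}
```

For example, a 1000-character ASCII string needs only 1000 UTF-8 bytes (thread-local path), but its worst-case estimate of 3000 bytes sends the allocation through the large-object path.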
Yeah, this is where the tradeoff lies.
For some reason, it's the other way around for me (ofString vs toString).
Hopefully this counteracts the performance regression seen after they started being used for haxe.io.Bytes, see: #1294 (comment)
This is lighter than #1302 since it doesn't introduce an external library. It also addresses the problem from a different angle, by avoiding inefficiencies in the surrounding code, rather than making the encoding/decoding itself as fast as possible. Even if we decide to use simdutf I think it would be worth making some of these changes anyway, as some of them would also benefit the simdutf implementation.
These are the changes:
With this I see an improvement locally in the dox benchmark compared to master (1.3% vs 5.5% time spent in Bytes.ofString).
Expand for flamegraphs
Download as svg and use search bar to search for "Bytes_obj::ofString".
Unpatched:
Patched: