-
Notifications
You must be signed in to change notification settings - Fork 23
cmp: stop allocating for byte printing #153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #153 +/- ##
==========================================
+ Coverage 86.25% 86.27% +0.01%
==========================================
Files 13 13
Lines 6222 6228 +6
Branches 519 514 -5
==========================================
+ Hits 5367 5373 +6
Misses 854 854
Partials 1 1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/cmp.rs
Outdated
|
|
||
| // SAFETY: the checks and shifts we do above match what cat and GNU | ||
| // Add right-padding spaces | ||
| let padding = width.saturating_sub(display_width); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could use extend_from_slice with a constant array instead of a loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! This and the other comment made me realize I also don't need to get a parameter here at all for width - I initially did that when investigating the possibility of sharing code with other tools, but for cmp it's always 4.
| quoted.push(b'M'); | ||
| quoted.push(b'-'); | ||
| byte -= 128; | ||
| fn write_visible_byte(output: &mut Vec<u8>, byte: u8) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write_visible_byte returns usize, but write_visible_byte_padded returns ()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because we need to know how many bytes were actually written by write_visible_bytes() in write_visible_byte_padded() to know how many spaces we need to pad with. But other than that we should not need the return value. If we eventually make this more generic we could make both return usize, but otherwise I don't think there is a need?
I looked into how much I could share between cmp, cat and sort, but ended up with too little shared to justify the complexity of moving stuff around and adding a dependency to uucore, btw.
|
some minor suggestions |
This makes verbose comparison of 37MB completely different files 2.34x
faster than our own baseline, putting our cmp at almost 6x faster than
GNU cmp (/opt/homebrew/bin/cmp) on my M4 Pro Mac. The output remains
identical to that of GNU cmp. Mostly equal and smaller files do not
regress.
Benchmark 1: ./bin/baseline/diffutils cmp -lb t/huge t/eguh
Time (mean ± σ): 1.669 s ± 0.011 s [User: 1.594 s, System: 0.073 s]
Range (min … max): 1.654 s … 1.689 s 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: ./target/release/diffutils cmp -lb t/huge t/eguh
Time (mean ± σ): 714.2 ms ± 4.1 ms [User: 629.3 ms, System: 82.7 ms]
Range (min … max): 707.2 ms … 721.5 ms 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 3: /opt/homebrew/bin/cmp -lb t/huge t/eguh
Time (mean ± σ): 4.213 s ± 0.050 s [User: 4.128 s, System: 0.081 s]
Range (min … max): 4.160 s … 4.316 s 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 4: /usr/bin/cmp -lb t/huge t/eguh
Time (mean ± σ): 3.892 s ± 0.048 s [User: 3.819 s, System: 0.070 s]
Range (min … max): 3.808 s … 3.976 s 10 runs
Warning: Ignoring non-zero exit code.
Summary
./target/release/diffutils cmp -lb t/huge t/eguh ran
2.34 ± 0.02 times faster than ./bin/baseline/diffutils cmp -lb t/huge t/eguh
5.45 ± 0.07 times faster than /usr/bin/cmp -lb t/huge t/eguh
5.90 ± 0.08 times faster than /opt/homebrew/bin/cmp -lb t/huge t/eguh
This makes verbose comparison of 37MB completely different files 2.34x faster than our own baseline, putting our cmp at almost 6x faster than GNU cmp (/opt/homebrew/bin/cmp) on my M4 Pro Mac. The output remains identical to that of GNU cmp. Mostly equal and smaller files do not regress.