-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
ptx: handle invalid regex arguments gracefully instead of panicking #9825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
d829547 to
30a281f
Compare
30a281f to
07c63a1
Compare
07c63a1 to
237d483
Compare
|
GNU testsuite comparison: |
| filter: &WordFilter, | ||
| file_map: &FileMap, | ||
| ) -> UResult<BTreeSet<WordRef>> { | ||
| let reg = Regex::new(&filter.word_regex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a branch with all of your ptx PR's to see if it passed the GNU tests and this part failed since when an invalid regex is provided its expected to print to stderr and return empty with a success code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To solve this I tested with a:
/// Try to compile a regex, printing a warning and returning None on failure.
/// This matches GNU ptx behavior which handles invalid regex gracefully.
fn try_compile_regex(pattern: &str) -> Option<Regex> {
Regex::new(pattern)
.inspect_err(|e| show_error!("{}", translate!("ptx-error-invalid-regexp", "error" => e)))
.ok()
}
So that the error handling for the create word set looks like this:
let Some(reg) = try_compile_regex(&filter.word_regex) else {
return Ok(BTreeSet::new());
};
let Some(ref_reg) = try_compile_regex(&config.context_regex) else {
return Ok(BTreeSet::new());
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually even though this will pass the GNU tests, I am not finding any cases where the Regex fails to compile and it provides a warning, will mainly just need to succeed those with an empty list
| let mut file_map: FileMap = HashMap::new(); | ||
| let mut offset: usize = 0; | ||
|
|
||
| let sentence_splitter = if let Some(re_str) = &config.sentence_regex { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same idea here:
let sentence_splitter = config
.sentence_regex
.as_ref()
.and_then(|re_str| try_compile_regex(re_str));
In GNU the expectation is to succeed while printing the error to stderr
| fn test_invalid_regex_word_unclosed_group() { | ||
| new_ucmd!() | ||
| .args(&["-W", "(wrong"]) | ||
| .fails_with_code(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only other follow up would be to change these to reflect the success message, but the error message can stay the same
|
Whats going to be tricky here is that some of the testing to compare the two implementations is going to not work because whats considered invalid to rust's regex does not line up 100% with the GNU regex. Some examples: bar\ and [abc GNU allows trailing backslashes, but uutils would show a warning for bar\ whereas if theres an unmatched [ it fails completely for gnu ptx but would succeed with an error for the uutils implementation. I think for the time being it would be best to just have this as an open issue to track some of the edge cases and merge the PR's you have with the fixes that pass all of the test cases and then the deviation with GNU can be handled when uutils figures out what it wants to do about the whole regex library situation |
Description
This PR fixes a panic in
ptxwhen an invalid regular expression is passed to the-W(--word-regexp) argument.Previously, running:
ptx -W 'bar\\\'would result in a Rust panic (
called Result::unwrap() on an Err value).This change catches the
regex::Error, wraps it in aUSimpleError, and ensures the program exits with code 1 and prints a user-friendly message to stderr.Testing
tests/by-util/test_ptx.rsto verify:bar\) return exit code 1.(wrong) return exit code 1.