ACP: split_pattern on slices #457

@eduardorittner

Proposal

Add a split_pattern method on slices. It is similar to split on slices, but takes a pattern slice instead of a closure and splits the receiver at every occurrence of that pattern.

Problem statement

Inspiration was taken from rust-lang/rust#49036, which suggests generalizing the existing str::split into a slice::split_pattern implemented for any T: PartialEq, and possibly a Vec::split_pattern as well.

Motivating examples or use cases

From the original issue: suppose you have a Vec<u8> of non-UTF-8 data and you want to split it on newlines. It would be very nice to be able to simply write my_vec.split_pattern(b"\n") instead of my_vec.split(|x| *x == b'\n'). The closure passed to split gets even clunkier when you want to match multi-element patterns, since split calls the closure on one element at a time, not on a slice.
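
For comparison, here is a rough sketch of what splitting on a multi-element pattern can look like today without split_pattern. The helper names find_subslice, split_lf and split_crlf are made up for this illustration and are not part of std; only slice::split, slice::windows and Iterator::position are existing APIs.

```rust
/// Hypothetical helper: index of the first occurrence of `needle` in `haystack`.
pub fn find_subslice<T: PartialEq>(haystack: &[T], needle: &[T]) -> Option<usize> {
    if needle.is_empty() {
        // `windows(0)` panics, so an empty needle is treated as "not found" here.
        return None;
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Single-element separator: the existing closure-based `split` is enough.
pub fn split_lf(data: &[u8]) -> Vec<&[u8]> {
    data.split(|b| *b == b'\n').collect()
}

/// Two-element separator: `split` only sees one element at a time,
/// so the search has to be written by hand.
pub fn split_crlf(data: &[u8]) -> Vec<&[u8]> {
    let mut parts = Vec::new();
    let mut rest = data;
    while let Some(i) = find_subslice(rest, b"\r\n") {
        parts.push(&rest[..i]);
        rest = &rest[i + 2..]; // skip the two separator bytes
    }
    parts.push(rest); // trailing piece, possibly empty
    parts
}
```

The manual loop in split_crlf is the part a split_pattern method would replace.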

Solution sketch

I didn't know about ACPs, so I opened a PR before filing this, with an idea of how it could be implemented for slices. It uses the following struct:

pub struct SplitPattern<'a, 'b, T>
where
    T: cmp::PartialEq,
{
    v: &'a [T],
    pattern: &'b [T],
    finished: bool,
}

and the most important method:

impl<'a, 'b, T> Iterator for SplitPattern<'a, 'b, T>
where
    T: cmp::PartialEq,
{
    type Item = &'a [T];

    #[inline]
    fn next(&mut self) -> Option<&'a [T]> {
        if self.finished {
            return None;
        }

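        // Scan for the first position at which the pattern occurs.
        for i in 0..self.v.len() {
            if self.v[i..].starts_with(self.pattern) {
                // Yield everything before the match and continue after it.
                let (left, right) = (&self.v[0..i], &self.v[i + self.pattern.len()..]);
                let ret = Some(left);
                self.v = right;
                return ret;
            }
        }
        // No match left. `finish` is not shown here; it is expected to yield
        // the remaining slice once and set `finished`.
        self.finish()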
    }
}

next_back would be implemented similarly using ends_with instead of starts_with.
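
To make the sketch above self-contained, the following is one possible way the pieces could fit together. It is an illustration under several assumptions, not the code from the PR: the finish helper is assumed to behave like the one on the existing slice Split iterator, the SplitPatternExt extension trait is hypothetical and only exists so the example compiles outside of core, the search uses windows with position/rposition rather than starts_with/ends_with, and an empty pattern is rejected with a panic because the loop above would otherwise keep matching at index 0 without consuming any input.

```rust
pub struct SplitPattern<'a, 'b, T: PartialEq> {
    v: &'a [T],
    pattern: &'b [T],
    finished: bool,
}

impl<'a, 'b, T: PartialEq> SplitPattern<'a, 'b, T> {
    pub fn new(v: &'a [T], pattern: &'b [T]) -> Self {
        // Assumption of this sketch: empty patterns are not supported.
        assert!(!pattern.is_empty(), "pattern must not be empty");
        SplitPattern { v, pattern, finished: false }
    }

    /// Yields the remaining slice exactly once, then marks the iterator done.
    fn finish(&mut self) -> Option<&'a [T]> {
        if self.finished {
            None
        } else {
            self.finished = true;
            Some(self.v)
        }
    }
}

impl<'a, 'b, T: PartialEq> Iterator for SplitPattern<'a, 'b, T> {
    type Item = &'a [T];

    fn next(&mut self) -> Option<&'a [T]> {
        if self.finished {
            return None;
        }
        let (v, pattern) = (self.v, self.pattern);
        // First match from the front; `windows` is empty if `v` is too short.
        match v.windows(pattern.len()).position(|w| w == pattern) {
            Some(i) => {
                self.v = &v[i + pattern.len()..];
                Some(&v[..i])
            }
            None => self.finish(),
        }
    }
}

impl<'a, 'b, T: PartialEq> DoubleEndedIterator for SplitPattern<'a, 'b, T> {
    fn next_back(&mut self) -> Option<&'a [T]> {
        if self.finished {
            return None;
        }
        let (v, pattern) = (self.v, self.pattern);
        // Last match, i.e. the first one when searching from the back.
        match v.windows(pattern.len()).rposition(|w| w == pattern) {
            Some(i) => {
                self.v = &v[..i];
                Some(&v[i + pattern.len()..])
            }
            None => self.finish(),
        }
    }
}

/// Hypothetical extension trait standing in for an inherent slice method.
pub trait SplitPatternExt<T: PartialEq> {
    fn split_pattern<'a, 'b>(&'a self, pattern: &'b [T]) -> SplitPattern<'a, 'b, T>;
}

impl<T: PartialEq> SplitPatternExt<T> for [T] {
    fn split_pattern<'a, 'b>(&'a self, pattern: &'b [T]) -> SplitPattern<'a, 'b, T> {
        SplitPattern::new(self, pattern)
    }
}
```

With the trait in scope, my_vec.split_pattern(b"\r\n") also works on a Vec<u8>, since method calls on a Vec auto-deref to the slice. One caveat of this sketch: when matches can overlap, iterating from the back does not necessarily yield the reverse of iterating from the front. Splitting [1, 1, 1] on [1, 1] yields an empty slice followed by [1] in both directions.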

The implementation for Vec would be pretty similar I think.

Alternatives

I read about SlicePattern; however, the source comments mention generalising core::str::Pattern, so I wasn't sure whether I should use it. I also thought that doing so would be a little out of my range.

Links and related work

Labels: T-libs-api, api-change-proposal