core::str::{chars_uppercase,chars_lowercase} iterators#98490
core::str::{chars_uppercase,chars_lowercase} iterators#98490markokr wants to merge 1 commit intorust-lang:masterfrom
Conversation
|
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @thomcc (or someone else) soon. Please see the contribution instructions for more information. |
08b407c to
5e2185a
Compare
|
This is a new API, so please file an API change proposal (see the links in the bot post). That said, I'm very much in favor of this, and a quick skim (will need more thorough review once it gets API sign off) looks fine. @rustbot label +T-libs-api -T-libs |
This comment has been minimized.
This comment has been minimized.
5e2185a to
5a32cba
Compare
|
Thanks for quick feedback! Created API proposal here: rust-lang/libs-team#58 |
This comment has been minimized.
This comment has been minimized.
5a32cba to
38404ad
Compare
This comment has been minimized.
This comment has been minimized.
They are based on new UnicodeConverter + UnicodeIterator internal API that supports context-sensitivity and char expansion.
e1fbdd0 to
f5d701f
Compare
There was a problem hiding this comment.
Hmm, the implementation here is more generic than it needs to be, and more complex as a result. You should not do this via a trait IMO -- The way you've decoupled it here also requires more bounds-checks which shouldn't be needed, and avoiding them in this interface would require otherwise-unnecessary unsafe code (to be clear: I'm not asking for you to add unsafe to this).
I think chars_lowercase should be a bit more like what you'd get by manually performing s.chars().flat_map(|c| c.to_lowercase()). Uppercase is the complex one, but should be handled directly, closer to the implementation in liballoc.
Let me know if you need an example of what I mean.
| /// | ||
| /// Default implementation is pass-through, no conversion is done, | ||
| /// with `is_simple = is_ascii`. | ||
| trait UnicodeConverter { |
There was a problem hiding this comment.
I think this shouldn't use a trait. I think this ends up complicating things quite a bit and makes the implementation overly generic.
| // data source | ||
| iter: CharIndices<'a>, | ||
| // buffer for .next() | ||
| fwd: [Option<char>; 2], |
There was a problem hiding this comment.
nit: forward/backward instead of fwd/bwd.
| #![feature(unicode_internals)] | ||
| #![feature(unsize)] | ||
| #![feature(std_internals)] | ||
| #![feature(unicode_converter)] |
There was a problem hiding this comment.
This feature name is too generic, perhaps something more like #![feature(str_chars_casemapped)]?
|
☔ The latest upstream changes (presumably #100644) made this pull request unmergeable. Please resolve the merge conflicts. |
|
Ping from triage: |
They are based on new UnicodeConverter + UnicodeIterator
internal API that supports context-sensitivity and char expansion.
API change proposal: rust-lang/libs-team#58