Understanding a bit more of Rust’s type system
In particular: &str and Sized
Take a look at this Iterator expression in Rust:
let strings = vec!["foo", "foobar", "bar"];
let filtered: Vec<_> = strings.into_iter()
    .filter(|string| string.starts_with("foo"))
    .collect();
Okay, so I got it into my head that I don’t like closure syntax in Rust. The |
pipes around the argument list annoyed me yesterday. I prefer the JS/TS way:
let fnSingleArg = string => string.startsWith("foo");
let fnMultiArgs = (string, prefix) => string.startsWith(prefix);
What I’d like to see in Rust might be something similar:
let fn_single_arg = string => string.starts_with("foo");
let fn_multi_args = string, prefix => string.starts_with(prefix);
but also weirder:
// This, i.e. `(a, b)` is a _single_ arg, and it's being unpacked.
let fn_unpack_arg = (a, b) => a + b;
I don’t know exactly; I haven’t given it too much thought. What I want is to optimise for ergonomics and aesthetics[1] around single arguments, including unpacking single arguments, because in the code I write I am predominantly writing closures with single arguments, e.g. for use with Iterator.
That’s the preamble to what I really want to talk about. You see, I thought the iterator example would look nicer with a free starts_with function, which itself returns a function:
let filtered: Vec<_> = strings.into_iter()
    .filter(starts_with("foo"))
    .collect();
Maybe it does look and read better, but my real subject is what I learned by implementing starts_with and how it turned out to be more complex than I could have imagined.
First attempt
pub fn starts_with<T: AsRef<str>>(prefix: &str) -> impl Fn(T) -> bool + '_ {
    move |s| s.as_ref().starts_with(prefix)
}
The AsRef<str> bit means I can call it with &str or String or &String or indeed anything else that implements AsRef<str>.
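As a minimal sketch of what that bound buys, here is a separate, hypothetical helper (has_foo_prefix is my name for illustration, not part of the code above) that accepts all three argument types through the same AsRef<str> bound:

```rust
// Hypothetical helper, for illustration only: generic over anything
// that can be viewed as a `&str`.
fn has_foo_prefix<T: AsRef<str>>(s: T) -> bool {
    s.as_ref().starts_with("foo")
}

fn main() {
    let owned = String::from("foobar");

    assert!(has_foo_prefix("foobar")); // T = &str
    assert!(has_foo_prefix(&owned)); // T = &String
    assert!(has_foo_prefix(owned)); // T = String (moved, so this call goes last)
}
```

The references work because the standard library implements AsRef<U> for &T whenever T implements AsRef<U>.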
But using it in our example iterator expression doesn’t work:
error[E0599]: the method `collect` exists for struct `Filter<IntoIter<&str>, impl Fn(&&str) -> bool>`, but its trait bounds were not satisfied
--> src/strings.rs:10:5
|
10 | let filtered: Vec<_> = strings.into_iter().filter(starts_with_foo).collect();
| ^^^^^^^ method cannot be called on `Filter<IntoIter<&str>, impl Fn(&&str) -> bool>` due to unsatisfied trait bounds
|
::: …/core/src/iter/adapters/filter.rs:19:1
|
19 | pub struct Filter<I, P> {
| ----------------------- doesn't satisfy `_: Iterator`
|
= note: the following trait bounds were not satisfied:
`<impl Fn(&&str) -> bool + '_ as FnOnce<(&&str,)>>::Output = bool`
which is required by `std::iter::Filter<std::vec::IntoIter<&str>, impl Fn(&&str) -> bool + '_>: Iterator`
`impl Fn(&&str) -> bool + '_: FnMut<(&&str,)>`
which is required by `std::iter::Filter<std::vec::IntoIter<&str>, impl Fn(&&str) -> bool + '_>: Iterator`
`std::iter::Filter<std::vec::IntoIter<&str>, impl Fn(&&str) -> bool + '_>: Iterator`
which is required by `&mut std::iter::Filter<std::vec::IntoIter<&str>, impl Fn(&&str) -> bool + '_>: Iterator`
Forcing myself to slow down and read the error message (5 or 6 times) leads me to read the code for the Iterator::filter trait method:
fn filter<P>(self, predicate: P) -> Filter<Self, P>
where
    Self: Sized,
    P: FnMut(&Self::Item) -> bool,
{
    Filter::new(self, predicate)
}
and it dawns on me that starts_with is returning the wrong type of function. Notice P up there: FnMut(&Self::Item) -> bool. It wants a reference to each item, and since the items here are &str, that means the predicate is handed a &&str 🤔
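To see that for yourself, here is a small sketch with the closure argument type written out in full. The keep_foo function is a hypothetical wrapper I've added so the example stands alone:

```rust
// Hypothetical wrapper around the iterator expression from the top of the post.
fn keep_foo(strings: Vec<&str>) -> Vec<&str> {
    strings
        .into_iter()
        // The items are `&str`, so `filter` hands the predicate a `&&str`.
        // Annotating the argument as `&str` instead would not compile.
        .filter(|s: &&str| s.starts_with("foo"))
        .collect()
}

fn main() {
    let filtered = keep_foo(vec!["foo", "foobar", "bar"]);
    assert_eq!(filtered, ["foo", "foobar"]);
}
```

Method calls auto-dereference, which is why s.starts_with(...) still works on a &&str and why the extra reference is easy to miss in everyday closures.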
Second go
pub fn starts_with<T: AsRef<str>>(prefix: &str) -> impl Fn(&T) -> bool + '_ {
    move |s| s.as_ref().starts_with(prefix)
}
Note the changed return type: the function expects a reference.
This works! It compiles in the iterator expression. This is not the time for celebrations though because, weirdly, there’s a hitch when using it as a free function. Passing in explicit references, i.e. &&str and &String, is okay:
starts_with("foo")(&"foobar");
starts_with("foo")(&"foobar".to_string());
But passing in a &str is not okay:
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/strings.rs:11:5
|
11 | starts_with("foo")("foobar");
| ^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `Sized` is not implemented for `str`
note: required by an implicit `Sized` bound in `starts_with`
--> src/strings.rs:3:20
|
3 | pub fn starts_with<T: AsRef<str>>(prefix: &str) -> impl Fn(&T) -> bool + '_ {
| ^ required by the implicit `Sized` requirement on this type parameter in `starts_with`
help: consider relaxing the implicit `Sized` restriction
|
3 | pub fn starts_with<T: AsRef<str> + ?Sized>(prefix: &str) -> impl Fn(&T) -> bool + '_ {
| ++++++++
Relaxing the Sized constraint works – thank you, rustc 🙏 – and gives us:
pub fn starts_with<T: AsRef<str> + ?Sized>(prefix: &str) -> impl Fn(&T) -> bool + '_ {
    move |s| s.as_ref().starts_with(prefix)
}
This does everything I want. I can celebrate! 🎉🥳
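Putting it all together, here is a sketch of the final function next to the call sites from earlier in the post. The comments noting what T is inferred to be are my reading of each call:

```rust
pub fn starts_with<T: AsRef<str> + ?Sized>(prefix: &str) -> impl Fn(&T) -> bool + '_ {
    move |s| s.as_ref().starts_with(prefix)
}

fn main() {
    // In the iterator expression the items are `&str`, so T = &str
    // and the predicate is called with `&&str`.
    let strings = vec!["foo", "foobar", "bar"];
    let filtered: Vec<_> = strings.into_iter()
        .filter(starts_with("foo"))
        .collect();
    assert_eq!(filtered, ["foo", "foobar"]);

    // As a free function:
    assert!(starts_with("foo")("foobar")); // T = str, which needs `?Sized`
    assert!(starts_with("foo")(&"foobar")); // T = &str
    assert!(starts_with("foo")(&"foobar".to_string())); // T = String
    assert!(!starts_with("foo")("bar"));
}
```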
Yet something still bothers me: why did I need to do that + ?Sized thing?
That implicit Sized constraint
The Rust Programming Language says, in Dynamically Sized Types and the Sized Trait:
Let’s dig into the details of a dynamically sized type called str, which we’ve been using throughout the book. That’s right, not &str, but str on its own, is a DST. We can’t know how long the string is until runtime, meaning we can’t create a variable of type str, nor can we take an argument of type str.
and later on:
By default, generic functions will work only on types that have a known size at compile time. However, you can use the following special syntax to relax this restriction:
fn generic<T: ?Sized>(t: &T) {
    // --snip--
}
A trait bound on ?Sized means “T may or may not be Sized” and this notation overrides the default that generic types must have a known size at compile time. The ?Trait syntax with this meaning is only available for Sized, not any other traits.
Also note that we switched the type of the t parameter from T to &T. Because the type might not be Sized, we need to use it behind some kind of pointer. In this case, we’ve chosen a reference.
(Emphasis is mine.)
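Here is a minimal sketch of the book's generic function with a body filled in. The byte_len name and the use of std::mem::size_of_val are my own choices for illustration. The same function accepts unsized types (str, [u8]) and ordinary sized ones, as long as they arrive behind a reference:

```rust
use std::mem::size_of_val;

// Hypothetical example: `?Sized` lets T be `str` or `[u8]`, not just sized types.
fn byte_len<T: ?Sized>(t: &T) -> usize {
    // For unsized types the size is read from the fat pointer at runtime.
    size_of_val(t)
}

fn main() {
    assert_eq!(byte_len("foobar"), 6); // T = str
    assert_eq!(byte_len(&[1u8, 2, 3][..]), 3); // T = [u8]
    assert_eq!(byte_len(&42u32), 4); // T = u32, sized as usual
}
```

Without the ?Sized bound, the first two calls fail with the same E0277 error as above.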
Understanding?
I feel like I understand the reasoning now, and I could apply it again in another situation, but my underlying mental model remains hazy. The kinds of unresolved thoughts going round my head:
- I’ve read that &str is a fat pointer: a pointer to the data and the length of the string slice. Does this mean that Box<str> is also a pointer plus a size?
- Why use &str instead of Box<str>, or vice-versa? I assume it has something to do with Box always referring to heap allocations, whereas &str can point to heap, stack, compiled-in strings, etc.
- The compiler knows the size of literal strings at compile time, so why are they &str?
- Is &str / str treated specially by the compiler?
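A quick experiment makes a start on the first of those questions. This is only a sketch (pointer_sizes is a hypothetical helper), and it reports what current compilers do rather than explaining why:

```rust
use std::mem::size_of;

/// Sizes in bytes of `&str`, `Box<str>`, and `&String`, in that order.
fn pointer_sizes() -> (usize, usize, usize) {
    (size_of::<&str>(), size_of::<Box<str>>(), size_of::<&String>())
}

fn main() {
    let word = size_of::<usize>();
    let (str_ref, boxed_str, string_ref) = pointer_sizes();

    assert_eq!(str_ref, 2 * word); // data pointer + length
    assert_eq!(boxed_str, 2 * word); // also data pointer + length
    assert_eq!(string_ref, word); // String is Sized, so a thin pointer suffices

    // A Box<str> owns its heap allocation but can still be borrowed as &str.
    let boxed: Box<str> = "foobar".into();
    let borrowed: &str = &boxed;
    assert_eq!(borrowed.len(), 6);

    // A byte-string literal keeps its length in the type itself;
    // a string literal is a &'static str, with the length in the pointer.
    let bytes: &[u8; 6] = b"foobar";
    assert_eq!(bytes.len(), 6);
}
```

So yes, Box<str> is two words wide as well. Any pointer to a str has to carry the length alongside the address.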
And that is where I will leave this topic – for today.
[1] Yes, aesthetics. I want to read code easily, and something like .map(|(foo, bar)| { foo().unwrap_or(bar); ... }) is noisy and hard to parse (with my brain 🧠). Subjectively ugly too, but Rust has a smidge of airport prettiness to it anyway.