Rust Closures: Returning `impl Fn` for `move` closures

栏目: IT技术 · 发布时间: 5年前

Problem setup

Recently I've had to write some nom code (it is a parser combinator library for Rust). To my surprise I discovered that there is no combinator for creating a parser which always succeeds returning a certain given value. At least not without using macros which is discouraged in nom v5. That combinator would be something like pure in Haskell Parsec. It's not very useful on its own, but can be used as part of other combinators, say providing a default alternative for alt .

So I decided to add success to nom. After looking at the library code, I realised that it uses closures quite heavily and I didn't use them much in Rust, so I had some questions. Here is my version of success basically copy-pasted from a similar combinator value :

pub fn success > , O : Clone , E : ParseError > ( val : O ) -> impl Fn ( I ) -> IResult

{

move | input : I | {

Ok ( ( input , val . clone ( ) ) )

}

That type signature looks a little scary, eh? However, since we are not going to focus on I or E arguments here (input and error types), we can just rewrite it like this, omitting irrelevant details:

pub fn success ( val : O ) -> impl Fn ( I ) -> IResult

{

move | input : I | {

Ok ( ( input , val . clone ( ) ) )

}

Questions

I had three questions here:

Why do we need to clone val ? After all it looks like l I have a value and just want to pass the ownership to the parser, no need to clone anything.
Why we have move closure, but return type of the function is impl Fn(something) and not impl FnOnce(something) I thought that when we use move then we move the captured environment into the closure and FnOnce trait matches that behaviour.
Can we omit move or change the type to FnOnce or remove Clone i.e. to remove any of those things which I didn't understand and still make it work? Are they actually necessary?

TL;DR move determines how captured variables are moved into the returned closure. Then returned impl Fn/FnMut/FnOnce puts restrictions on how they are used inside that closure (which in turn defines whether the closure can be used once or more). We can move into closure but still only use the captured values by reference and return impl Fn to allow multiple calls of the returned closure. And yes, everything in the code above was necessary :)

More detailed answers

I assume here that you know the basics about closures. If not, you can read a corresponding chapter in the Rust book. Also, on top of that I would recommend reading Steven Donovan's post "Why Rust Closures are (Somewhat) Hard" .

That post (and Rust reference) tells you that a closure basically corresponds to an anonymous structure of some concrete but unknown type which has fields corresponding to the captured variables. The capture mode of those fields (i.e. whether they are &T , &mut T or T ) is determined by the usage of the captured variables inside the closure. Or it can be forced to be T , i.e. to passing the ownership to the closure, by using move keyword.

I'll repeat the code above for your convenience:

pub fn success ( val : O ) -> impl Fn ( I ) -> IResult

{

move | input : I | {

Ok ( ( input , val . clone ( ) ) )

}

So, in our case we implicitly have the following structure for our move closure, it only captures a single variable val of generic type O :

struct ClosureType < O > {

// where it's O, &O or &mut O is defined by

// `val` usages in the closure and presence of `move` keyword

val : O ;

}

Then we know that this closure should implement Fn trait, since this is what is returned from success function. As described in the documentation , it will look like this (note the helpful comment):

impl Fn for ClosureType < O > {

type Output = IResult ;

// This `&self` here is because we implement Fn, not FnOnce!

// In FnOnce it would be `self`, owning the structure fields.

fn call ( & self , input : I ) -> Output {

Ok ( ( input , self . val . clone ( ) ) ) ;

}

Now we can answer our question 1 : why do we have to clone? If we didn't clone, we would be moving out of self.val which is behind a reference. So we can't do that for Fn .

Another way of resolving this issue (i.e. if we don't want to clone) would be to use FnOnce as the result type which would in effect give us self instead of &self in the call method, so we can pass the ownership further with Ok((input, self.val)) . However, using FnOnce means that we can use the resulting closure just once, perhaps that's not we want in a parser. Why? I don't know for sure, but I suspect that we may want to do certain lookaheads while parsing and this means invoking the closure and then backtracking if it doesn't match. But this mean we would have spoiled the parser closure and can't use it again for another run. So let's assume that our parsers should be Fn s which is indeed Nom's API.

The question 2: how move can coexist with returning Fn is also clear now. move determines how values are captured into that closure-structure, i.e. which type they have there (refs, mutable refs or owned values) while Fn/FnOnce/FnMut trait is determined by the way they are used in that closure (in our call method above). For example, we can move into a closure but still only use the captured values by reference and return impl Fn to allow multiple calls of the returned closure.

The remaining part of question 3 is why we want to use move closure here if we only access the variable by reference anyway. The explanation is that we are returning this closure, but val will be dropped immediately on return and the closure can't outlive it. In other words, if we didn't use move , we would have val: &'a O in that closure structure field. But that reference would immediately become invalid since when we return our closure, val is dropped, so no references to that are allowed, including inside our returned closure. So that won't work and we would get "val does not live long enough" error message.

Therefore, it looks like all the bells and whistles in that success combinator were actually needed.

Summary

I think the main conclusion of this investigation is that explicitly writing out implicit structures and Fn/FnMut/FnOnce implementations is very useful for making sense of compiler errors and understanding what's going on under the hood in the world of closures. At least for the first time.

创业无畏

彼得· 戴曼迪斯、史蒂芬· 科特勒 / 贾拥民 / 浙江人民出版社 / 2015-8 / 69.90元

 您是否有最大胆的商业梦想？您是否想把一个好主意快速转化为一家市值几百亿甚至几千亿元的公司？《创业无畏》不仅分享了成功创业家的真知灼见，更为我们绘制了一幅激情创业的行动路线图！  创业缺人手怎么办？如何解决钱的问题？把握指数型大众工具，互联网就是你车间，你的仓库。拥有好的创意，自然有人把钱“白白地送给你用”。当你大海捞针的时候，激励性大奖赛会让针自己跑到你的眼前来！  掌握指数级......一起来看看《创业无畏》这本书的介绍吧!

码农工具

Rust Closures: Returning `impl Fn` for `move` closures

创业无畏

图片转BASE64编码

Markdown 在线编辑器

HEX HSV 转换工具