Building An Intuition for Pattern Matching

Category: IT Technology · Published: 3 years ago


What's the point of pattern matching if we already have conditionals and variable assignment in a language?

Pattern matching helps tease apart values and construct control flow using the shape of data rather than bespoke logic, methods on types, or special fields on a struct. For example, in languages that don't have first-class support for sum types (called enums in Rust), you'd have to encode the variant as a unique tag on something like a struct, e.g.,

struct Option {
    tag: String, // maybe one of 'Some' or 'None'.
    // and so on.
}

Then the tag field can be checked by traditional control flow. This is precisely how it is done in languages like TypeScript, but in Rust, where sum types are supported, we have no unique tag field to check and, since the compiler hides this information away from us, we can't write a method to describe which variant we have in our hands. It would be a bit clumsy if the compiler generated methods for us as we might want to have methods with the same name!
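To make the contrast concrete, here is a minimal sketch that puts a hand-rolled tagged struct next to a real enum. The names TaggedOption, describe_tagged, and describe_enum are hypothetical and exist only for illustration; the value field is likewise an assumption about what "and so on" might hold.

```rust
// Hypothetical hand-rolled encoding: the variant is stored in a string tag.
struct TaggedOption {
    tag: String, // expected to be "Some" or "None"
    value: i64,  // only meaningful when tag == "Some"
}

// With a tag, ordinary control flow inspects the field, and nothing
// stops a caller from passing an unexpected tag.
fn describe_tagged(o: &TaggedOption) -> String {
    if o.tag == "Some" {
        format!("some {}", o.value)
    } else if o.tag == "None" {
        String::from("none")
    } else {
        String::from("invalid tag")
    }
}

// With an enum, the pattern names the variant and pulls the payload out.
fn describe_enum(o: Option<i64>) -> String {
    match o {
        Some(value) => format!("some {}", value),
        None => String::from("none"),
    }
}

fn main() {
    let tagged = TaggedOption { tag: String::from("Some"), value: 3 };
    println!("{}", describe_tagged(&tagged));
    println!("{}", describe_enum(Some(3)));
}
```

The enum version has no "invalid tag" branch to write, because the set of variants is closed and known to the compiler.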

Any kind of syntactic sugar used to construct a value is known as a constructor, such as building values for structs, enums, tuples, and so on. Pattern matching gives us a way to describe the shape of data using constructors to match on and what to do if the value matches. This analogy isn't perfect, but I like to think of patterns as mirrors with outlines; if the reflection matches the outline of a constructor, we go down that path of logic, possibly with some new values drawn out of the data. Here are some common patterns for constructors:

pub struct S {
    field: i64,
}

pub enum E {
    FirstVariant,
    SecondVariant,
}

pub fn main() {
    // Tuples.
    let a = ("Fizz", "Buzz");
    match a {
        (p, q) => println!("{}{}", p, q),
    }

    // Numeric literals.
    let b = 123;
    match b {
        std::i32::MIN..=99 => println!("under one-hundred"),
        100 => println!("exactly one-hundred"),
        101..=std::i32::MAX => println!("above one-hundred"),
    }

    // Strings.
    let c = "A string.";
    match c {
        "A string." => println!("it's _the_ string."),
        _ => println!("some other string."),
    }

    // Enums.
    let x = E::SecondVariant;
    match x {
        E::FirstVariant => println!("first variant of E"),
        E::SecondVariant => println!("second variant of E"),
    }

    // Structs.
    let y = S { field: 100 };
    match y {
        S { field } => println!("field is: {}", field),
    }

    // Slices.
    let z = vec![1, 2, 3];
    match *z { // we need * to dereference Vec to a slice.
        [a, b] => println!("{} + {} = {}", a, b, a + b),
        [a, b, c] => println!("{} + {} * {} = {}", a, b, c, a + b * c),
        _ => println!("any other unmatched vector"),
    }
}


Where can we put patterns?

match is the traditional way of doing pattern matching but not the only way. Match arms are tried top-to-bottom, and the compiler ensures that every case is handled, which is known as exhaustivity checking. For example, consider:

enum Val {
    Integer(i64),
    Float(f64),
}

fn main() {
    let v = Val::Integer(12);
    match v {
        Val::Integer(x) => println!("It's an integer: {}", x), // one "arm" or "case"
        // without anything else, this is non-exhaustive; it doesn't include Val::Float!
    }
}

which fails to compile with the following error:

error[E0004]: non-exhaustive patterns: `Float(_)` not covered
 --> src/main.rs:8:11
  |
1 | / enum Val {
2 | |     Integer(i64),
3 | |     Float(f64),
  | |     -- not covered
4 | | }
  | |_- `Val` defined here
...
8 |       match v {
  |             ^ pattern `Float(_)` not covered
  |
  = help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms

error: aborting due to previous error
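One way to satisfy the check is to add the missing arm. A minimal sketch, where describe is a hypothetical helper that returns a string instead of printing:

```rust
enum Val {
    Integer(i64),
    Float(f64),
}

// Hypothetical helper: every variant has an arm, so the match is exhaustive.
// A trailing `_ => ...` arm would also satisfy the compiler, at the cost of
// silently accepting any variant added to `Val` later.
fn describe(v: Val) -> String {
    match v {
        Val::Integer(x) => format!("It's an integer: {}", x),
        Val::Float(x) => format!("It's a float: {}", x),
    }
}

fn main() {
    println!("{}", describe(Val::Integer(12)));
    println!("{}", describe(Val::Float(1.5)));
}
```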


Pattern matches in let bindings and function arguments will also work but must be irrefutable, which is a fancy way of saying that the pattern can never fail. Any pattern that covers all possible values of a type is irrefutable. It could be a literal pattern, such as a range that spans every value of the type, or a variable, which always captures a value and therefore always matches.

// works.
pub fn f((x, y): (i32, i32)) -> i32 {
    x + y
}

// does not work.
//pub fn g((1, 2): (i32, i32)) {
// fails on anything other than g(1, 2).
// the compiler rejects this as a refutable pattern
// which is in place where only an irrefutable pattern can be.
//}

pub fn main() {
    //let 12 = 12; // fails.
    let x = 12; // succeeds.
    f((x, x));
    let std::i32::MIN..=std::i32::MAX = 12; // succeeds, covers all values.
}
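Refutable patterns are not banned outright; they need a construct that says what happens when the match fails, such as if let or while let. A small sketch, with drain_sum and first_if_second_is_zero as hypothetical helper names:

```rust
// Hypothetical helper: sums the values on a stack by popping until empty.
fn drain_sum(mut stack: Vec<i32>) -> i32 {
    let mut total = 0;
    // `Some(top)` is refutable: `pop` returns `None` once the stack is
    // empty, which ends the loop.
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

// Hypothetical helper: returns the first element of a pair only when the
// second element is exactly zero.
fn first_if_second_is_zero(pair: (i32, i32)) -> Option<i32> {
    // `(x, 0)` is refutable, so it is allowed in `if let` but not in `let`.
    if let (x, 0) = pair {
        Some(x)
    } else {
        None
    }
}

fn main() {
    println!("{}", drain_sum(vec![1, 2, 3]));
    println!("{:?}", first_if_second_is_zero((7, 0)));
    println!("{:?}", first_if_second_is_zero((7, 1)));
}
```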


With functions, this is a bit different from functional languages like Elixir, Erlang, or Haskell. In those languages, you can write multiple function clauses, each with its own pattern, and the first clause whose pattern matches is the one that executes. You can think of this like match expressions but for functions! Rust, unfortunately, doesn't have this, but it's still fine to take the full value from the argument and make the entire function body a match. So this in Elixir:

def f(1) do
  # first case.
end

def f(2) do
  # second case.
end

def f(x) do
  # final, irrefutable case.
end

could be expressed in Rust as:

// i32 is assumed here; the Elixir version is untyped.
pub fn f(x: i32) {
    match x {
        1 => println!("first case."),
        2 => println!("second case."),
        x => println!("final, irrefutable case: {}", x),
    }
}

Isn't this a bit tedious? What if you don't care about particular portions of a shape?

Ignoring particular values is easy with the _ pattern, which matches anything without binding it. We can also prefix a variable name with _ if we want to keep the name for documentation; the value is still bound, but the compiler no longer warns when it goes unused. The bare _ is formally known as a wildcard, but informally known as the "don't care" variable. The equivalent for structs is .. where we can specify only the fields we care about and ignore the rest. These two dots, a bit like an ellipsis, must come last in the struct pattern.

struct S {
    field: i32,
    property: (i32, i64),
}

pub fn main() {
    let s = S {
        field: 42,
        property: (12, 13),
    };
    match s {
        S { property: (12, _), .. } => println!("{}", 12),
        S { field, .. } => println!("{}", field),
        // or `S { field, property: _ } => println!("{}", field),`
    };
}
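The .. syntax is not limited to structs; it can also skip elements in tuple and slice patterns. A short sketch, with first_and_last and second as hypothetical helpers, which also shows the difference between _ and an underscore-prefixed name:

```rust
// Hypothetical helper: `..` skips the middle of a tuple.
fn first_and_last(t: (i32, i32, i32, i32)) -> (i32, i32) {
    let (first, .., last) = t;
    (first, last)
}

// Hypothetical helper: `_` skips one element, `..` skips the rest of the slice.
fn second(items: &[i32]) -> Option<i32> {
    match items {
        [_, x, ..] => Some(*x),
        _ => None,
    }
}

fn main() {
    // `_unused` is a real binding that merely silences the unused-variable
    // warning; `_` on its own binds nothing at all.
    let (_unused, kept) = (1, 2);
    println!("{}", kept);
    println!("{:?}", first_and_last((1, 2, 3, 4)));
    println!("{:?}", second(&[10, 20, 30]));
}
```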


What if I want to describe some nested shape, but match on the whole thing?

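To do this, you can write a name followed by @ in front of the pattern, known informally as the "as-pattern". The name is bound to the whole value that the pattern matched. As of this writing, binding both the whole pattern plus parts of the pattern isn't allowed on stable Rust (see the quirks later).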

#[derive(Debug)]
struct S {
    field: i32,
}

pub fn main() {
    let s = S { field: 42 };
    match s {
        S { field: x @ 10..=100, } => println!("{:?}", x),
        S { field } => println!("{}", field),
    };
}
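As-patterns are handy whenever a range needs to be tested and the matched number is still needed afterwards. A minimal sketch, where bucket and its age brackets are made up for illustration:

```rust
// Hypothetical helper: `n @ range` tests the range and keeps the value.
fn bucket(age: u32) -> String {
    match age {
        n @ 0..=12 => format!("child ({})", n),
        n @ 13..=19 => format!("teen ({})", n),
        // A plain variable is irrefutable and catches everything else.
        n => format!("adult ({})", n),
    }
}

fn main() {
    println!("{}", bucket(7));
    println!("{}", bucket(15));
    println!("{}", bucket(40));
}
```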


What if you don't want to specify literals or bind to variables?

If you want to do more complicated checking on bound variables, you can use a match guard. A guard is introduced with an if after the pattern but before the fat arrow =>, and the guard expression must evaluate to a boolean, as would be the case for other conditionals. You can't use guards on let and function argument patterns.

#[derive(Debug)]
struct S {
    field: i32,
}

pub fn main() {
    let s = S { field: 42 };
    match s {
        S { field } if field % 2 == 0 => println!("only executes when field is even"),
        _ => println!("all remaining values go here"),
    };
}
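Guards can be stacked across several arms to build up a classification. In this sketch, classify is a hypothetical helper; note that the compiler does not reason about guard conditions when checking exhaustiveness, so the final unguarded arm is required:

```rust
// Hypothetical helper that classifies a number using guards.
fn classify(n: i32) -> &'static str {
    match n {
        0 => "zero",
        x if x < 0 => "negative",
        x if x % 2 == 0 => "positive and even",
        // Guards are ignored by the exhaustiveness check, so a final
        // unguarded arm is needed even though only odd positives reach it.
        _ => "positive and odd",
    }
}

fn main() {
    for n in [0, -3, 4, 7] {
        println!("{} is {}", n, classify(n));
    }
}
```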


What about cases where you might want to combine several possible patterns into one match arm?

You can combine patterns using what is known as an or-pattern by using a | to try several patterns in a row. This way you can compress several patterns into one match arm.

enum Enum {
    A,
    B,
}

pub fn main() {
    let x = Enum::A;
    match x {
        Enum::A | Enum::B => println!("matches"),
    };

    // or possibly in an if/while-let pattern match.
    let x = Some(12);
    if let Some(13) | Some(12) = x {
        println!("works");
    }
}
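Or-patterns can also bind variables, as long as every alternative binds the same names with the same types. A minimal sketch, using a hypothetical Event enum and position helper:

```rust
// Hypothetical event type for illustration.
enum Event {
    Click(i32),
    Tap(i32),
    Quit,
}

// Both alternatives bind `x` as an i32, so they can share one arm.
fn position(e: Event) -> Option<i32> {
    match e {
        Event::Click(x) | Event::Tap(x) => Some(x),
        Event::Quit => None,
    }
}

fn main() {
    println!("{:?}", position(Event::Click(3)));
    println!("{:?}", position(Event::Tap(5)));
    println!("{:?}", position(Event::Quit));
}
```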


What if I want to check a pattern but I don't want all of the machinery of a match statement?

The matches! macro lets us write a test to see if a supplied pattern will match a given value, and it evaluates to a boolean. Bindings made inside the macro can't be used outside of it, but you can extend a pattern using guards, which is another handy use I've found for it (see the quirks later for more details on a precise application).

pub fn main() {
    assert_eq!(matches!(12, std::i32::MIN..=100), true);
    assert_eq!(matches!(None, Some(42)), false);
}
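The guard form of matches! looks like this. The helper is_large_some and the threshold of 10 are hypothetical, chosen only to show the syntax:

```rust
// Hypothetical helper: true only for `Some` values greater than 10.
fn is_large_some(value: Option<i32>) -> bool {
    // `n` is bound by the pattern and visible in the guard, but it does
    // not escape the macro call.
    matches!(value, Some(n) if n > 10)
}

fn main() {
    println!("{}", is_large_some(Some(42)));
    println!("{}", is_large_some(Some(3)));
    println!("{}", is_large_some(None));
}
```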


Conclusion

And now you know the bulk of what there is to pattern matching in Rust! As a recap, patterns are tested on a value, and if they line up, they will execute some branch of logic or bind some values to identifiers, or both! You can check complicated logic with guards, ignore portions of patterns with wildcards, bind whole matches with as-patterns, combine patterns with or-patterns, and test for pattern matches with the matches! macro. You can also use patterns in a number of places outside of match and the relevant control flow expressions, such as in let bindings and function arguments.

Quirks

These quirks are more around ergonomic uses of patterns rather than any dealbreakers for writing production-grade code. You can happily skip this section if you are still processing the information from above.

First up, or-patterns that are nested or that appear in other locations, such as function arguments, were unstable at the time of writing and required the #![feature(or_patterns)] attribute; they have been stable since Rust 1.53. On older compilers, another way around the nested or-patterns is to use the matches! macro in a guard:

#[derive(Debug)]
struct Container(Possibly);

#[derive(Debug)]
enum Possibly {
    A,
    B,
}

fn main() {
    let container = Container(Possibly::A);
    match container {
        // Container(Possibly::A | Possibly::B) => // won't work
        Container(inner) if matches!(inner, Possibly::A | Possibly::B) => {
            dbg!(inner);
        }
        _ => {
            dbg!("won't happen unless Possibly changes");
        }
    };
}
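For comparison, here is a sketch of the nested form, assuming a toolchain of Rust 1.53 or newer, where nested or-patterns are stable. The helper is_a_or_b and the extra variant C are hypothetical additions that give the second arm something to match:

```rust
struct Container(Possibly);

enum Possibly {
    A,
    B,
    C, // hypothetical extra variant, not in the example above
}

// The or-pattern sits inside the tuple-struct pattern, so no guard is needed.
fn is_a_or_b(container: &Container) -> bool {
    match container {
        Container(Possibly::A | Possibly::B) => true,
        Container(Possibly::C) => false,
    }
}

fn main() {
    println!("{}", is_a_or_b(&Container(Possibly::A)));
    println!("{}", is_a_or_b(&Container(Possibly::B)));
    println!("{}", is_a_or_b(&Container(Possibly::C)));
}
```

Unlike the guard version, this form lets the compiler check exhaustiveness across the inner variants, so no catch-all arm is required.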


At the time of writing, exclusive range patterns were unstable and had to be enabled with #![feature(exclusive_range_pattern)] (they were stabilized later, in Rust 1.80). Without them, you can only express inclusive ranges:

fn main() {
    let std::i32::MIN..=std::i32::MAX = 12; // works.
    //let std::i32::MIN..std::i32::MAX = 12; // refuses to compile.
}

And lastly, bindings after @ weren't supported at the time of writing unless you turned them on with #![feature(bindings_after_at)]; the feature has since been stabilized in Rust 1.56, where the attribute is no longer needed. This is a bit tricky anyway given ownership and borrowing semantics and how that plays into binding both the top-level value and the values inside of it.

#![feature(bindings_after_at)]

#[derive(Debug)]
struct S {
    field: (i32, i32),
}

fn main() {
    let x = S { field: (1, 2) };
    match x {
        S {
            field: tuple @ (ref a, ref b),
        } => println!("{:?}, {} + {} = {}", tuple, a, b, a + b),
    }
}
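On a stable toolchain of Rust 1.56 or newer, the same idea can be sketched without the feature attribute. This version assumes the inner values are Copy, which sidesteps the ownership questions; sum_with_tuple is a hypothetical helper:

```rust
struct S {
    field: (i32, i32),
}

// Binds the whole tuple and its two parts in one irrefutable pattern.
// Everything here is Copy, so no `ref` is needed.
fn sum_with_tuple(s: S) -> ((i32, i32), i32) {
    let S { field: tuple @ (a, b) } = s;
    (tuple, a + b)
}

fn main() {
    let (tuple, sum) = sum_with_tuple(S { field: (1, 2) });
    println!("{:?} sums to {}", tuple, sum);
}
```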
