Rust - Basic - 14 - Smart Pointers

January 3, 2022

Smart Pointers

Comprehension

A type that implements the Deref and Drop traits can act as a smart pointer.

Box

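Three situations where Box is useful:

When a type's size cannot be known at compile time, but a value of that type is needed in a context that requires an exact size
When the data is large and you want to transfer ownership without paying for a copy
When you want to own a value and care only that it implements a particular trait, not what its concrete type is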
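The third situation (a trait object) can be sketched as follows. The Shape trait and the Square and Rect types are hypothetical names introduced only for this illustration; they do not come from the original notes.

```rust
// Hypothetical trait and types, introduced only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Sums the areas of shapes whose concrete types differ.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // Every element is a Box of the same fixed size, so one Vec can own
    // values of different concrete types.
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total area = {}", total_area(&shapes));
}
```

The Vec owns each shape through its Box and only requires that the value implements Shape.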

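The figure below sketches how the Rust compiler tries to compute the memory size of a recursive type. The computation never terminates (a List contains a List, which contains a List, and so on), so the code fails to compile.

// The following code fails to compile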
enum List {
    Cons(i32, List),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Cons(2, Cons(3, Nil)));
}

Using Box here lets the compiler work out how much memory the type needs, because a Box is a pointer of known size.

enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
}
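As a rough check of this claim, the sketch below prints the sizes involved using std::mem::size_of. The exact numbers depend on the target platform, so none are stated here.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn main() {
    // For a sized T, Box<T> is a single pointer, whatever T contains.
    println!("size_of::<Box<List>>() = {} bytes", size_of::<Box<List>>());
    // Cons holds an i32 plus one pointer, so List has a finite size.
    println!("size_of::<List>()      = {} bytes", size_of::<List>());
}
```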

Deref & Drop

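When a reference to a value is passed as a function argument and its type does not match the parameter type, the Rust compiler attempts deref coercion (that is, it inserts calls to the Deref trait's deref method).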

Deref coercion happens automatically when we pass a reference to a particular type’s value as an argument to a function or method that doesn’t match the parameter type in the function or method definition.
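A minimal sketch of deref coercion, modeled on the example in the Rust book. MyBox and greeting are illustrative names and are not part of the standard library.

```rust
use std::ops::Deref;

// MyBox is a hypothetical wrapper type used only for illustration.
struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

// Expects a &str, not a &MyBox<String>.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let m = MyBox(String::from("Rust"));
    // The compiler coerces &MyBox<String> -> &String -> &str.
    println!("{}", greeting(&m));
    // The same call written out by hand, without relying on coercion.
    println!("{}", greeting(&(*m)[..]));
}
```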

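The drop method of the Drop trait is called when a value goes out of scope. Variables are dropped in reverse order of declaration (first in, last out).

The drop method cannot be called explicitly.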

struct CustomSmartPointer {
    data: String,
}

impl Drop for CustomSmartPointer {
    fn drop(&mut self) {
        println!("Dropping CustomSmartPointer with data `{}`!", self.data);
    }
}
// d is dropped first, then c
fn main() {
    let c = CustomSmartPointer {
        data: String::from("my stuff"),
    };
    let d = CustomSmartPointer {
        data: String::from("other stuff"),
    };
    println!("CustomSmartPointers created.");
}

// drop cannot be called explicitly; the following code fails to compile
fn main() {
    let c = CustomSmartPointer {
        data: String::from("some data"),
    };
    println!("CustomSmartPointer created.");
    c.drop();
    println!("CustomSmartPointer dropped before the end of main.");
}
// Use std::mem::drop (available in the prelude) to drop a value early
fn main() {
    let c = CustomSmartPointer {
        data: String::from("some data"),
    };
    println!("CustomSmartPointer created.");
    drop(c);
    println!("CustomSmartPointer dropped before the end of main.");
}

Rc

Reference counting; it can be used to implement graphs.

// Basic usage
enum List {
    Cons(i32, Rc<List>),
    Nil,
}

use crate::List::{Cons, Nil};
use std::rc::Rc;

fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
    let b = Cons(3, Rc::clone(&a));  // Rc::clone only increments the reference count; the data is not deep-copied
    let c = Cons(4, Rc::clone(&a));
}

// Use strong_count to read the current number of strong references to the value
fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
    println!("count after creating a = {}", Rc::strong_count(&a));
    let b = Cons(3, Rc::clone(&a));
    println!("count after creating b = {}", Rc::strong_count(&a));
    {
        let c = Cons(4, Rc::clone(&a));
        println!("count after creating c = {}", Rc::strong_count(&a));
    }
    println!("count after c goes out of scope = {}", Rc::strong_count(&a));
}

RefCell

Unlike Box and Rc, RefCell checks the borrowing rules (immutable or mutable borrows) at runtime rather than at compile time.

If a check fails, the program panics.
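The runtime check can be observed without crashing by using try_borrow_mut, which returns an Err where borrow_mut would panic. This is a minimal sketch; the helper name conflicts_while_borrowed is hypothetical.

```rust
use std::cell::RefCell;

// Hypothetical helper: returns true if a second mutable borrow is
// rejected while a first mutable borrow is still alive.
fn conflicts_while_borrowed(cell: &RefCell<i32>) -> bool {
    let _first = cell.borrow_mut();
    // borrow_mut() here would panic; try_borrow_mut() reports Err instead.
    let conflict = cell.try_borrow_mut().is_err();
    conflict
}

fn main() {
    let cell = RefCell::new(5);
    println!("second borrow rejected: {}", conflicts_while_borrowed(&cell));

    // The first borrow was released when the helper returned,
    // so borrowing mutably works again.
    *cell.borrow_mut() += 1;
    println!("value = {}", cell.borrow());
}
```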

Guidelines for choosing among Box<T>, Rc<T>, and RefCell<T>:
Rc<T> enables multiple owners of the same data; Box<T> and RefCell<T> have single owners.
Box<T> allows immutable or mutable borrows checked at compile time; Rc<T> allows only immutable borrows checked at compile time; RefCell<T> allows immutable or mutable borrows checked at runtime.
Because RefCell<T> allows mutable borrows checked at runtime, you can mutate the value inside the RefCell<T> even when the RefCell<T> is immutable.
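The last point (interior mutability) might look like this in practice. Logger is a hypothetical type made up for this sketch: its log method takes &self, yet it can still append to the inner Vec.

```rust
use std::cell::RefCell;

// Hypothetical type used only to illustrate interior mutability.
struct Logger {
    messages: RefCell<Vec<String>>,
}

impl Logger {
    fn new() -> Self {
        Logger {
            messages: RefCell::new(Vec::new()),
        }
    }

    // Takes &self, not &mut self; the mutable borrow is checked at runtime.
    fn log(&self, msg: &str) {
        self.messages.borrow_mut().push(msg.to_string());
    }

    fn count(&self) -> usize {
        self.messages.borrow().len()
    }
}

fn main() {
    let logger = Logger::new(); // note: not declared `mut`
    logger.log("first");
    logger.log("second");
    println!("{} messages logged", logger.count());
}
```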

// Basic usage, combined with Rc
#[derive(Debug)]
enum List {
    Cons(Rc<RefCell<i32>>, Rc<List>),
    Nil,
}

use crate::List::{Cons, Nil};
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    let value = Rc::new(RefCell::new(5));

    let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil)));

    let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a));
    let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a));

    *value.borrow_mut() += 10;

    println!("a after = {:?}", a);
    println!("b after = {:?}", b);
    println!("c after = {:?}", c);
}

Weak

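Strong references (Rc<T>) can form reference cycles, which in turn leak memory.

Rc::downgrade creates a weak reference, Weak<T>, and increments weak_count by 1.

When strong_count reaches 0, the value is dropped even if weak_count is not 0.

To access the referenced data, a weak reference must call upgrade, which returns an Option<Rc<T>> (None if the value has already been dropped).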
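A minimal sketch of downgrade and upgrade in isolation. The helper read is a hypothetical name; it clones the data out of the weak reference if the value is still alive.

```rust
use std::rc::{Rc, Weak};

// Hypothetical helper: reads the data behind a weak reference, if any.
fn read(weak: &Weak<String>) -> Option<String> {
    // upgrade() yields Some(Rc<String>) only while a strong reference exists.
    weak.upgrade().map(|rc| (*rc).clone())
}

fn main() {
    let strong = Rc::new(String::from("data"));
    let weak = Rc::downgrade(&strong);

    println!(
        "strong = {}, weak = {}",
        Rc::strong_count(&strong),
        Rc::weak_count(&strong),
    );
    println!("before drop: {:?}", read(&weak));

    // Dropping the only strong reference drops the value,
    // even though a weak reference still exists.
    drop(strong);
    println!("after drop: {:?}", read(&weak));
}
```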

// This example shows how weak and strong references differ in use
use std::cell::RefCell;
use std::rc::{Rc, Weak};

#[derive(Debug)]
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn main() {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });

    println!(
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );

    {
        let branch = Rc::new(Node {
            value: 5,
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![Rc::clone(&leaf)]),
        });

        *leaf.parent.borrow_mut() = Rc::downgrade(&branch);

        println!(
            "branch strong = {}, weak = {}",
            Rc::strong_count(&branch),
            Rc::weak_count(&branch),
        );

        println!(
            "leaf strong = {}, weak = {}",
            Rc::strong_count(&leaf),
            Rc::weak_count(&leaf),
        );
    }

    println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
    println!(
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );
}

The program above prints:

leaf strong = 1, weak = 0
branch strong = 1, weak = 1
leaf strong = 2, weak = 0
leaf parent = None
leaf strong = 1, weak = 0

Origin

https://doc.rust-lang.org/book/ch15-00-smart-pointers.html

…

Using `Box<T>` to Point to Data on the Heap

The most straightforward smart pointer is a box, whose type is written Box<T>. Boxes allow you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data. Refer to Chapter 4 to review the difference between the stack and the heap.

Boxes don’t have performance overhead, other than storing their data on the heap instead of on the stack. But they don’t have many extra capabilities either. You’ll use them most often in these situations:

When you have a type whose size can’t be known at compile time and you want to use a value of that type in a context that requires an exact size
When you have a large amount of data and you want to transfer ownership but ensure the data won’t be copied when you do so
When you want to own a value and you care only that it’s a type that implements a particular trait rather than being of a specific type

We’ll demonstrate the first situation in the “Enabling Recursive Types with Boxes” section. In the second case, transferring ownership of a large amount of data can take a long time because the data is copied around on the stack. To improve performance in this situation, we can store the large amount of data on the heap in a box. Then, only the small amount of pointer data is copied around on the stack, while the data it references stays in one place on the heap. The third case is known as a trait object, and Chapter 17 devotes an entire section, “Using Trait Objects That Allow for Values of Different Types,” just to that topic. So what you learn here you’ll apply again in Chapter 17!
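The second case can be sketched as follows: moving a Box moves only the pointer, so the heap data keeps its address. The function take_and_return is a hypothetical helper, and the buffer size is arbitrary.

```rust
// Hypothetical helper: takes ownership of the box and hands it back.
// Only the pointer (plus the slice length) is moved, not the heap data.
fn take_and_return(data: Box<[u8]>) -> Box<[u8]> {
    data
}

fn main() {
    // Allocate about 1 MB directly on the heap.
    let big: Box<[u8]> = vec![0u8; 1_000_000].into_boxed_slice();
    let addr_before = big.as_ptr();

    // Ownership moves into the function and back out again.
    let big = take_and_return(big);
    let addr_after = big.as_ptr();

    println!("heap data stayed in place: {}", addr_before == addr_after);
}
```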