
Threads & Shared State

Rust's concurrency story is built on one guarantee: in safe Rust, data races are compile-time errors. The ownership system, combined with the Send and Sync traits, ensures that safe code that compiles is free from data races. This does not prevent every concurrency bug (deadlocks and higher-level race conditions are still possible), but it eliminates one of the most common and dangerous classes.

Spawning Threads

std::thread::spawn creates an OS thread. It takes a closure and returns a JoinHandle:

use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..=5 {
            println!("spawned thread: {}", i);
            thread::sleep(Duration::from_millis(100));
        }
        42 // return value
    });

    for i in 1..=3 {
        println!("main thread: {}", i);
        thread::sleep(Duration::from_millis(150));
    }

    let result = handle.join().unwrap();
    println!("Thread returned: {}", result);
}

One possible output (the exact interleaving depends on the scheduler):

main thread: 1
spawned thread: 1
spawned thread: 2
main thread: 2
spawned thread: 3
main thread: 3
spawned thread: 4
spawned thread: 5
Thread returned: 42

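join() blocks until the thread finishes and returns a Result: Ok with the closure's return value, or Err if the thread panicked. Join every thread whose work must complete: when main returns, the process exits and any thread still running is terminated silently.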
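
Because join() reports a panic as an Err instead of propagating it, the caller can decide how to react. The following is a minimal sketch of that; run_and_report is a hypothetical helper name, and it assumes the panic was raised with a string literal, in which case the payload can be downcast to &str.

```rust
use std::thread;

// Hypothetical helper: runs a small job on a new thread and
// turns a panic in that thread into an Err(String).
fn run_and_report(should_panic: bool) -> Result<i32, String> {
    let handle = thread::spawn(move || {
        if should_panic {
            panic!("worker failed");
        }
        42
    });

    // join() yields Err(payload) if the thread panicked.
    handle.join().map_err(|payload| {
        // Assumption: the payload is a &'static str, which is what
        // panic!("literal") produces. Other payload types fall back
        // to a generic message.
        payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .unwrap_or_else(|| "unknown panic".to_string())
    })
}

fn main() {
    println!("{:?}", run_and_report(false)); // prints Ok(42)
    println!("{:?}", run_and_report(true)); // prints Err("worker failed")
}
```

The panicking thread still writes its usual panic message to stderr; only the main thread's control flow is protected.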

Moving Data into Threads

Closures passed to thread::spawn must be 'static: they cannot borrow local variables from the calling scope, because the new thread may outlive that scope. Use move to transfer ownership into the closure:

use std::thread;

fn main() {
    let data = vec![1, 2, 3, 4, 5];

    let handle = thread::spawn(move || {
        let sum: i32 = data.iter().sum();
        println!("Sum: {}", sum);
    });

    // data is no longer accessible here — it moved into the thread
    handle.join().unwrap();
}

Output:

Sum: 15

If you need the data in both places, clone it before the move. If you need shared mutable access, use Arc<Mutex<T>>.
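
As a small illustration of the "clone before the move" advice, the sketch below keeps the original vector usable in main while a copy is summed on another thread. sum_on_thread is a hypothetical helper name introduced only for this example.

```rust
use std::thread;

// Hypothetical helper: sums a copy of `data` on another thread.
fn sum_on_thread(data: &[i32]) -> i32 {
    // Copy first: the thread takes ownership of `copy`,
    // while the caller keeps its original data.
    let copy = data.to_vec();
    let handle = thread::spawn(move || copy.iter().sum::<i32>());
    handle.join().unwrap()
}

fn main() {
    let data = vec![1, 2, 3, 4, 5];
    let sum = sum_on_thread(&data);

    // `data` was never moved, so it is still accessible here.
    println!("Sum of {:?} is {}", data, sum); // prints Sum of [1, 2, 3, 4, 5] is 15
}
```

Cloning costs a full copy of the data, so it suits small values; for large read-only data, sharing through an Arc avoids the copy.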

Arc<Mutex<T>> for Shared Mutable State

Arc (an atomically reference-counted pointer) provides shared ownership across threads. Mutex provides interior mutability guarded by a lock. Together they give you thread-safe shared mutable state:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();
            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final count: {}", *counter.lock().unwrap());
}

Output:

Final count: 10

The type system enforces correctness: the only way to reach the inner value is through the guard returned by lock(), and the lock is released automatically when that guard is dropped. Code that tries to skip the lock does not compile.
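
Since the lock is held for as long as the guard is alive, it can help to end the guard's life explicitly. The sketch below shows one way to do that with drop; push_and_len is a hypothetical helper, not part of any library.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: appends `value` and returns the new length.
fn push_and_len(shared: &Mutex<Vec<u32>>, value: u32) -> usize {
    let mut items = shared.lock().unwrap();
    items.push(value);
    let len = items.len();

    // Release the lock now. Without this call the guard would live,
    // and the lock would stay held, until the end of the function.
    drop(items);

    // Slow work (logging, I/O, ...) could go here without blocking
    // other threads that want the lock.
    len
}

fn main() {
    let shared = Arc::new(Mutex::new(Vec::<u32>::new()));

    let handles: Vec<_> = (0..4u32)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || push_and_len(&shared, i))
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("stored {} items", shared.lock().unwrap().len()); // prints stored 4 items
}
```

Wrapping the locked section in its own block has the same effect, because the guard is dropped at the closing brace.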

Why the compiler prevents data races

The Send and Sync marker traits are the mechanism:

  • Send: a type can be transferred to another thread
  • Sync: a type can be referenced from multiple threads simultaneously

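Rc<T> is neither Send nor Sync because its reference count is not updated atomically, so moving one into another thread is a compile error. Arc<T> uses atomic reference counting and is Send and Sync whenever T is both Send and Sync. The compiler derives these traits automatically for types built from Send and Sync parts and checks them at every thread boundary.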

use std::rc::Rc;
use std::thread;

fn main() {
    let data = Rc::new(42);
    // This will NOT compile:
    // thread::spawn(move || println!("{}", data));
    // error: Rc<i32> cannot be sent between threads safely
}
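
One way to see which types carry these traits is to ask the compiler directly. The sketch below uses two hypothetical helper functions, assert_send and assert_sync, that do nothing at runtime; a call compiles only if the type satisfies the bound.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical helpers: each call compiles only if T meets the bound.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn main() {
    // Arc<T> is Send and Sync when T is Send + Sync.
    assert_send::<Arc<i32>>();
    assert_sync::<Arc<i32>>();

    // Mutex<T> is Send and Sync when T is Send.
    assert_send::<Mutex<Vec<String>>>();
    assert_sync::<Mutex<Vec<String>>>();

    // So the combination can be shared across threads.
    assert_send::<Arc<Mutex<i32>>>();
    assert_sync::<Arc<Mutex<i32>>>();

    // Uncommenting either line below is a compile error:
    // assert_send::<std::rc::Rc<i32>>();       // Rc is not Send
    // assert_sync::<std::cell::RefCell<i32>>(); // RefCell is not Sync

    println!("all bounds hold");
}
```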

Channels for Message Passing

std::sync::mpsc provides multi-producer, single-consumer channels. This is the "share by communicating" approach:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel();

    // Spawn multiple producers
    for id in 0..3 {
        let tx = tx.clone();
        thread::spawn(move || {
            let messages = vec![
                format!("worker {}: starting", id),
                format!("worker {}: processing", id),
                format!("worker {}: done", id),
            ];
            for msg in messages {
                tx.send(msg).unwrap();
                thread::sleep(Duration::from_millis(50));
            }
        });
    }

    // Drop the original sender so the channel closes
    // when all cloned senders are dropped
    drop(tx);

    // Receive until all senders are gone
    for received in rx {
        println!("{}", received);
    }
}

One possible output (messages from different workers can arrive in any order):

worker 0: starting
worker 1: starting
worker 2: starting
worker 0: processing
worker 1: processing
worker 2: processing
worker 0: done
worker 1: done
worker 2: done

Channels are ideal when you want to decouple producers from consumers. Iterating over rx blocks while waiting for each message and ends once every sender has been dropped and the queue is empty, which provides a natural shutdown mechanism.

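mpsc::channel is unbounded, so send never blocks. For a bounded channel with backpressure, use mpsc::sync_channel(capacity): send blocks once capacity messages are queued.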
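
A minimal sketch of a bounded channel follows. produce_and_collect is a hypothetical helper name; the producer can get at most capacity messages ahead of the consumer before its send call blocks.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: sends 0..count through a channel that buffers
// at most `capacity` messages, and returns what the receiver saw.
fn produce_and_collect(count: u32, capacity: usize) -> Vec<u32> {
    let (tx, rx) = mpsc::sync_channel(capacity);

    let producer = thread::spawn(move || {
        for i in 0..count {
            // Blocks while the buffer is full, which slows the
            // producer down to the consumer's pace.
            tx.send(i).unwrap();
        }
        // `tx` is dropped here, which closes the channel.
    });

    // Ends once the sender is dropped and the buffer is drained.
    let received: Vec<u32> = rx.iter().collect();
    producer.join().unwrap();
    received
}

fn main() {
    println!("{:?}", produce_and_collect(5, 2)); // prints [0, 1, 2, 3, 4]
}
```

With a single producer the messages arrive in the order they were sent, so the result is deterministic here.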

Rayon for Easy Parallelism

When you want to parallelize data processing without manual thread management, Rayon is the usual choice. It is a third-party crate (rayon) rather than part of the standard library, and it provides parallel iterators that look almost identical to regular iterators:

use rayon::prelude::*;

fn is_prime(n: u64) -> bool {
    if n < 2 {
        return false;
    }
    let limit = (n as f64).sqrt() as u64;
    (2..=limit).all(|i| n % i != 0)
}

fn main() {
    // Sequential
    let count_seq = (2..1_000_000u64)
        .filter(|&n| is_prime(n))
        .count();

    // Parallel: the same pipeline with .into_par_iter() added on the range
    let count_par = (2..1_000_000u64)
        .into_par_iter()
        .filter(|&n| is_prime(n))
        .count();

    assert_eq!(count_seq, count_par);
    println!("Primes under 1M: {}", count_par);
}

Output:

Primes under 1M: 78498

Rayon handles work-stealing, thread pool management, and load balancing. You add or change one method call and get parallelism. It is appropriate for CPU-bound work where tasks are independent.

A Real-World Example: Parallel File Processing

use std::sync::mpsc;
use std::thread;

struct FileResult {
    path: String,
    line_count: usize,
}

fn count_lines(path: &str) -> usize {
    // Simplified — in real code, use std::fs
    path.len() * 10 // placeholder
}

fn main() {
    let files = vec![
        "src/main.rs",
        "src/lib.rs",
        "src/config.rs",
        "src/handler.rs",
    ];

    let (tx, rx) = mpsc::channel();

    for file in files {
        let tx = tx.clone();
        let path = file.to_string();
        thread::spawn(move || {
            let count = count_lines(&path);
            tx.send(FileResult {
                path,
                line_count: count,
            })
            .unwrap();
        });
    }

    drop(tx);

    let mut total = 0;
    for result in rx {
        println!("{}: {} lines", result.path, result.line_count);
        total += result.line_count;
    }
    println!("Total: {} lines", total);
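}

One possible output (threads finish in no fixed order, so the per-file lines may appear in a different order; the total is always the same):

src/main.rs: 110 lines
src/lib.rs: 100 lines
src/config.rs: 130 lines
src/handler.rs: 140 lines
Total: 480 lines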
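
The count_lines function above is a placeholder. A version that actually reads a file with std::fs might look like the sketch below. Note the assumptions: the signature differs from the placeholder (it takes a &Path and returns io::Result<usize> so that I/O errors can be reported), and the path in main is only an example that may not exist on your machine.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

// Sketch of a real line counter. Returns an error instead of a
// number if the file cannot be opened or read.
fn count_lines(path: &Path) -> io::Result<usize> {
    let reader = BufReader::new(File::open(path)?);
    let mut count = 0;
    for line in reader.lines() {
        line?; // propagate read errors, including invalid UTF-8
        count += 1;
    }
    Ok(count)
}

fn main() {
    // Example path only; adjust to a file that exists.
    match count_lines(Path::new("src/main.rs")) {
        Ok(n) => println!("{} lines", n),
        Err(e) => eprintln!("could not read file: {}", e),
    }
}
```

In the threaded example, each worker would then send either the count or the error through the channel rather than calling unwrap on the file operation.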

Common Pitfalls

  • Deadlocks from lock ordering: if thread A locks mutex 1 then mutex 2 while thread B locks mutex 2 then mutex 1, each can end up waiting on the other forever. Always acquire locks in a consistent order.
  • Holding locks too long: lock, copy out what you need, and release. Do not hold a MutexGuard across an await point or a long computation.
  • Using Rc instead of Arc: Rc is not thread-safe. The compiler catches this, but the error message can be confusing if you do not know why.
  • Forgetting to drop the sender: if you clone a channel sender for each worker and keep the original alive, the receiver never sees the channel close and its loop never ends.
  • Spawning too many threads: OS threads have overhead (stack space, scheduling). For thousands of concurrent tasks, use async instead.
  • Ignoring JoinHandle: a dropped JoinHandle detaches the thread. It keeps running, but you lose the ability to wait for it or observe its panic.
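
The lock-ordering advice can be sketched as follows. transfer is a hypothetical helper: whichever direction the money moves, it always locks first before second, so two concurrent transfers cannot each hold one lock while waiting for the other.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: moves `amount` between two balances.
// The locks are always taken in the same order (first, then second),
// independent of the transfer direction.
fn transfer(first: &Mutex<i64>, second: &Mutex<i64>, amount: i64, first_to_second: bool) {
    let mut a = first.lock().unwrap();
    let mut b = second.lock().unwrap();
    if first_to_second {
        *a -= amount;
        *b += amount;
    } else {
        *b -= amount;
        *a += amount;
    }
    // Both guards are dropped here, releasing the locks.
}

fn main() {
    let first = Arc::new(Mutex::new(100_i64));
    let second = Arc::new(Mutex::new(100_i64));

    // Two threads transfer in opposite directions at the same time.
    let (f, s) = (Arc::clone(&first), Arc::clone(&second));
    let t1 = thread::spawn(move || transfer(&f, &s, 30, true));
    let (f, s) = (Arc::clone(&first), Arc::clone(&second));
    let t2 = thread::spawn(move || transfer(&f, &s, 10, false));

    t1.join().unwrap();
    t2.join().unwrap();

    println!("{} {}", first.lock().unwrap(), second.lock().unwrap()); // prints 80 120
}
```

Had the second thread locked second before first, the two threads could each acquire one mutex and then block forever on the other.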

Key Takeaways

  • Rust prevents data races at compile time through the Send and Sync traits. If safe code compiles, it has no data races.
  • Arc<Mutex<T>> is the standard pattern for shared mutable state across threads.
  • Channels (mpsc) decouple producers and consumers. Use them when message passing is cleaner than shared state.
  • Rayon makes data-parallel workloads trivial — swap .iter() for .par_iter().
  • Threads are for CPU-bound work and coarse parallelism. For I/O-bound concurrency with many tasks, async is the better tool.