Skip to content

Concurrency and Parallelism

Posted on:December 3, 2023 at 09:18 PM

I saw Steve Klabnik’s fantastic, and highly recommended talk on parallel, concurrent, and async programming in Rust’s Journey to Async/Await.

After watching, I decided to condense the main points to solidify my grasp of it and add pieces that might be helpful.

Table of contents

Open Table of contents

Definitions

Synchronous code

When working with synchronous code we can think of it as a simple calculation function. Let’s call this a task.

function add(first: number, second: number): number {
  return first + second;
}

It will run on a single thread, and use that thread by itself for the computation.

Similarly, we can do I/O operations in a synchronous fashion. Below are some examples of tasks in TypeScript, Rust, and Python.

import fs from "fs";

function readFileSyncExample(filePath: string): string {
  // This will block the event loop until the file is read
  // We'll talk more about the event loop later
  return fs.readFileSync(filePath, "utf8");
}

function writeFileSyncExample(filePath: string, data: string): void {
  // This will block the event loop until the file is written
  // We'll talk more about the event loop later
  fs.writeFileSync(filePath, data);
}

In this case, our CPU will read or write by issuing an I/O request to the disk. The disk, in turn, has its own hardware designed solely for processing commands from the CPU. This means, that we don’t necessarily need to wait for the I/O request to finish before we move on to the next task. The same goes for network requests, where the command from the CPU gets handled by the network interface card.

When any of these external devices are done with the request, they store the data in RAM by a feature called DMA - Direct memory access, where they can bypass the CPU and write directly to RAM.

When done, they send back control to the CPU by letting it know that the data has been loaded into RAM.

All of this is to say, that there is no need for us to let our CPU sit and idle while waiting for the I/O to finish. We could initiate an I/O task, and then move on with another task, like running the add function calculation above, before continuing with the I/O.

So let’s dive into concurrency, meaning we can do multiple things, but not at the exact same time.

Asynchronous code

We’ve looked at a simple case where it would be performant to initiate an I/O operation, hand over its responsibility, and continue with a CPU-bound task, before finishing the I/O.

Think of the code earlier added together

import fs from "fs";

function add(first: number, second: number): number {
  return first + second;
}

function readFileSyncExample(filePath: string): string {
  // This will block the event loop until the file is read
  // We'll talk more about the event loop later
  return fs.readFileSync(filePath, "utf8");
}

function runTasksSequentially() {
  // Start and wait for I/O task to finish
  const fileContents = readFileSyncExample("path/to/your/file.txt");
  console.log("File contents:", fileContents);
  // Now start the CPU-bound task
  const total = add(1, 2);
  console.log("Total:", total);
}

runTasksSequentially();
// File contents: <content>
// Total: 3

Instead of waiting for the file content to be loaded, we can use different methods to let our thread continue with the CPU work instead of waiting for the I/O.

Two common ways are to either spawn a new virtual thread or to use an event loop. There are also green threads, which often gets confused with virtual threads.

These implementations depend on their respective runtime. For example, node.js has an event loop, just like the server nginx. And golang uses virtual threads in their goroutines.

Python is an example where we have:

Don’t worry, we’ll explain all in more detail later.

Let’s get back to our example, and this time make sure we initialize the I/O and then continue with the CPU-bound calculation during the I/Os work.

import fs from "fs/promises"; // this is the async import

function add(first: number, second: number): number {
  return first + second;
}

// A Promise means something in the future. In below case, it will
// either return a `string` or throw an error.
function readFileSyncExample(filePath: string): Promise<string> {
  // We'll talk more about the event loop later
  return fs.readFileSync(filePath, "utf8");
}

// we mark the function as async
async function runTasksConcurrently() {
  // Initiate and hand over the I/O task
  const fileContentPromise = readFileSyncExample("path/to/your/file.txt");
  // Now start the CPU-bound task performCpuIntensiveTask();
  const total = add(1, 2);
  console.log("Total:", total);
  // Grab the results from the I/O task
  const fileContents = await fileContentPromise;
  console.log("File contents:", fileContents);
}

runTasksConcurrently();
// Total: 3
// File contents: <content>

This is a very simple and fast example, but imagine if the I/O is a 500ms call to an external API, and you can quickly see how much time we’d waste being blocked by the I/O.

We’ve now successfully moved from synchronous, blocking code, into asynchronous concurrent code.

Let’s talk a bit more about the different kinds of concurrency. We’ll look at the event loop and the virtual (or green) threads. There are more ways of doing this, but I will limit it to these two as they are the most prevalent.

Threading

Definitions are difficult, and it seems to me that the community has not decided on the exact definition of a virtual thread vs a green thread. (insert links)

Here, we will call a virtual thread a thread that is controlled by the runtime where the runtime can also use the underlying OS threads. So we can spawn as many M (virtual) threads as we want on any of the available OS threads (N).

Green threads are run by the runtime only, and not using any additional underlying OS threads.

Native threads are the threads the OS has access to. In other words, bound by the CPUs and cores on the hardware level.

This gives us the below (slightly modified) table from user Reg at stackoverflow

Thread TypeDescriptionJava Thread Type (M) : Native Threads(N)
Native ThreadsA wrapper for OS Threads.1:1
Green ThreadsRuns multiple “Green Threads” on a single OS Thread.M:1
Virtual ThreadsRuns multiple Virtual Threads on Multiple OS threadsM:N (M > N)

Using Pythons’s threading, we get below running on a thread, finishing the CPU add function before the I/O.

import threading

def add(first, second):
    return first + second

# Read file content in a separate thread
def read_file(file_path):
    with open(file_path, 'r') as file:
        return file.read()

# The main function that will run tasks concurrently
def run_tasks_concurrently(file_path):
    # Create a thread for the file reading
    file_thread = threading.Thread(target=read_file, args=(file_path,))
    file_thread.start()

    # Perform the CPU-bound task
    total = add(1, 2)
    print("Total:", total)

    # Wait for the file reading thread to finish
    file_thread.join()

    # Assuming the file reading function stores the result in a shared variable
    file_contents = read_file(file_path)
    print("File contents:", file_contents)

# Example file path, replace with your file path
file_path = 'path/to/your/file.txt'
run_tasks_concurrently(file_path)
# Total: 3
# File contents: <content>

There are different pros and cons to each threading style. But in essence, you can think of native threads as being very straightforward and where you can actually run multiple CPU processes at once. But they’re limited to the amount of available threads on an OS level.

Green threads, on the other hand, are limited to a single OS thread and not suitable for multiple CPU processes. But they’re fantastic when wanting to do multiple I/O operations where you can spawn huge amounts of threads.

Finally, virtual threads handle both cases. Where the runtime can handle multiple CPU processing tasks given its access to OS threads, and also spawn large amounts of its own threads for I/O. The downside is that it takes extra overhead to have its own runtime and scheduler, like with green threads. One can also run into problems with too many threads being spawned at once and additional latency when calling into the underlying OS threads.

Moving on, let’s have a look at the event loop mentioned in the TypeScript code.

Event Loop

There are other ways to work around the need to have one thread or process per I/O operation, with an event loop we can coalesce many of them in a single process.

In practice, you can think of it as each I/O task is an event that gets pushed to a queue. And each event has a handler. So the code looks synchronous again, but each operation is not blocking another.

This is because the queue is being handled by a separate OS thread “under the hood”. In node.js case, it uses libuv and you can read more about it in their docs. There are also some excellent detailed docs in the nodejs docs site about their event loop and why one should not block the event loop.

This way, we can write concurrent asynchronous code where all our I/O is non-blocking.

Parallel programming

As we’re starting to wrap up, let’s add an example of actual parallel code. Where we run the add function on two native threads using Rust’s standard library.

use std::thread;

fn add(first: i32, second: i32) -> i32 {
    first + second
}

fn main() {
    // Example: a list of tuples to add
    let number_pairs = vec![(1, 2), (3, 4), (5, 6), (7, 8)];

    // Spawn a thread for each addition operation
    let mut handles = Vec::new();
    for (a, b) in number_pairs {
        handles.push(thread::spawn(move || {
            add(a, b)
        }));
    }

    // Wait for all threads to complete and collect results
    let results: Vec<_> = handles.into_iter().map(|handle| handle.join().unwrap()).collect();

    println!("Results: {:?}", results);
}

If we tried this with a green thread as defined earlier, they would still not execute in parallel.

Wrap-up

To wrap it all up, let’s showcase an example where are running both concurrent and parallel code in one snippet. We’ll make use of native threads for the CPU-bound tasks and event loops for the IO-bound tasks. In TypeScript we use node.js worker threads and the node runtime’s event loop. In Python we use multiprocessing and asyncio. Finally, in the Rust snippet we use native threads for the CPU-bound tasks and an event-loop provided by the tokio runtime for IO.

const {
  Worker: NodeWorker,
  isMainThread,
  parentPort,
} = require("worker_threads");

// Function to perform a simulated asynchronous task
async function fetchData(seconds: number): Promise<[number, number]> {
  const startTime = new Date().getTime();
  await new Promise((resolve) => setTimeout(resolve, seconds * 1000));
  const endTime = new Date().getTime();
  return [startTime, endTime];
}

// Function to calculate the Fibonacci number
function calculateFibonacci(n: number): number {
  if (n <= 1) return n;
  return calculateFibonacci(n - 1) + calculateFibonacci(n - 2);
}

interface WorkerData {
  start: number;
  end: number;
}

// Main thread logic
if (isMainThread) {
  (async () => {
    // Asynchronous tasks
    let promises: Promise<[number, number]>[] = [];
    for (let i = 0; i < 3; i++) {
      promises.push(fetchData(3));
    }
    // Parallel tasks
    const workers = [];
    for (let i = 0; i < 3; i++) {
      const worker = new NodeWorker(__filename, {
        workerData: 30,
      });
      workers.push(worker);
      worker.on("message", (message: WorkerData) => {
        console.log(
          `Parallel: started at: ${message.start} ms, ended at: ${message.end} ms`
        );
      });
    }

    await Promise.all(
      promises.map(async (promise) => {
        const [start, end] = await promise;
        console.log(`Async:    started at: ${start} ms, ended at: ${end} ms`);
      })
    );
    await Promise.all(
      workers.map(
        (worker) => new Promise((resolve) => worker.on("exit", resolve))
      )
    );
  })();
} else {
  // Worker thread logic
  if (parentPort) {
    const startTime = new Date().getTime();
    calculateFibonacci(30);
    const endTime = new Date().getTime();
    const message: WorkerData = {
      start: startTime,
      end: endTime,
    };
    parentPort.postMessage(message);
  }
}

// Parallel: started at: 1701900169892 ms, ended at: 1701900169903 ms
// Parallel: started at: 1701900169892 ms, ended at: 1701900169903 ms
// Parallel: started at: 1701900169894 ms, ended at: 1701900169905 ms

// Async:    started at: 1701900169868 ms, ended at: 1701900172871 ms
// Async:    started at: 1701900169869 ms, ended at: 1701900172871 ms
// Async:    started at: 1701900169869 ms, ended at: 1701900172871 ms

As you can see, in all of the above examples we start and end all concurrent and parallel tasks around the same time. Proving we can do both.

I’m hoping this will all have helped categorize what is what in the world of asynchronous programming.

There are more topics if you are curious to go deeper, e.g. Cooperative Multitasking and Preemptive Multitasking. But for brevity’s sake, I’ll stop here with this condensed and general overview.

Thanks for reading and please feel free to reach out if you have questions, spot some errors, or simply have any general feedback. I would love to hear from you!