Modern C++ Multithreading Techniques

Concurrency and multithreading have become increasingly important in the world of software development as the need to leverage multiple cores and processors for improved performance and efficiency grows. With the advent of modern C++, developers now have access to a robust set of multithreading tools and techniques that can help them create more efficient and powerful applications. This article aims to delve deep into modern C++ multithreading techniques, focusing on the features introduced by the C++11, C++14, and C++17 standards. Our target audience is experienced developers who are looking to expand their knowledge of concurrent programming in C++.

1. Overview of C++ Multithreading Concepts

1.1 Threads

A thread is the smallest unit of execution within a process. Multithreading allows multiple threads to run concurrently within a single process, sharing resources such as memory and file handles. Threads within a process can run concurrently on different CPU cores, enabling efficient parallel execution of tasks.

1.2 Concurrency vs. Parallelism

Concurrency refers to a program's ability to make progress on multiple tasks over overlapping periods of time, while parallelism is the literal simultaneous execution of those tasks on separate cores. Concurrency can be achieved without parallelism: on a single-core processor, the scheduler interleaves the tasks so that each one advances in turn.

1.3 Synchronization

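Thread synchronization is crucial for preventing data races and ensuring the correctness of multithreaded programs. A data race occurs when two threads access the same memory location without synchronization and at least one of the accesses is a write; in C++ this is undefined behavior. Data races are prevented with synchronization primitives such as mutexes, locks, and condition variables.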
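As a minimal sketch of why synchronization matters, the following function increments a shared counter from several threads and protects it with a std::mutex. The function name guarded_count and its parameters are assumptions made for this illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: starts num_threads threads that each add 1 to a
// shared counter `iterations` times, then returns the final value.
long guarded_count(int num_threads, int iterations) {
    long counter = 0;
    std::mutex counter_mutex;  // protects counter

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                // Without this lock, ++counter would be a data race
                // and increments could be lost.
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Calling guarded_count(4, 10000) from main returns 40000 on every run, because each increment happens while the mutex is held.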

2. C++11: Standard Threading Library

C++11 introduced the standard threading library, which provided an extensive set of tools and utilities to manage threads, synchronization primitives, and thread-local storage.

2.1 Creating and Managing Threads

The std::thread class in C++11 allows for the creation and management of threads. To create a new thread, construct a std::thread object with a function or any other callable object; the new thread starts executing it immediately. To wait for a thread to complete, call its join() method. A std::thread that is still joinable when it is destroyed calls std::terminate, so every thread must be either joined or detached before its object goes out of scope.

#include <iostream>
#include <thread>

void my_function() {
  std::cout << "Hello from my_function!" << std::endl;
}

int main() {
  std::thread t(my_function);
  t.join();
  return 0;
}
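The std::thread constructor also accepts arguments for the callable and works with lambdas. The sketch below shows both forms; greet, run_greeting, and sum_in_lambda are hypothetical names used only for this example.

```cpp
#include <functional>
#include <string>
#include <thread>

// Hypothetical worker: writes a greeting for `name` into `out`.
void greet(const std::string& name, std::string& out) {
    out = "Hello, " + name + "!";
}

std::string run_greeting(const std::string& name) {
    std::string result;
    // Arguments are copied into the new thread by default, so std::ref
    // is needed to pass `result` by reference.
    std::thread t(greet, name, std::ref(result));
    t.join();  // after join(), the thread's write to `result` is visible
    return result;
}

int sum_in_lambda(int a, int b) {
    int sum = 0;
    // A lambda can capture what it needs instead of taking arguments.
    std::thread t([&sum, a, b] { sum = a + b; });
    t.join();
    return sum;
}
```

In this sketch, run_greeting("thread") returns "Hello, thread!" and sum_in_lambda(2, 3) returns 5. Reading the results is safe only because join() is called first.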

2.2 Synchronization Primitives

2.2.1 Mutexes

A mutex (short for “mutual exclusion”) is a synchronization primitive used to ensure that only one thread can access a shared resource at a time. In C++11, the std::mutex class provides a basic mutex implementation.

#include <iostream>
#include <mutex>
#include <thread>

std::mutex mtx;

void print_hello() {
  // lock_guard releases the mutex automatically when it goes out of scope
  std::lock_guard<std::mutex> lock(mtx);
  std::cout << "Hello, world!" << std::endl;
}

int main() {
  std::thread t1(print_hello);
  std::thread t2(print_hello);
  t1.join();
  t2.join();
  return 0;
}

2.2.2 Locks

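C++11 provides RAII lock classes that manage the ownership and locking of mutexes. std::lock_guard locks a mutex when it is constructed and unlocks it when it is destroyed. std::unique_lock does the same but is more flexible: it can be unlocked early, constructed without locking (std::defer_lock), and moved, and it is the lock type that std::condition_variable requires.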
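The difference between the two lock classes can be sketched with a small example. The Account type and the deposit, transfer, and run_transfers functions are hypothetical. The transfer function assumes the two accounts are distinct objects.

```cpp
#include <mutex>
#include <thread>

// Hypothetical account type used only for this illustration.
struct Account {
    std::mutex m;
    int balance = 0;
};

void deposit(Account& account, int amount) {
    // lock_guard is enough when one mutex is held for the whole scope.
    std::lock_guard<std::mutex> lock(account.m);
    account.balance += amount;
}

// Assumes `from` and `to` are different accounts.
void transfer(Account& from, Account& to, int amount) {
    // defer_lock builds the lock objects without locking yet.
    std::unique_lock<std::mutex> lock_from(from.m, std::defer_lock);
    std::unique_lock<std::mutex> lock_to(to.m, std::defer_lock);
    // std::lock acquires both mutexes with a deadlock-avoidance algorithm.
    std::lock(lock_from, lock_to);
    from.balance -= amount;
    to.balance += amount;
}

int run_transfers() {
    Account a, b;
    deposit(a, 100);
    deposit(b, 100);
    // The two threads lock the accounts in opposite argument order.
    std::thread t1([&] { for (int i = 0; i < 1000; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < 1000; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

Locking the two mutexes one after the other in transfer could deadlock when the threads pass the accounts in opposite order. std::lock avoids that, and the total balance stays at 200.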

2.2.3 Condition Variables

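A condition variable is a synchronization primitive used to block a thread until a particular condition is met. C++11 provides the std::condition_variable class for this purpose. A waiting thread holds a std::unique_lock on a mutex; wait() releases that mutex while the thread is blocked and reacquires it before returning. Because wakeups can be spurious, the condition should always be rechecked, which the predicate overload of wait() does automatically.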
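A common use of a condition variable is a producer/consumer queue. The following is a minimal sketch; produce_and_consume and its local names are assumptions for this example.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical helper: a producer pushes 1..item_count into a queue and a
// consumer sums everything it receives. Returns the consumer's sum.
int produce_and_consume(int item_count) {
    std::queue<int> items;
    std::mutex m;
    std::condition_variable cv;
    bool finished = false;  // guarded by m
    int sum = 0;

    std::thread consumer([&] {
        std::unique_lock<std::mutex> lock(m);
        while (true) {
            // wait() releases the lock while blocked; the predicate
            // protects against spurious wakeups.
            cv.wait(lock, [&] { return finished || !items.empty(); });
            while (!items.empty()) {
                sum += items.front();
                items.pop();
            }
            if (finished) {
                break;
            }
        }
    });

    std::thread producer([&] {
        for (int i = 1; i <= item_count; ++i) {
            {
                std::lock_guard<std::mutex> lock(m);
                items.push(i);
            }
            cv.notify_one();  // notify after releasing the lock
        }
        {
            std::lock_guard<std::mutex> lock(m);
            finished = true;
        }
        cv.notify_one();
    });

    producer.join();
    consumer.join();
    return sum;
}
```

The finished flag is modified only while the mutex is held, so the consumer cannot miss the final notification. For example, produce_and_consume(100) returns 5050.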

2.3 Thread-Local Storage

Thread-local storage (TLS) allows each thread to have its own instance of a variable, ensuring that the value of the variable is not shared between threads. C++11 introduces the thread_local keyword for defining thread-local variables.

#include <iostream>
#include <thread>

thread_local int counter = 0;

void increment_counter() {
  ++counter;
  std::cout << "Counter: " << counter << std::endl;
}

int main() {
  std::thread t1(increment_counter);
  std::thread t2(increment_counter);
  t1.join();
  t2.join();
  return 0;
}

3. C++14 and C++17: Enhancements and New Features

C++14 and C++17 brought several improvements and new features to the world of multithreading, making it even more powerful and easier to use.

3.1 C++14: Shared Locks

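C++14 introduced the std::shared_timed_mutex and std::shared_lock classes, allowing for shared ownership of a mutex. This is particularly useful when many threads read a shared resource but writes must be exclusive: any number of readers can hold a std::shared_lock at the same time, while a writer holding a std::unique_lock excludes all other readers and writers. C++17 later added std::shared_mutex, a variant without the timed locking operations.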

#include <iostream>
#include <shared_mutex>
#include <thread>

std::shared_timed_mutex mtx;
int shared_data = 0;

void reader() {
  std::shared_lock<std::shared_timed_mutex> lock(mtx);
  std::cout << "Reader: shared_data = " << shared_data << std::endl;
}

void writer() {
  std::unique_lock<std::shared_timed_mutex> lock(mtx);
  ++shared_data;
  std::cout << "Writer: shared_data = " << shared_data << std::endl;
}

int main() {
  std::thread r1(reader);
  std::thread r2(reader);
  std::thread w1(writer);
  r1.join();
  r2.join();
  w1.join();
  return 0;
}

3.2 C++17: Parallel Algorithms

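C++17 added overloads of many standard library algorithms that take an execution policy as their first argument. The algorithms operate on iterator ranges, and the policy states what the implementation is permitted to do: std::execution::seq requests sequential execution, std::execution::par permits execution across multiple threads, and std::execution::par_unseq permits both parallel and vectorized execution. A policy is a permission rather than a guarantee, so an implementation may still run the algorithm sequentially. With par and par_unseq, the caller is responsible for ensuring that the functions applied to the elements are free of data races.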

#include <algorithm>
#include <execution>
#include <vector>
#include <iostream>

int main() {
  std::vector<int> data{5, 2, 8, 1, 3, 6, 4, 7};
  std::sort(std::execution::par, data.begin(), data.end());

  for (const auto& value : data) {
    std::cout << value << " ";
  }
  std::cout << std::endl;

  return 0;
}

4. Best Practices for Modern C++ Multithreading

4.1 Avoid Global Variables

Mutable global variables are visible to every thread, so any unsynchronized access to them can become a data race. Whenever possible, pass data to threads explicitly, use local variables, or use thread-local storage instead.
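One way to apply this advice is to give each thread its own local accumulator and its own output slot, so no mutable state is shared. The sketch below assumes num_threads is greater than zero; parallel_sum is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` with num_threads workers (num_threads > 0)
// without any global or shared mutable variable.
long long parallel_sum(const std::vector<int>& data, std::size_t num_threads) {
    std::vector<long long> partial(num_threads, 0);  // one slot per thread
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            long long local = 0;  // visible to this thread only
            for (std::size_t i = begin; i < end; ++i) {
                local += data[i];
            }
            partial[t] = local;  // each thread writes only its own slot
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    // Combining happens on the calling thread after all joins.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

No mutex is needed here: the input is only read, and every thread writes to a different element of partial.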

4.2 Minimize Lock Contention

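Lock contention occurs when multiple threads compete for the same lock. To reduce it, hold each lock for as short a time as possible and do expensive work, such as I/O or long computations, outside the critical section. Avoid nested locking where you can: it lengthens the time locks are held and, if the locks are acquired in an inconsistent order, can cause deadlock.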
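The idea of keeping critical sections short can be sketched as follows. Here expensive_square stands in for real work and, like collect_squares, is an assumption made for this example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for an expensive computation (an assumption for this sketch).
int expensive_square(int x) {
    return x * x;
}

// Hypothetical helper: computes squares of 0..count-1 on separate threads
// and collects them in a shared vector (in no particular order).
std::vector<int> collect_squares(int count) {
    std::vector<int> results;
    std::mutex results_mutex;  // protects results
    std::vector<std::thread> workers;

    for (int i = 0; i < count; ++i) {
        workers.emplace_back([&, i] {
            // The expensive part runs without holding the lock.
            const int value = expensive_square(i);
            // Only the push_back is inside the critical section.
            std::lock_guard<std::mutex> lock(results_mutex);
            results.push_back(value);
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return results;
}
```

Had the lock been taken before calling expensive_square, the threads would effectively run one at a time.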

4.3 Use Fine-Grained Locks

Instead of using a single lock for an entire data structure, use multiple fine-grained locks to protect smaller parts of the structure. This allows for greater concurrency and reduces lock contention.
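As one possible illustration, the sketch below splits a counter map into eight shards, each with its own mutex, so threads updating keys in different shards do not block each other. ShardedCounter, its shard count, and demo_sharded_counter are hypothetical choices for this example.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

// Hypothetical sharded counter: a key's hash selects the shard, and each
// shard is protected by its own mutex.
class ShardedCounter {
public:
    void increment(const std::string& key) {
        Shard& shard = shard_for(key);
        std::lock_guard<std::mutex> lock(shard.m);
        ++shard.counts[key];
    }

    int get(const std::string& key) {
        Shard& shard = shard_for(key);
        std::lock_guard<std::mutex> lock(shard.m);
        auto it = shard.counts.find(key);
        return it == shard.counts.end() ? 0 : it->second;
    }

private:
    struct Shard {
        std::mutex m;
        std::unordered_map<std::string, int> counts;
    };

    Shard& shard_for(const std::string& key) {
        return shards_[std::hash<std::string>{}(key) % shards_.size()];
    }

    std::array<Shard, 8> shards_;
};

// Two threads increment the same key 1000 times each.
int demo_sharded_counter() {
    ShardedCounter counter;
    std::thread t1([&] { for (int i = 0; i < 1000; ++i) counter.increment("alpha"); });
    std::thread t2([&] { for (int i = 0; i < 1000; ++i) counter.increment("alpha"); });
    t1.join();
    t2.join();
    return counter.get("alpha");
}
```

The trade-off is that operations spanning several shards, such as summing all counts, would need to lock more than one mutex.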

4.4 Use Lock-Free Data Structures

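Lock-free data structures can provide better performance and scalability in certain scenarios, particularly under heavy contention. The standard library does not ship lock-free containers; it provides std::atomic (since C++11) as the building block, and ready-made structures are available from third-party libraries such as Boost.Lockfree. Lock-free code is difficult to get right, so prefer a well-tested implementation and measure before replacing a mutex.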
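The basic building blocks can be sketched with std::atomic: an atomic increment and a compare-and-swap loop. The names atomic_count and update_max are assumptions for this example. Whether std::atomic<int> is actually lock-free depends on the platform and can be checked with is_lock_free().

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: counts increments from several threads with no mutex.
int atomic_count(int num_threads, int iterations) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                counter.fetch_add(1);  // atomic read-modify-write
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}

// Hypothetical helper: raises current_max to `value` if `value` is larger,
// using a compare-and-swap loop.
void update_max(std::atomic<int>& current_max, int value) {
    int observed = current_max.load();
    // On failure, compare_exchange_weak stores the current value in
    // `observed`, so the loop condition is rechecked against fresh data.
    while (observed < value &&
           !current_max.compare_exchange_weak(observed, value)) {
    }
}
```

The compare-and-swap loop in update_max is the pattern most lock-free structures are built on: read the current state, compute the new state, and retry if another thread changed it in between.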

Example Exercise: Multithreaded Web Crawler

Objective: Create a simple multithreaded web crawler that downloads and extracts URLs from web pages using modern C++ multithreading techniques.

Task Description: Implement a web crawler that starts with a seed URL, fetches the content of the page, extracts all the URLs present in the page, and continues crawling the extracted URLs. The crawler should use multiple threads to download and process web pages concurrently. Additionally, the crawler should store visited URLs to avoid revisiting the same pages.

To achieve this, we’ll use the following libraries:

libcurl for fetching web pages.
htmlcxx for parsing HTML and extracting URLs.
C++ standard threading library for multithreading.

Source code:

#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>
#include <queue>
#include <mutex>
#include <thread>
#include <condition_variable>
#include <curl/curl.h>
#include <htmlcxx/html/ParserDom.h>

// Global variables
std::unordered_set<std::string> visited_urls;
std::queue<std::string> urls_to_visit;
std::mutex mtx;
std::condition_variable cv;

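// CURL write callback to store fetched content
size_t write_callback(void* contents, size_t size, size_t nmemb, void* userp) {
    const size_t total = size * nmemb;
    static_cast<std::string*>(userp)->append(static_cast<char*>(contents), total);
    return total;
}

// Fetch the content of the given URL; returns an empty string on failure
std::string fetch_url(const std::string& url) {
    std::string content;
    CURL* curl = curl_easy_init();

    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &content);

        // Check the result instead of silently returning partial content
        CURLcode result = curl_easy_perform(curl);
        if (result != CURLE_OK) {
            std::cerr << "Failed to fetch " << url << ": "
                      << curl_easy_strerror(result) << std::endl;
            content.clear();
        }
        curl_easy_cleanup(curl);
    }

    return content;
}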

// Extract URLs from the content of a web page
std::vector<std::string> extract_urls(const std::string& content) {
    htmlcxx::HTML::ParserDom parser;
    tree<htmlcxx::HTML::Node> dom = parser.parseTree(content);

    std::vector<std::string> urls;

    for (auto it = dom.begin(); it != dom.end(); ++it) {
        if (it->tagName() == "a") {
            it->parseAttributes();
            std::string url = it->attribute("href").second;
            if (!url.empty()) {
                urls.push_back(url);
            }
        }
    }

    return urls;
}

// Additional shared state, guarded by mtx like the containers above
const size_t max_visited_urls = 100;  // stop after claiming this many URLs
size_t active_workers = 0;            // workers currently fetching a page
bool done = false;                    // set once the crawl should stop

// Crawl function that runs in multiple threads
void crawl() {
    while (true) {
        std::string url;
        bool claimed = false;
        {
            std::unique_lock<std::mutex> lock(mtx);
            cv.wait(lock, []{ return done || !urls_to_visit.empty(); });
            if (done) {
                return;
            }

            url = urls_to_visit.front();
            urls_to_visit.pop();

            // Mark the URL as visited while still holding the lock, so no
            // other thread fetches the same page.
            claimed = visited_urls.size() < max_visited_urls &&
                      visited_urls.insert(url).second;
            if (claimed) {
                ++active_workers;
            }
        }

        // Network I/O and parsing run without the lock held
        std::vector<std::string> extracted_urls;
        if (claimed) {
            extracted_urls = extract_urls(fetch_url(url));
            std::cout << "Visited URL: " << url << std::endl;
        }

        {
            std::lock_guard<std::mutex> lock(mtx);
            if (claimed) {
                --active_workers;
                for (const auto& extracted_url : extracted_urls) {
                    if (visited_urls.find(extracted_url) == visited_urls.end()) {
                        urls_to_visit.push(extracted_url);
                    }
                }
            }

            // Stop at the limit, or when no queued work remains and no
            // in-flight fetch can add more.
            if (visited_urls.size() >= max_visited_urls ||
                (urls_to_visit.empty() && active_workers == 0)) {
                done = true;
            }
        }

        // Wake every waiting worker: there may be new URLs, or done may be set
        cv.notify_all();
    }
}

int main() {
    // Initialize CURL
    curl_global_init(CURL_GLOBAL_DEFAULT);

    // Seed URL (pushed before any worker starts, so no lock is needed)
    std::string seed_url = "https://www.example.com";
    urls_to_visit.push(seed_url);

    // Number of threads for the web crawler
    const unsigned int num_threads = 4;
    std::vector<std::thread> threads;

    // Create and start the threads
    for (unsigned int i = 0; i < num_threads; ++i) {
        threads.emplace_back(crawl);
    }

    // Wait for the workers; each one returns once done is set
    for (auto& thread : threads) {
        thread.join();
    }

    // Cleanup CURL (safe now that no thread is using it)
    curl_global_cleanup();

    return 0;
}

This example demonstrates a multithreaded web crawler that fetches web pages, extracts URLs, and continues crawling the extracted URLs. The program uses the C++ standard threading library to manage threads, the libcurl library to fetch web pages, and the htmlcxx library to parse HTML and extract URLs. To keep track of visited URLs and URLs to be visited, the crawler uses a std::unordered_set and a std::queue, respectively, with proper synchronization using std::mutex and std::condition_variable.

Note that this example is for illustrative purposes and is not intended for use in production environments. A production-ready web crawler would require proper error handling, URL normalization, URL filtering, and adherence to the robots.txt standard, among other considerations.