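A multi-threaded program has multiple points of execution, one per thread. Each thread is like a separate process, except that the threads of one process share the same address space (hence "a lightweight process").
Switching between threads requires a context switch: the register state of the running thread is saved to its Thread Control Block (TCB) and the state of the next thread is restored, while the address space stays the same. A single-threaded process has one stack, at the bottom of the address space. In a multi-threaded process, each thread has its own stack, and all of these stacks live in the shared address space.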

Figure: the stacks of a process with 2 threads
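When a process starts, its initial thread (the main thread) is created. Use pthread_create() (declared in pthread.h) to create another thread. To wait for a thread to finish, call pthread_join(); it blocks the caller until the target thread has terminated. A thread itself terminates by returning from its start routine or by calling pthread_exit().
The order in which threads run after creation is unpredictable: the scheduler decides which one runs first, so a new thread may run immediately or only after its creator has continued for a while.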
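A minimal sketch of creating two threads and waiting for both, assuming a POSIX system. The names square_worker and run_two_threads are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical worker: squares the int it points to, in place. */
static void *square_worker(void *arg) {
    int *value = arg;
    *value = (*value) * (*value);
    return NULL;
}

/* Creates two threads, waits for both, and returns the sum of their results. */
int run_two_threads(int a, int b) {
    pthread_t t1, t2;
    int rc;

    /* Each thread receives a pointer to its own variable, so nothing is shared. */
    rc = pthread_create(&t1, NULL, square_worker, &a);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, square_worker, &b);
    assert(rc == 0);

    /* pthread_join() blocks until the given thread has terminated. */
    rc = pthread_join(t1, NULL);
    assert(rc == 0);
    rc = pthread_join(t2, NULL);
    assert(rc == 0);

    return a + b;
}
```

Calling run_two_threads(3, 4) from main returns 25 no matter which thread the scheduler runs first, because the two workers touch different variables. On most systems this is compiled with the -pthread flag.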
When a thread updates global data, it loads the value, modifies it, and stores it back. These are separate steps, so when two threads do this at the same time, one of them can be interrupted between its load and its store, and the result is no longer deterministic.
The situation above is called a race condition: the result depends on the timing of the threads' execution. A piece of code that accesses a shared resource and can cause a race condition when multiple threads execute it at once is called a critical section. What we want is mutual exclusion (mutex): if one thread is executing within the critical section, the others are prevented from doing so.
The hardware may provide a few instructions on which a general set of synchronization primitives can be built. Used correctly, these primitives guarantee that only a single thread is inside a critical section at a time, which avoids races and makes the program's output deterministic.
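The load, modify, store sequence can be replayed by hand. The sketch below is a single-threaded simulation, not real threading: it walks through one unlucky and one lucky interleaving of two threads that each add 1 to a shared counter. The function names and the variables reg_t1 and reg_t2 (standing in for each thread's private register) are hypothetical.

```c
/* Unlucky interleaving: both threads load before either one stores. */
int simulate_lost_update(int counter) {
    int reg_t1, reg_t2;   /* private register copy of each thread */

    reg_t1 = counter;     /* T1 loads the old value                 */
    reg_t2 = counter;     /* T1 is switched out; T2 loads it too    */
    reg_t1 = reg_t1 + 1;  /* T1 modifies its private copy           */
    reg_t2 = reg_t2 + 1;  /* T2 modifies its private copy           */
    counter = reg_t1;     /* T1 stores old + 1                      */
    counter = reg_t2;     /* T2 stores old + 1 again: update lost   */
    return counter;
}

/* Lucky interleaving: T1 finishes all three steps before T2 starts. */
int simulate_serial_update(int counter) {
    int reg_t1, reg_t2;

    reg_t1 = counter;
    reg_t1 = reg_t1 + 1;
    counter = reg_t1;     /* T1 done: old + 1 */

    reg_t2 = counter;
    reg_t2 = reg_t2 + 1;
    counter = reg_t2;     /* T2 done: old + 2 */
    return counter;
}
```

Starting from 50, the first schedule ends at 51 and the second at 52. Which one a real run gets depends on where the scheduler happens to switch threads.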
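One common way to get mutual exclusion with pthreads is a pthread_mutex_t wrapped around the critical section. The sketch below assumes two threads and an arbitrary iteration count; shared_counter, counter_lock, add_worker and count_with_mutex are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS_PER_THREAD 100000  /* arbitrary value for illustration */

static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_worker(void *arg) {
    (void)arg;  /* unused */
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter = shared_counter + 1;  /* critical section */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two workers on the same counter and returns the final value. */
long count_with_mutex(void) {
    pthread_t t1, t2;
    int rc;

    shared_counter = 0;
    rc = pthread_create(&t1, NULL, add_worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, add_worker, NULL);
    assert(rc == 0);
    rc = pthread_join(t1, NULL);
    assert(rc == 0);
    rc = pthread_join(t2, NULL);
    assert(rc == 0);

    return shared_counter;
}
```

With the lock held around the update, count_with_mutex() returns 200000 on every run. Removing the lock and unlock calls would reintroduce the race, and the final value could then differ from run to run.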
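As one illustration of building a primitive on a hardware instruction, the sketch below makes a simple spin lock from C11's atomic_flag, whose test-and-set operation is typically compiled to an atomic hardware instruction. This is only an assumed example of the idea, not necessarily the primitive these notes have in mind; spin_lock, spin_unlock, spin_worker and count_with_spin_lock are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define SPIN_INCREMENTS 100000  /* arbitrary value for illustration */

static atomic_flag spin_flag = ATOMIC_FLAG_INIT;  /* clear means unlocked */
static long spin_counter = 0;

static void spin_lock(void) {
    /* Atomically set the flag and get its old value.
       If the old value was already set, another thread holds the lock: retry. */
    while (atomic_flag_test_and_set_explicit(&spin_flag, memory_order_acquire)) {
        /* busy-wait */
    }
}

static void spin_unlock(void) {
    atomic_flag_clear_explicit(&spin_flag, memory_order_release);
}

static void *spin_worker(void *arg) {
    (void)arg;  /* unused */
    for (int i = 0; i < SPIN_INCREMENTS; i++) {
        spin_lock();
        spin_counter = spin_counter + 1;  /* critical section */
        spin_unlock();
    }
    return NULL;
}

/* Runs two workers under the spin lock and returns the final counter. */
long count_with_spin_lock(void) {
    pthread_t t1, t2;
    int rc;

    spin_counter = 0;
    rc = pthread_create(&t1, NULL, spin_worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, spin_worker, NULL);
    assert(rc == 0);
    rc = pthread_join(t1, NULL);
    assert(rc == 0);
    rc = pthread_join(t2, NULL);
    assert(rc == 0);

    return spin_counter;
}
```

A spin lock wastes CPU time while waiting, so it is shown here only because it maps directly onto a single atomic operation. Library primitives such as pthread_mutex_t usually put waiting threads to sleep instead.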