Learning Golang Synchronization Mechanism

1. The role of synchronization mechanisms

In concurrent programming, when multiple goroutines share resources, it is necessary to ensure that the resources can only be accessed by one goroutine at a time to prevent race conditions. This requires the use of synchronization mechanisms. Synchronization mechanisms can coordinate the access order of different goroutines to shared resources, ensuring data consistency and state synchronization in a concurrent environment.

Go language provides a rich set of synchronization mechanisms, including but not limited to:

Mutexes (sync.Mutex) and read-write mutexes (sync.RWMutex)
Channels
WaitGroups
Atomic functions (atomic package)
Condition variables (sync.Cond)

2. Synchronization primitives

2.1 Mutex (sync.Mutex)

2.1.1 Concept and role of mutex

Mutex is a synchronization mechanism that ensures the safe operation of shared resources by allowing only one goroutine to hold the lock to access the shared resource at any given time. Mutex achieves synchronization through the Lock and Unlock methods. Calling the Lock method will block until the lock is released, and at this point, other goroutines attempting to acquire the lock will wait. Calling Unlock releases the lock, allowing other waiting goroutines to acquire it.

var mu sync.Mutex

func criticalSection() {
    // Acquire the lock to exclusively access the resource
    mu.Lock()
    // Access the shared resource here
    // ...
    // Release the lock to allow other goroutines to acquire it
    mu.Unlock()
}

2.1.2 Practical usage of mutex

Suppose we need to maintain a global counter, and multiple goroutines need to increment its value. Using a mutex can ensure the accuracy of the counter.

var (
    mu      sync.Mutex
    counter int
)

func increment() {
    mu.Lock()         // Lock before modifying the counter
    counter++         // Safely increment the counter
    mu.Unlock()       // Unlock after the operation, allowing other goroutines to access the counter
}

func main() {
    for i := 0; i < 10; i++ {
        go increment()  // Start multiple goroutines to increment the counter value
    }
    // Wait for some time (in practice, you should use WaitGroup or other methods to wait for all goroutines to complete)
    time.Sleep(1 * time.Second)
    fmt.Println(counter)  // Output the value of the counter
}

2.2 Read-Write Mutex (sync.RWMutex)

2.2.1 Concept of read-write mutex

RWMutex is a special type of lock that allows multiple goroutines to read shared resources simultaneously, while write operations are exclusive. Compared to mutexes, read-write locks can improve the performance in multi-reader scenarios. It has four methods: RLock, RUnlock for locking and unlocking read operations, and Lock, Unlock for locking and unlocking write operations.

2.2.2 Practical Use Cases of Read-Write Mutex

In a database application, read operations may be far more frequent than write operations. Using a read-write lock can improve system performance because it allows multiple goroutines to read concurrently.

var (
    rwMu  sync.RWMutex
    data  int
)

func readData() int {
    rwMu.RLock()         // Acquire read lock, allowing other read operations to proceed concurrently
    defer rwMu.RUnlock() // Ensure that the lock is released using defer
    return data          // Safely read data
}

func writeData(newValue int) {
    rwMu.Lock()          // Acquire write lock, preventing other read or write operations at this time
    data = newValue      // Safely write the new value
    rwMu.Unlock()        // Unlock after writing is complete
}

func main() {
    go writeData(42)     // Start a goroutine to perform a write operation
    fmt.Println(readData()) // The main goroutine performs a read operation
    // Use WaitGroup or other synchronization methods to ensure that all goroutines are finished
}

In the example above, multiple readers can execute the readData function simultaneously, but a writer executing writeData will block new readers and other writers. This mechanism provides performance advantages for scenarios with more reads than writes.

2.3 Conditional Variables (`sync.Cond`)

2.3.1 Concept of Conditional Variables

In Go language's synchronization mechanism, conditional variables are used for waiting or notifying some condition changes as a synchronization primitive. Conditional variables are always used together with a mutex (sync.Mutex), which is used to protect the consistency of the condition itself.

The concept of conditional variables comes from the operating system domain, allowing a group of goroutines to wait for a certain condition to be met. More specifically, a goroutine may pause execution while waiting for a condition to be met, and another goroutine may notify other goroutines to resume execution after changing the condition using the conditional variable.

In the Go standard library, conditional variables are provided through the sync.Cond type, and its main methods include:

Wait: Calling this method will release the held lock and block until another goroutine calls Signal or Broadcast on the same conditional variable to wake it up, after which it will attempt to acquire the lock again.
Signal: Wakes up one goroutine waiting for this conditional variable. If no goroutine is waiting, calling this method will have no effect.
Broadcast: Wakes up all goroutines waiting for this conditional variable.

Conditional variables should not be copied, so they are generally used as a pointer field of a certain struct.

2.3.2 Practical Cases of Condition Variables

Here is an example using condition variables that demonstrates a simple producer-consumer model:

package main

import (
    "fmt"
    "sync"
    "time"
)

// SafeQueue is a safe queue protected by a mutex
type SafeQueue struct {
    mu    sync.Mutex
    cond  *sync.Cond
    queue []interface{}
}

// Enqueue adds an element to the end of the queue and notifies waiting goroutines
func (sq *SafeQueue) Enqueue(item interface{}) {
    sq.mu.Lock()
    defer sq.mu.Unlock()

    sq.queue = append(sq.queue, item)
    sq.cond.Signal() // Notify waiting goroutines that the queue is not empty
}

// Dequeue removes an element from the head of the queue, waits if the queue is empty
func (sq *SafeQueue) Dequeue() interface{} {
    sq.mu.Lock()
    defer sq.mu.Unlock()

    // Wait when the queue is empty
    for len(sq.queue) == 0 {
        sq.cond.Wait() // Wait for a change in condition
    }

    item := sq.queue[0]
    sq.queue = sq.queue[1:]
    return item
}

func main() {
    queue := make([]interface{}, 0)
    sq := SafeQueue{
        mu:    sync.Mutex{},
        cond:  sync.NewCond(&sync.Mutex{}),
        queue: queue,
    }

    // Producer Goroutine
    go func() {
        for i := 0; i < 5; i++ {
            time.Sleep(1 * time.Second)         // Simulate production time
            sq.Enqueue(fmt.Sprintf("item%d", i)) // Produce an element
            fmt.Println("Produce:", i)
        }
    }()

    // Consumer Goroutine
    go func() {
        for i := 0; i < 5; i++ {
            item := sq.Dequeue() // Consume an element, wait if the queue is empty
            fmt.Printf("Consume: %v\n", item)
        }
    }()

    // Wait for a sufficient amount of time to ensure all production and consumption are completed
    time.Sleep(10 * time.Second)
}

In this example, we have defined a SafeQueue structure with an internal queue and a condition variable. When the consumer calls the Dequeue method and the queue is empty, it waits using the Wait method. When the producer calls the Enqueue method to enqueue a new element, it uses the Signal method to awaken the waiting consumer.

2.4 WaitGroup

2.4.1 Concept and Usage of WaitGroup

sync.WaitGroup is a synchronization mechanism used to wait for a group of goroutines to complete. When you start a goroutine, you can increase the counter by calling the Add method, and each goroutine can call the Done method (which actually performs Add(-1)) when it's done. The main goroutine can block by calling the Wait method until the counter reaches 0, indicating that all the goroutines have completed their tasks.

When using WaitGroup, the following points should be noted:

Add, Done, and Wait methods are not thread-safe and should not be called concurrently in multiple goroutines.
The Add method should be called before the newly created goroutine starts.

2.4.2 Practical Use Cases of WaitGroup

Here's an example of using WaitGroup:

package main

import (
	"fmt"
	"sync"
	"time"
)

func worker(id int, wg *sync.WaitGroup) {
	defer wg.Done() // Notify WaitGroup upon completion

	fmt.Printf("Worker %d starting\n", id)
	time.Sleep(time.Second) // Simulate time-consuming operation
	fmt.Printf("Worker %d done\n", id)
}

func main() {
	var wg sync.WaitGroup

	for i := 1; i <= 5; i++ {
		wg.Add(1) // Increment the counter before starting goroutine
		go worker(i, &wg)
	}

	wg.Wait() // Wait for all worker goroutines to finish
	fmt.Println("All workers done")
}

In this example, the worker function simulates the execution of a task. In the main function, we start five worker goroutines. Before starting each goroutine, we call wg.Add(1) to notify the WaitGroup that a new task is being executed. When each worker function completes, it calls defer wg.Done() to notify the WaitGroup that the task is done. After starting all the goroutines, the main function blocks at wg.Wait() until all workers report completion.

2.5 Atomic Operations (`sync/atomic`)

2.5.1 Concept of Atomic Operations

Atomic operations refer to operations in concurrent programming that are indivisible, meaning they are not interrupted by other operations during execution. For multiple goroutines, using atomic operations can ensure data consistency and state synchronization without the need for locking, as atomic operations themselves guarantee atomicity of execution.

In Go language, the sync/atomic package provides low-level atomic memory operations. For basic data types such as int32, int64, uint32, uint64, uintptr, and pointer, methods from the sync/atomic package can be used for safe concurrent operations. The importance of atomic operations lies in being the cornerstone for building other concurrent primitives (such as locks and condition variables) and often being more efficient than locking mechanisms.

2.5.2 Practical Use Cases of Atomic Operations

Consider a scenario where we need to track the concurrent number of visitors to a website. Using a simple counter variable intuitively, we would increase the counter when a visitor arrives and decrease it when a visitor leaves. However, in a concurrent environment, this approach would lead to data races. Therefore, we can use the sync/atomic package to safely manipulate the counter.

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

var visitorCount int32

func incrementVisitorCount() {
	atomic.AddInt32(&visitorCount, 1)
}

func decrementVisitorCount() {
	atomic.AddInt32(&visitorCount, -1)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			incrementVisitorCount()
			time.Sleep(time.Second) // Time of visitor's visit
			decrementVisitorCount()
			wg.Done()
		}()
	}
	wg.Wait()
	fmt.Printf("Current visitor count: %d\n", visitorCount)
}

In this example, we create 100 goroutines to simulate the arrival and departure of visitors. By using the atomic.AddInt32() function, we ensure that the increments and decrements of the counter are atomic, even in highly concurrent situations, thereby ensuring the accuracy of visitorCount.

2.6 Channel Synchronization Mechanism

2.6.1 Synchronization Characteristics of Channels

Channels are a way for goroutines to communicate in the Go language at the language level. A channel provides the ability to send and receive data. When a goroutine tries to read data from a channel and the channel has no data, it will block until there is data available. Similarly, if the channel is full (for an unbuffered channel, this means it already has data), the goroutine trying to send data will also block until there is space to write. This feature makes channels very useful for synchronization between goroutines.

2.6.2 Use Cases of Synchronization with Channels

Suppose we have a task that needs to be completed by multiple goroutines, each handling a subtask, and then we need to aggregate the results of all subtasks. We can use a channel to wait for all goroutines to finish.

package main

import (
    "fmt"
    "sync"
)

func worker(id int, wg *sync.WaitGroup, resultChan chan<- int) {
    defer wg.Done()
    // Perform some operations...
    fmt.Printf("Worker %d starting\n", id)
    // Assume the subtask result is the worker's id
    resultChan <- id
    fmt.Printf("Worker %d done\n", id)
}

func main() {
    var wg sync.WaitGroup
    numWorkers := 5
    resultChan := make(chan int, numWorkers)

    for i := 0; i < numWorkers; i++ {
        wg.Add(1)
        go worker(i, &wg, resultChan)
    }

    go func() {
        wg.Wait()
        close(resultChan)
    }()

    // Collect all results
    for result := range resultChan {
        fmt.Printf("Received result: %d\n", result)
    }
}

In this example, we start 5 goroutines to perform tasks and collect the results through the resultChan channel. The main goroutine waits for all work to be completed in a separate goroutine and then closes the result channel. Afterwards, the main goroutine traverses the resultChan channel, collecting and printing the results of all the goroutines.

2.7 One-time Execution (`sync.Once`)

sync.Once is a synchronization primitive that ensures an operation is only executed once during the program's execution. A typical use of sync.Once is in the initialization of a singleton object or in scenarios requiring delayed initialization. Regardless of how many goroutines call this operation, it will only run once, hence the name of the Do function.

sync.Once perfectly balances concurrency issues and execution efficiency, eliminating concerns about performance problems caused by repeated initialization.

As a simple example to demonstrate the usage of sync.Once:

package main

import (
    "fmt"
    "sync"
)

var once sync.Once
var instance *Singleton

type Singleton struct{}

func Instance() *Singleton {
    once.Do(func() {
        fmt.Println("Creating single instance now.")
        instance = &Singleton{}
    })
    return instance
}

func main() {
    for i := 0; i < 10; i++ {
        go Instance()
    }
    fmt.Scanln() // Wait to see the output
}

In this example, even if the Instance function is concurrently called multiple times, the creation of the Singleton instance will only occur once. Subsequent calls will directly return the singleton instance created the first time, ensuring the uniqueness of the instance.

2.8 ErrGroup

ErrGroup is a library in Go language used to synchronize multiple goroutines and collect their errors. It is part of the "golang.org/x/sync/errgroup" package, providing a concise way to handle error scenarios in concurrent operations.

2.8.1 Concept of ErrGroup

The core idea of ErrGroup is to bind a group of related tasks (usually executed concurrently) together, and if one of the tasks fails, the execution of the entire group will be canceled. At the same time, if any of these concurrent operations returns an error, ErrGroup will capture and return this error.

To use ErrGroup, first import the package:

import "golang.org/x/sync/errgroup"

Then, create an instance of ErrGroup:

var g errgroup.Group

After that, you can pass the tasks to ErrGroup in the form of closures and start a new Goroutine by calling the Go method:

g.Go(func() error {
    // Perform a certain task
    // If everything goes well
    return nil
    // If an error occurs
    // return fmt.Errorf("error occurred")
})

Finally, call the Wait method, which will block and wait for all tasks to complete. If any of these tasks return an error, Wait will return that error:

if err := g.Wait(); err != nil {
    // Handle the error
    log.Fatalf("Task execution error: %v", err)
}

2.8.2 Practical Case of ErrGroup

Consider a scenario where we need to concurrently fetch data from three different data sources, and if any of the data sources fails, we want to immediately cancel the other data fetching operations. This task can be easily accomplished using ErrGroup:

package main

import (
    "fmt"
    "golang.org/x/sync/errgroup"
)

func fetchDataFromSource1() error {
    // Simulate fetching data from source 1
    return nil // or return an error to simulate a failure
}

func fetchDataFromSource2() error {
    // Simulate fetching data from source 2
    return nil // or return an error to simulate a failure
}

func fetchDataFromSource3() error {
    // Simulate fetching data from source 3
    return nil // or return an error to simulate a failure
}

func main() {
    var g errgroup.Group

    g.Go(fetchDataFromSource1)
    g.Go(fetchDataFromSource2)
    g.Go(fetchDataFromSource3)

    // Wait for all goroutines to complete and collect their errors
    if err := g.Wait(); err != nil {
        fmt.Printf("Error occurred while fetching data: %v\n", err)
        return
    }

    fmt.Println("All data fetched successfully!")
}

In this example, the fetchDataFromSource1, fetchDataFromSource2, and fetchDataFromSource3 functions simulate fetching data from different data sources. They are passed to the g.Go method and executed in separate Goroutines. If any of the functions return an error, g.Wait will immediately return that error, allowing appropriate error handling when the error occurs. If all functions execute successfully, g.Wait will return nil, indicating that all tasks have been completed successfully.

Another important feature of ErrGroup is that if any of the Goroutines panics, it will attempt to recover this panic and return it as an error. This helps prevent other concurrently running Goroutines from failing to shut down gracefully. Of course, if you want the tasks to respond to external cancel signals, you can combine the errgroup's WithContext function with the context package to provide a cancellable context.

In this way, ErrGroup becomes a very practical synchronization and error handling mechanism in Go's concurrent programming practice.