1 Introduction to Goroutines
1.1 Basic Concepts of Concurrency and Parallelism
Concurrency and parallelism are two common concepts in multi-threaded programming. They are used to describe events or program execution that may occur simultaneously.
- Concurrency refers to multiple tasks being processed in the same time frame, but only one task is executing at any given time. Tasks rapidly switch between each other, giving the user an illusion of simultaneous execution. Concurrency is suitable for single-core processors.
- Parallelism refers to multiple tasks truly executing simultaneously at the same time, which requires support from multi-core processors.
Go language is designed with concurrency in mind as one of its primary objectives. It achieves efficient concurrent programming models through Goroutines and Channels. Go's runtime manages Goroutines, and can schedule these Goroutines on multiple system threads to achieve parallel processing.
1.2 Goroutines in Go Language
Goroutines are the core concept for achieving concurrent programming in Go language. They are lightweight threads managed by Go's runtime. From a user's perspective, they are similar to threads, but consume fewer resources and start more quickly.
Characteristics of Goroutines include:
- Lightweight: Goroutines occupy less stack memory compared to traditional threads, and their stack size can dynamically expand or shrink as needed.
- Low overhead: The overhead for creating and destroying Goroutines is much lower than that for traditional threads.
- Simple communication mechanism: Channels provide a simple and effective communication mechanism between Goroutines.
- Non-blocking design: Goroutines do not block other Goroutines from running in certain operations. For example, while one Goroutine is waiting for I/O operations, other Goroutines can continue to execute.
2 Creating and Managing Goroutines
2.1 How to Create a Goroutine
In Go, you can easily create a Goroutine using the go keyword. When you prefix a function call with go, the function executes asynchronously in a new Goroutine.
Let's look at a simple example:
package main

import (
	"fmt"
	"time"
)

// Define a function to print Hello
func sayHello() {
	fmt.Println("Hello")
}

func main() {
	// Start a new Goroutine using the go keyword
	go sayHello()
	// The main Goroutine waits for a period to allow sayHello to execute
	time.Sleep(1 * time.Second)
	fmt.Println("Main function")
}
In the above code, the sayHello() function is executed asynchronously in a new Goroutine. This means the main() function will not wait for sayHello() to finish before continuing. Therefore, we use time.Sleep to pause the main Goroutine so that the print statement in sayHello has a chance to run. This is just for demonstration purposes. In actual development, we typically use channels or other synchronization methods to coordinate the execution of different Goroutines.
Note: In practical applications, time.Sleep() should not be used to wait for a Goroutine to finish, as it is not a reliable synchronization mechanism.
2.2 Goroutine Scheduling Mechanism
In Go, the scheduling of Goroutines is handled by the Go runtime's scheduler, which is responsible for allocating execution time on available logical processors. The Go scheduler uses M:N scheduling (multiple Goroutines multiplexed onto multiple OS threads) to achieve better performance on multi-core processors.
GOMAXPROCS and Logical Processors
GOMAXPROCS defines the maximum number of logical processors available to the runtime scheduler, with the default value being the number of CPU cores on the machine. It can be set either through the GOMAXPROCS environment variable or by calling the runtime.GOMAXPROCS function. The Go runtime assigns one OS thread to each logical processor. By setting GOMAXPROCS, we can restrict the number of cores used by the runtime.
import "runtime"

func init() {
	runtime.GOMAXPROCS(2)
}
The above code sets a maximum of two cores to schedule Goroutines, even when running the program on a machine with more cores.
Scheduler Operation
The scheduler operates using three important entities: M (machine), P (processor), and G (Goroutine).
- M: Represents a machine or thread, serving as an abstraction of OS kernel threads.
- P: Represents the resources required to execute a Goroutine. Each P has a local Goroutine queue.
- G: Represents a Goroutine, including its execution stack, instruction set, and other information.
The working principles of the Go scheduler are:
- An M must hold a P to execute a G. If no P is available, the M is returned to the thread cache.
- When a G is not blocked (e.g., in a system call), it runs on the same M as much as possible, helping to keep the G's local data "hot" for more efficient CPU cache utilization.
- When a G blocks, its M and P separate, and the P looks for an existing M or wakes a new M to serve other Gs.
go func() {
	fmt.Println("Hello from Goroutine")
}()
The above code demonstrates starting a new Goroutine, which will prompt the scheduler to add this new G to the queue for execution.
Preemptive Scheduling of Goroutines
In its early stages, Go used cooperative scheduling, meaning a Goroutine could starve other Goroutines if it executed for a long time without voluntarily relinquishing control. The Go scheduler now implements preemptive scheduling, allowing long-running Gs to be paused so that other Gs get a chance to execute.
2.3 Goroutine Lifecycle Management
To ensure the robustness and performance of your Go application, understanding and properly managing the lifecycle of Goroutines is crucial. Starting Goroutines is simple, but without proper management, they can lead to issues such as memory leaks and race conditions.
Safely Starting Goroutines
Before starting a Goroutine, make sure to understand its workload and runtime characteristics. A Goroutine should have a clear start and end to avoid creating "Goroutine orphans" without termination conditions.
func worker(done chan bool) {
	fmt.Println("Working...")
	time.Sleep(time.Second) // simulate an expensive task
	fmt.Println("Done Working.")
	done <- true
}

func main() {
	// A channel is used here; you can think of it as a simple message
	// queue, read and written with the "<-" operator.
	done := make(chan bool, 1)
	go worker(done)
	// Wait for the Goroutine to finish
	<-done
}
The above code shows one way to wait for a Goroutine to finish using the done channel.
Note: This example uses Go's channel mechanism, which will be detailed in later chapters.
Stopping Goroutines
In general, the entire program's end will implicitly terminate all Goroutines. However, in long-running services, we may need to actively stop Goroutines.
- Use channels to send stop signals: Goroutines can poll channels to check for stop signals.
stop := make(chan struct{})

go func() {
	for {
		select {
		case <-stop:
			fmt.Println("Got the stop signal. Shutting down...")
			return
		default:
			// execute normal operation
		}
	}
}()

// Send the stop signal
stop <- struct{}{}
- Use the context package to manage lifecycle:
ctx, cancel := context.WithCancel(context.Background())

go func(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			fmt.Println("Got the stop signal. Shutting down...")
			return
		default:
			// execute normal operation
		}
	}
}(ctx)

// When you want to stop the Goroutine
cancel()
Using the context package allows for more flexible control of Goroutines, providing timeout and cancellation capabilities. In large applications or microservices, context is the recommended way to control Goroutine lifecycles.