Memory & GC

Go's garbage collector is concurrent and optimized for low latency. It typically pauses for under a millisecond, even with gigabytes of heap. Understanding how memory works in Go -- stack vs heap, escape analysis, the GC algorithm -- lets you write code that is both fast and memory-efficient without fighting the runtime.

Go's Garbage Collector

Go uses a concurrent, tri-color mark-and-sweep collector:

  1. Mark phase: The GC traces all reachable objects starting from roots (goroutine stacks, globals). Objects are colored white (unvisited), grey (visited, children not yet scanned), or black (visited, all children scanned).
  2. Sweep phase: White objects (unreachable) are freed. Black objects survive.
  3. Concurrent: Most GC work happens concurrently with your application. Only brief stop-the-world pauses occur at the start and end of marking.
Phase           Concurrent?   Duration
-----------------------------------------
Mark setup      STW           < 1ms
Marking         Concurrent    proportional to live heap
Mark cleanup    STW           < 1ms
Sweeping        Concurrent    proportional to freed heap

The key property: GC pause times do not scale with heap size. The stop-the-world phases only coordinate the start and end of marking, while the heap itself is traced concurrently with your program. A 10GB heap does not mean 10GB of scanning in a pause.
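To make the coloring concrete, the mark and sweep phases described above can be sketched as a toy simulation. This is not the runtime's implementation: it is single-threaded, has no write barrier, and the Object type and the demoHeap helper are hypothetical names invented for illustration.

```go
package main

import "fmt"

// Color is the tri-color marking state of an object.
type Color int

const (
	White Color = iota // not yet visited; treated as garbage if still white after marking
	Grey               // visited, but its references are not yet scanned
	Black              // visited, and all of its references are scanned
)

// Object is a hypothetical heap object that references other objects.
type Object struct {
	Name  string
	Refs  []*Object
	Color Color
}

// mark colors every object reachable from roots black, using a grey worklist.
func mark(roots []*Object) {
	var grey []*Object
	for _, r := range roots {
		if r.Color == White {
			r.Color = Grey
			grey = append(grey, r)
		}
	}
	for len(grey) > 0 {
		obj := grey[len(grey)-1]
		grey = grey[:len(grey)-1]
		for _, ref := range obj.Refs {
			if ref.Color == White {
				ref.Color = Grey
				grey = append(grey, ref)
			}
		}
		obj.Color = Black // all references scanned
	}
}

// sweep returns the names of objects that are still white (unreachable).
func sweep(heap []*Object) []string {
	var freed []string
	for _, obj := range heap {
		if obj.Color == White {
			freed = append(freed, obj.Name)
		}
	}
	return freed
}

// demoHeap builds root -> a -> b, plus an unreachable cycle c <-> d.
func demoHeap() (heap, roots []*Object) {
	root, a, b := &Object{Name: "root"}, &Object{Name: "a"}, &Object{Name: "b"}
	c, d := &Object{Name: "c"}, &Object{Name: "d"}
	root.Refs = []*Object{a}
	a.Refs = []*Object{b}
	c.Refs = []*Object{d}
	d.Refs = []*Object{c}
	return []*Object{root, a, b, c, d}, []*Object{root}
}

func main() {
	heap, roots := demoHeap()
	mark(roots)
	fmt.Println(sweep(heap)) // prints [c d]
}
```

Note that the cycle c <-> d is collected even though the two objects reference each other: reachability from the roots is what counts, not reference counts.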

GOGC: Tuning the GC

GOGC controls how often the GC runs. The default is GOGC=100, meaning the GC triggers when the heap grows to 2x the size of the live heap after the last collection.

# GC runs when heap doubles (default)
GOGC=100 ./myapp

# GC runs when heap grows 50% -- more frequent, less memory
GOGC=50 ./myapp

# GC runs when heap triples -- less frequent, more memory
GOGC=200 ./myapp

# Disable GC entirely (for short-lived programs)
GOGC=off ./myapp
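The arithmetic behind these settings can be sketched as follows. The heapGoal helper is a hypothetical name, and the formula is a simplification of what the runtime does (recent Go versions also factor GC roots such as stacks and globals into the goal). debug.SetGCPercent is the runtime equivalent of the GOGC environment variable.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// heapGoal approximates the heap size at which the next collection triggers:
// the live heap plus gogc percent of it. gogc must be non-negative.
func heapGoal(liveHeapBytes uint64, gogc int) uint64 {
	return liveHeapBytes + liveHeapBytes*uint64(gogc)/100
}

func main() {
	const live = 100 << 20 // assumption: 100 MiB of live heap after the last GC
	for _, gogc := range []int{50, 100, 200} {
		fmt.Printf("GOGC=%d -> goal of about %d MiB\n", gogc, heapGoal(live, gogc)>>20)
	}

	// Change the setting at runtime; SetGCPercent returns the previous value.
	prev := debug.SetGCPercent(50)
	fmt.Println("previous GOGC:", prev)
	debug.SetGCPercent(prev) // restore
}
```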

GOMEMLIMIT (Go 1.19+)

GOMEMLIMIT sets a soft memory limit. The GC runs more aggressively as memory approaches the limit:

GOMEMLIMIT=1GiB ./myapp

For containerized services this is usually a better first knob than tuning GOGC directly. A common starting point is to set GOMEMLIMIT to ~80% of your container memory limit:

# Container has 2GB
GOMEMLIMIT=1600MiB ./myapp

The GC will work harder to stay under the limit without needing you to guess the right GOGC value.
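The same limit can be set from code with debug.SetMemoryLimit (Go 1.19+). The sketch below derives the limit from a container size and a headroom percentage. The memLimitFor helper and the hard-coded 2 GiB container size are assumptions for illustration; a real service would read the container limit from its environment.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// memLimitFor returns the container size minus the given percentage of headroom.
func memLimitFor(containerBytes, headroomPercent int64) int64 {
	return containerBytes * (100 - headroomPercent) / 100
}

func main() {
	const containerBytes = 2 << 30 // assumption: the container is limited to 2 GiB
	limit := memLimitFor(containerBytes, 20)

	// SetMemoryLimit applies the soft limit and returns the previous one.
	prev := debug.SetMemoryLimit(limit)
	fmt.Printf("soft limit set to %d bytes (previous: %d)\n", limit, prev)
}
```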

Stack vs Heap Allocation

Go has two places to allocate memory:

Stack

  • Fast: just a pointer bump
  • Automatically freed when the function returns
  • No GC involvement
  • Limited to data that does not outlive the function

Heap

  • Slower: requires GC tracking
  • Lives until the GC determines it is unreachable
  • Required for data that escapes the function scope
func stackAllocation() int {
    x := 42        // allocated on the stack
    return x        // value is copied to caller
}

func heapAllocation() *int {
    x := 42        // allocated on the heap
    return &x       // pointer escapes the function
}

In the second function, x must live beyond the function return, so the compiler allocates it on the heap.
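One way to check this difference empirically is testing.AllocsPerRun, which reports the average number of heap allocations per call. This is a sketch: the sink variables and the //go:noinline directives are additions that stop the compiler from optimizing the calls away, and the allocsStack and allocsHeap helpers are hypothetical names.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks keep the results alive so the calls are not eliminated.
var (
	sinkInt int
	sinkPtr *int
)

//go:noinline
func stackAllocation() int {
	x := 42
	return x // value is copied to the caller
}

//go:noinline
func heapAllocation() *int {
	x := 42
	return &x // pointer escapes, so x is heap-allocated
}

// allocsStack measures heap allocations per call of stackAllocation.
func allocsStack() float64 {
	return testing.AllocsPerRun(1000, func() { sinkInt = stackAllocation() })
}

// allocsHeap measures heap allocations per call of heapAllocation.
func allocsHeap() float64 {
	return testing.AllocsPerRun(1000, func() { sinkPtr = heapAllocation() })
}

func main() {
	fmt.Println("stackAllocation allocs/call:", allocsStack())
	fmt.Println("heapAllocation allocs/call:", allocsHeap())
}
```

Without the noinline directives the compiler may inline heapAllocation, and the result can differ; that is exactly why measuring beats guessing.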

Escape Analysis

The compiler decides where to allocate through escape analysis. View its decisions with -gcflags '-m':

go build -gcflags '-m' ./...
./main.go:10:2: x escapes to heap
./main.go:15:2: y does not escape
./main.go:20:9: make([]byte, n) escapes to heap

Common reasons values escape:

// Escapes: returned pointer
func newUser() *User {
    u := User{Name: "Alice"} // escapes to heap
    return &u
}

// May escape: converted to an interface
func process(v any) { /* ... */ }
func main() {
    x := 42
    // x is boxed into an interface value. The boxed copy goes on the heap
    // only if the compiler cannot prove that v stays local to process.
    process(x)
}

// Escapes: closure captures variable
func counter() func() int {
    n := 0 // escapes: captured by closure
    return func() int {
        n++
        return n
    }
}

// Escapes: too large for stack
func bigSlice() {
    data := make([]byte, 10_000_000) // escapes: too big for stack
    _ = data
}

Reducing Allocations

sync.Pool: Reuse Temporary Objects

sync.Pool maintains a pool of reusable objects, reducing GC pressure:

var bufferPool = sync.Pool{
    New: func() any {
        return new(bytes.Buffer)
    },
}

func processRequest(data []byte) string {
    buf := bufferPool.Get().(*bytes.Buffer)
    defer func() {
        buf.Reset()
        bufferPool.Put(buf)
    }()

    buf.Write(data)
    buf.WriteString(" processed")
    return buf.String()
}

Use sync.Pool for objects that are allocated and freed frequently (buffers, encoders, temporary slices). Do not use it for objects with a long lifetime.

Pre-Allocated Slices

// Bad: grows and reallocates multiple times
func collect(n int) []string {
    var result []string
    for i := 0; i < n; i++ {
        result = append(result, fmt.Sprintf("item-%d", i))
    }
    return result
}

// Good: one allocation
func collect(n int) []string {
    result := make([]string, 0, n)
    for i := 0; i < n; i++ {
        result = append(result, fmt.Sprintf("item-%d", i))
    }
    return result
}

When you know the size (or a reasonable upper bound), pre-allocate with make([]T, 0, capacity).
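To see how often append replaces the backing array, one can count capacity changes. The countGrowths helper below is a hypothetical name for illustration, and the exact number of growths without pre-allocation depends on the Go version's growth policy.

```go
package main

import "fmt"

// countGrowths appends n ints and counts how often the capacity changes,
// which signals that append allocated a new backing array.
func countGrowths(n int, preallocate bool) int {
	var s []int
	if preallocate {
		s = make([]int, 0, n)
	}
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			growths++
		}
	}
	return growths
}

func main() {
	fmt.Println("growths without pre-allocation:", countGrowths(1000, false))
	fmt.Println("growths with pre-allocation:   ", countGrowths(1000, true))
}
```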

Avoid String Concatenation in Loops

// Bad: a new allocation per iteration and O(n^2) bytes copied in total
func join(items []string) string {
    result := ""
    for _, item := range items {
        result += item + ", " // allocates a new string each iteration
    }
    return result
}

// Good: O(n) bytes copied, with only a few allocations as the buffer grows
func join(items []string) string {
    var buf strings.Builder
    for i, item := range items {
        if i > 0 {
            buf.WriteString(", ")
        }
        buf.WriteString(item)
    }
    return buf.String()
}

strings.Builder minimizes allocations by growing an internal buffer.
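When the final length can be computed up front, Builder.Grow reserves the whole buffer before writing, so the builder should not need to grow again. The joinPresized function below is a hypothetical variant of join written to show this; in practice strings.Join already does the same job.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPresized joins items with ", " after reserving the exact output size.
func joinPresized(items []string) string {
	if len(items) == 0 {
		return ""
	}
	total := 2 * (len(items) - 1) // bytes needed for the ", " separators
	for _, item := range items {
		total += len(item)
	}

	var buf strings.Builder
	buf.Grow(total) // reserve the full capacity up front
	for i, item := range items {
		if i > 0 {
			buf.WriteString(", ")
		}
		buf.WriteString(item)
	}
	return buf.String()
}

func main() {
	fmt.Println(joinPresized([]string{"a", "b", "c"})) // prints a, b, c
}
```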

Accept Interfaces, Return Structs

// Usually allocates: the value is returned behind an interface, so it typically escapes
func NewReader() io.Reader {
    return &myReader{} // escapes to heap
}

// May avoid the allocation: a concrete return type is easier for the compiler to inline and analyze
func NewReader() *myReader {
    return &myReader{} // can stay on the caller's stack if NewReader is inlined and the result does not escape
}

Avoid Pointers to Small Values

// Counterintuitive: pointer fields usually cause heap allocations
type Config struct {
    Port    *int    // the pointed-to int typically ends up on the heap
    Verbose *bool   // the pointed-to bool typically ends up on the heap
}

// Better for small values: use the value directly
type Config struct {
    Port    int
    Verbose bool
}

Pointers to small types (int, bool, small structs) often cost more than copying the value, because the pointed-to value usually has to be heap-allocated and then tracked by the GC.

When GC Pauses Matter & When They Don't

They Matter

  • Real-time trading systems (sub-millisecond latency requirements)
  • Game servers (frame timing sensitive)
  • Low-latency network proxies

They Usually Do Not Matter

  • Web APIs (network latency dwarfs GC pauses)
  • Batch processing (throughput matters, not latency)
  • CLI tools (run once and exit)

For most Go applications, the GC is not the bottleneck. Profile before tuning.

Monitoring GC in Production

Enable GC logging:

GODEBUG=gctrace=1 ./myapp
gc 1 @0.012s 2%: 0.015+1.2+0.006 ms clock, 0.12+0.8/1.0/0+0.048 ms cpu, 4->4->2 MB, 4 MB goal
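  • 0.015+1.2+0.006 ms clock: wall-clock time for STW sweep termination + concurrent mark and scan + STW mark termination
  • 4->4->2 MB: heap size at GC start, heap size at GC end, live heap
  • 4 MB goal: the heap size goal the collector was aiming for in this cycle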

In code, use runtime.ReadMemStats. It briefly stops the world, so call it on a timer rather than per request:

var m runtime.MemStats
runtime.ReadMemStats(&m)
slog.Info("memory",
    "heap_alloc", m.HeapAlloc,
    "heap_sys", m.HeapSys,
    "num_gc", m.NumGC,
    "gc_pause_total", m.PauseTotalNs,
)

Common Pitfalls

  • Tuning GOGC before profiling. The default is fine for most applications. Profile first, tune only if GC is actually a bottleneck.
  • Using sync.Pool for long-lived objects. Pooled objects can be dropped by the GC at any time (since Go 1.13 an unused object survives at most about two GC cycles). It is for temporary, frequently-allocated objects only.
  • Assuming pointers are always faster. Pointers to small values can cause heap allocations. Passing small structs by value is often faster.
  • Pre-optimizing allocations. Write clear code first. Profile. Optimize only the hot paths.
  • Setting GOMEMLIMIT without headroom. If your container has 2GB, do not set GOMEMLIMIT=2GiB. Leave 20% headroom for the OS and non-Go memory.
  • Ignoring escape analysis output. Run go build -gcflags '-m' on hot paths. One unexpected escape can cause significant allocation overhead.

Key Takeaways

  • Go's GC is concurrent with sub-millisecond pauses. It is not the bottleneck for most applications.
  • GOMEMLIMIT (Go 1.19+) is the preferred tuning knob. Set it to ~80% of container memory.
  • Stack allocation is nearly free. Heap allocation requires GC work. Escape analysis decides which.
  • Run go build -gcflags '-m' to see what escapes to the heap.
  • Reduce allocations with sync.Pool, pre-allocated slices, strings.Builder, and by returning concrete types.
  • Profile before tuning. Most Go applications do not need GC tuning.
  • Use GODEBUG=gctrace=1 and runtime.ReadMemStats to monitor GC behavior in production.