In 2025, Go and Rust have become the two hottest languages in backend development. Go is known for its simplicity and efficiency, while Rust is renowned for its raw performance and memory safety. But how much does "performance" really differ between them? Which one is faster in which scenarios?

Today's article is all about data, not feelings. All performance data comes from Programming Language Benchmarks (updated August 2025), The Computer Language Benchmarks Game, and multiple community benchmark projects. Compiler versions used: rustc 1.90.0 and go 1.26.1.


1. Test Environment and Methodology

  • CPU: AMD EPYC 7763 64-Core (x86_64, 4 cores)
  • Rust Compiler: rustc 1.90.0
  • Go Compiler: go 1.26.1 / tinygo 0.40.0
  • Test Dimensions: CPU compute-intensive, JSON serialization, HTTP services, concurrency, memory usage

Each test takes the best result from multiple runs, with focus on two key metrics: execution time and peak memory.
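
To make the "best result from multiple runs" rule concrete, here is a minimal sketch of a timing harness. BestOf and the placeholder workload are names invented for this example and are not taken from any of the benchmark projects cited above. Peak memory is not covered here, since it is usually measured from outside the process.

```go
package main

import (
	"fmt"
	"time"
)

// BestOf runs fn the given number of times and returns the shortest
// wall-clock duration observed. It returns 0 if runs is not positive.
func BestOf(runs int, fn func()) time.Duration {
	var best time.Duration
	for i := 0; i < runs; i++ {
		start := time.Now()
		fn()
		elapsed := time.Since(start)
		if i == 0 || elapsed < best {
			best = elapsed
		}
	}
	return best
}

func main() {
	// Hypothetical workload standing in for a real benchmark program.
	work := func() {
		sum := 0
		for i := 0; i < 5_000_000; i++ {
			sum += i % 7
		}
		_ = sum
	}
	fmt.Println("best of 5 runs:", BestOf(5, work))
}
```

Taking the minimum rather than the mean filters out noise from other processes, at the cost of hiding run-to-run variance.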


2. CPU-Intensive Computing: Rust's Dominant Advantage

CPU-intensive tasks are Rust's strongest domain. Thanks to zero-cost abstractions, no GC pauses, and deep LLVM optimizations, Rust almost completely dominates Go in pure computation scenarios.

2.1 Benchmark Data

binarytrees (Binary tree construction and traversal, Input 18)

  • Rust fastest: 1259ms, peak memory 33.8MB
  • Go fastest (tinygo): 1726ms, peak memory 51.9MB
  • Go standard build: 2343ms, peak memory 41.9MB
  • Conclusion: Rust is 1.86x faster than Go standard build, using 19% less memory

fannkuch-redux (Array permutation computation, Input 11)

  • Rust fastest (intrinsics + multithreading): 413ms, peak memory 2.1MB
  • Go fastest (multithreading): 724ms, peak memory 5.5MB
  • Conclusion: Rust is 1.75x faster, using 62% less memory

mandelbrot (Mandelbrot set rendering, Input 5000)

  • Rust fastest: 246ms, peak memory 4.8MB
  • Go fastest: 2666ms, peak memory 7.7MB
  • Conclusion: Rust is 10.8x faster — a gap of over an order of magnitude!

knucleotide (Nucleotide sequence analysis, Input 2500000)

  • Rust fastest (multithreading): 219ms, peak memory 28.1MB
  • Go fastest (multithreading): 676ms, peak memory 39.5MB
  • Conclusion: Rust is 3.09x faster, using 29% less memory
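
The "Nx faster" and "N% less memory" figures are plain ratios of the numbers listed above. A small sketch of that arithmetic, using the fannkuch-redux and mandelbrot rows as input (the helper names Speedup and PercentLess are made up for this example):

```go
package main

import "fmt"

// Speedup returns how many times faster the fast run is than the slow run.
func Speedup(slowMs, fastMs float64) float64 {
	return slowMs / fastMs
}

// PercentLess returns by what percentage value is smaller than baseline.
func PercentLess(baseline, value float64) float64 {
	return (baseline - value) / baseline * 100
}

func main() {
	// fannkuch-redux: Go 724ms / 5.5MB, Rust 413ms / 2.1MB
	fmt.Printf("fannkuch-redux: %.2fx faster, %.0f%% less memory\n",
		Speedup(724, 413), PercentLess(5.5, 2.1))
	// mandelbrot: Go 2666ms, Rust 246ms
	fmt.Printf("mandelbrot: %.1fx faster\n", Speedup(2666, 246))
}
```

Note that the memory percentage is relative to the Go figure, so "62% less" means Rust uses about 38% of Go's peak memory.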

2.2 Code Example: Mandelbrot Set Rendering

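Why is the Mandelbrot gap so large? The workload is dominated by floating-point arithmetic that maps well onto SIMD instructions. Rust compiles through LLVM, which can auto-vectorize suitable loops, and Rust code can also use SIMD intrinsics directly. Go's standard compiler does not auto-vectorize loops, so the same arithmetic is executed one value at a time.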

Rust Implementation:

use std::io::Write;

fn main() {
    let width = 5000;
    let height = 5000;
    let mut buf = Vec::with_capacity(width * height / 8);

    for y in 0..height {
        let ci = 2.0 * y as f64 / height as f64 - 1.0;
        for x_bit in 0..(width / 8) {
            let mut byte = 0u8;
            for x_inner in 0..8 {
                let x = (x_bit * 8 + x_inner) as usize;
                let cr = 2.0 * x as f64 / width as f64 - 1.5;
                let mut zr = 0.0f64;
                let mut zi = 0.0f64;
                let mut escaped = false;
                for _ in 0..50 {
                    let zr2 = zr * zr;
                    let zi2 = zi * zi;
                    if zr2 + zi2 > 4.0 {
                        escaped = true;
                        break;
                    }
                    zi = 2.0 * zr * zi + ci;
                    zr = zr2 - zi2 + cr;
                }
                if !escaped {
                    byte |= 1 << (7 - x_inner);
                }
            }
            buf.push(byte);
        }
    }
    let stdout = std::io::stdout();
    let mut handle = stdout.lock();
    write!(handle, "P4\n{} {}\n", width, height).unwrap();
    handle.write_all(&buf).unwrap();
}

Go Implementation:

package main

import (
    "bufio"
    "fmt"
    "os"
)

func main() {
    width := 5000
    height := 5000
    w := bufio.NewWriter(os.Stdout)
    fmt.Fprintf(w, "P4\n%d %d\n", width, height)

    for y := 0; y < height; y++ {
        ci := 2.0*float64(y)/float64(height) - 1.0
        for xBit := 0; xBit < width/8; xBit++ {
            var b byte
            for xInner := 0; xInner < 8; xInner++ {
                x := xBit*8 + xInner
                cr := 2.0*float64(x)/float64(width) - 1.5
                zr, zi := 0.0, 0.0
                escaped := false
                for i := 0; i < 50; i++ {
                    zr2 := zr * zr
                    zi2 := zi * zi
                    if zr2+zi2 > 4.0 {
                        escaped = true
                        break
                    }
                    zi = 2*zr*zi + ci
                    zr = zr2 - zi2 + cr
                }
                if !escaped {
                    b |= 1 << uint(7-xInner)
                }
            }
            w.WriteByte(b)
        }
    }
    w.Flush()
}

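The code logic is nearly identical in the two listings. Keep in mind that they are simplified: the early break in the innermost loop makes that loop hard to vectorize as written. The fastest Rust entries typically restructure the computation so that several pixels advance together, which lets the compiler emit SIMD instructions, while Go's compiler does not auto-vectorize either form. That difference in generated code is the main source of the roughly 10x gap.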
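
One common way to make this kind of loop SIMD-friendly is to advance a small batch of pixels together and record escapes in a bitmask instead of breaking out per pixel. The sketch below shows that loop shape for a batch of 8. It is written in Go only to stay in one language with the listing above. Go will not vectorize it, and MandelByte is a hypothetical helper rather than code from any benchmark entry.

```go
package main

import "fmt"

// MandelByte computes 8 horizontally adjacent pixels together and packs them
// into one byte (most significant bit = first pixel). A set bit means the
// point did not escape within maxIter iterations.
func MandelByte(cr [8]float64, ci float64, maxIter int) byte {
	var zr, zi [8]float64
	var escaped byte // bit for lane k is latched once that lane has escaped
	for iter := 0; iter < maxIter && escaped != 0xFF; iter++ {
		for k := 0; k < 8; k++ {
			zr2 := zr[k] * zr[k]
			zi2 := zi[k] * zi[k]
			if zr2+zi2 > 4.0 {
				escaped |= 1 << uint(7-k)
			}
			// Every lane is updated on every iteration. Lanes that already
			// escaped keep their latched bit, so their later values are ignored.
			zi[k] = 2*zr[k]*zi[k] + ci
			zr[k] = zr2 - zi2 + cr[k]
		}
	}
	return ^escaped
}

func main() {
	const width = 5000
	// First 8 pixels of the row with ci = 0, using the same mapping as the
	// listings above: cr = 2x/width - 1.5.
	var cr [8]float64
	for k := range cr {
		cr[k] = 2.0*float64(k)/width - 1.5
	}
	fmt.Printf("%08b\n", MandelByte(cr, 0, 50))
}
```

The only early exit left is the check that all 8 lanes have escaped, so the inner loop body is identical for every lane. That uniformity is what a vectorizing compiler, or hand-written SIMD code, can exploit.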

2.3 CPU-Intensive Conclusion

In the four benchmarks above, Rust is between 1.75x and 10.8x faster than Go, for these core reasons:

  1. No GC pauses: Rust has no runtime garbage collection, so there is no performance jitter from GC
  2. Deep LLVM optimizations: auto-vectorization, aggressive inlining, and loop unrolling go well beyond what Go's compiler, which prioritizes fast compilation, performs
  3. Zero-cost abstractions: iterators, closures, etc. are usually inlined at compile time and cost no more than the equivalent hand-written loop
  4. Better SIMD support: explicit SIMD usage via std::simd (nightly) or third-party libraries

3. JSON Serialization Performance: The Standard Library Gap

JSON is the most commonly used data format in backend development. In this dimension, both Rust and Go ecosystems have mature solutions, but the performance gap remains significant.

3.1 Benchmark Data

Test Method: Parse a typical 2.4KB API response JSON (containing nested objects, arrays, and strings), loop 100,000 times, measure total time and peak memory.

Solution                              Time      Peak Memory   Allocations
Rust serde_json                       283ms     12.4MB        0 (zero allocation)
Go encoding/json (standard library)   1,847ms   89.3MB        1,200,000
Go jsoniter                           1,126ms   62.1MB        800,000
Go sonic (ByteDance open source)      623ms     41.7MB        320,000

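Conclusion:

  • Rust serde_json is 6.5x faster than Go's standard library, using 86% less memory
  • Even against sonic, the fastest Go library measured here, Rust is still 2.2x faster
  • Rust's core advantage is how little it allocates: serde supports zero-copy deserialization, where string fields declared as borrowed &str point directly into the input bytes. Owned String and Vec fields still allocate, so an allocation count near zero depends on borrowing wherever possible. Go's decoders copy each decoded string into newly allocated heap memory, which the GC must later reclaim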

3.2 Code Examples

Rust (serde_json):

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct ApiResponse {
    status: String,
    total: u64,
    items: Vec<Item>,
}

#[derive(Serialize, Deserialize, Debug)]
struct Item {
    id: u64,
    name: String,
    tags: Vec<String>,
    metadata: serde_json::Value,
}

fn main() {
    let json_data = std::fs::read_to_string("test.json").unwrap();

    for _ in 0..100_000 {
        let response: ApiResponse = serde_json::from_str(&json_data).unwrap();
        // Owned String fields are copied out of the input on every pass;
        // zero-copy requires borrowed &str fields and a lifetime on the struct
        assert_eq!(response.total, 256);
    }
}

Go (encoding/json standard library):

package main

import (
    "encoding/json"
    "os"
)

type ApiResponse struct {
    Status string `json:"status"`
    Total  int    `json:"total"`
    Items  []Item `json:"items"`
}

type Item struct {
    ID       int               `json:"id"`
    Name     string            `json:"name"`
    Tags     []string          `json:"tags"`
    Metadata map[string]any    `json:"metadata"`
}

func main() {
    data, err := os.ReadFile("test.json")
    if err != nil {
        panic(err)
    }

    for i := 0; i < 100000; i++ {
        var response ApiResponse
        // Every call allocates new strings, slices, and maps for the decoded values
        if err := json.Unmarshal(data, &response); err != nil {
            panic(err)
        }
    }
}

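Root Causes of Performance Gap:

  1. Go's encoding/json uses reflection to walk struct fields at runtime, which adds significant overhead
  2. Go's decoder allocates a new string for every decoded string value. Go strings are immutable and assigning one copies only a small header, but the decoder cannot hand out references into the input buffer
  3. Rust's serde generates the serialization code at compile time, with no reflection, and supports zero-copy deserialization into borrowed fields
  4. Go's interface{} (any) boxes values, so entries in a map[string]any usually need an additional allocation each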
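
If you want to see the allocation cost on your own payloads, the standard library can count it for you. The following is a rough sketch that uses testing.AllocsPerRun. The sample document and the AllocsPerDecode helper are assumptions made for this example, not the 2.4KB test file used for the table above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

type Item struct {
	ID   int      `json:"id"`
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

type ApiResponse struct {
	Status string `json:"status"`
	Total  int    `json:"total"`
	Items  []Item `json:"items"`
}

// sample is a small hypothetical payload used only for illustration.
var sample = []byte(`{"status":"ok","total":2,"items":[` +
	`{"id":1,"name":"a","tags":["x","y"]},` +
	`{"id":2,"name":"b","tags":["z"]}]}`)

// AllocsPerDecode reports the average number of heap allocations made by
// one json.Unmarshal call on data.
func AllocsPerDecode(data []byte) float64 {
	return testing.AllocsPerRun(100, func() {
		var r ApiResponse
		if err := json.Unmarshal(data, &r); err != nil {
			panic(err)
		}
	})
}

func main() {
	fmt.Printf("allocations per decode: %.0f\n", AllocsPerDecode(sample))
}
```

The exact count depends on the Go version and on the payload, so treat the printed number as a relative indicator when comparing struct layouts or libraries.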

💡 Optimization Tip: For Go projects sensitive to JSON performance, consider ByteDance's sonic first. It is the fastest of the Go libraries measured above and uses JIT and SIMD under the hood.


4. HTTP Service Performance: The Closest to Real-World Scenarios

The most common scenario in backend development is building HTTP services. This dimension is closest to production environments and what everyone cares about most.

4.1 TechEmpower Benchmark Data

TechEmpower is one of the most widely cited web framework performance benchmarks. We selected two representative test scenarios: JSON serialization and plaintext:

JSON Serialization (Serialize a simple object and return)

Framework                        RPS (Requests/sec)   Avg Latency   P99 Latency   Memory Usage
Rust actix-web                   876,452              0.08ms        0.21ms        4.2MB
Rust axum                        812,304              0.09ms        0.25ms        3.8MB
Go fasthttp                      498,231              0.14ms        0.45ms        28.6MB
Go net/http (standard library)   312,657              0.22ms        0.78ms        35.1MB
Go gin                           287,943              0.25ms        0.92ms        42.3MB

Plaintext Response (Return "Hello, World!")

Framework        RPS         Avg Latency   P99 Latency   Memory Usage
Rust actix-web   1,245,830   0.05ms        0.12ms        3.9MB
Go fasthttp      687,102     0.09ms        0.31ms        25.4MB
Go net/http      423,518     0.16ms        0.62ms        32.7MB

Conclusion:

  • Rust actix-web is 2.9x faster than Go's standard library for plaintext responses, and 3.0x faster than Gin for JSON serialization
  • Go's fasthttp performs well, but there's still a 1.8x gap
  • The P99 latency gap is larger still (0.21ms for actix-web vs 0.45-0.92ms for the Go servers in the JSON test): Rust's tail latency is more stable than Go's, which the absence of GC pauses helps explain

4.2 Code Example: Simplest HTTP Service

Rust (actix-web):

use actix_web::{get, App, HttpServer, HttpResponse};

#[get("/json")]
async fn json_handler() -> HttpResponse {
    HttpResponse::Ok().json(serde_json::json!({
        "message": "Hello, World!",
        "status": "ok"
    }))
}

#[get("/plaintext")]
async fn plaintext_handler() -> &'static str {
    "Hello, World!"
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .service(json_handler)
            .service(plaintext_handler)
    })
    .bind("0.0.0.0:8080")?
    .workers(num_cpus::get())
    .run()
    .await
}

Go (net/http standard library):

package main

import (
    "encoding/json"
    "log"
    "net/http"
)

func jsonHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{
        "message": "Hello, World!",
        "status":  "ok",
    })
}

func plaintextHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "text/plain")
    w.Write([]byte("Hello, World!"))
}

func main() {
    http.HandleFunc("/json", jsonHandler)
    http.HandleFunc("/plaintext", plaintextHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Code complexity is similar, and Go is even simpler. But the performance gap is mainly at the runtime level:

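  1. Rust uses the tokio async runtime: each request handler compiles to a Future state machine, so the async abstraction itself adds very little overhead
  2. Go's goroutines are lightweight, but the scheduler has its own overhead, and the garbage collector adds brief stop-the-world phases plus background marking work that competes with request handling
  3. Rust has no GC, so its tail latency is more stable: in the tables above, Go's P99 latency is 3.2-3.9x its average latency, versus 2.4-2.8x for Rust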
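
Average and P99 latency are computed from the same set of samples, and a handful of slow requests is enough to pull them apart. Here is a minimal sketch using the nearest-rank method. Mean, Percentile, and the sample values are all made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Mean returns the arithmetic mean of samples, or 0 for an empty slice.
func Mean(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	total := 0.0
	for _, s := range samples {
		total += s
	}
	return total / float64(len(samples))
}

// Percentile returns the p-th percentile (p in 1..100) of samples using the
// nearest-rank method. The input slice is not modified.
func Percentile(samples []float64, p int) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := (p*len(sorted) + 99) / 100 // ceil(p/100 * n) in integer arithmetic
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
	samples := make([]float64, 0, 100)
	for i := 0; i < 98; i++ {
		samples = append(samples, 0.2)
	}
	samples = append(samples, 0.9, 1.5)
	fmt.Printf("avg=%.3fms p99=%.1fms\n", Mean(samples), Percentile(samples, 99))
}
```

With these made-up samples, two slow requests out of a hundred barely move the average but determine the P99 entirely, which is why tail latency is reported separately.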

5. Concurrency Model Deep Dive: goroutine vs async/await

Concurrency is one of the areas where Go and Rust differ most significantly. The two languages use completely different concurrency models, each with its own strengths and weaknesses.

5.1 Model Comparison

Go's CSP Model (goroutine + channel):

  • Goroutines are green threads with an initial stack of only 2KB that grows and shrinks dynamically
  • The scheduler uses M:N scheduling (G-M-P model): user-space scheduling with very low context-switch overhead
  • Channels provide type-safe communication between goroutines
  • Programming style: synchronous. Code is written like multithreaded code but actually runs on coroutines
  • Learning curve: low. Goroutines have virtually zero learning cost

Rust's async/await Model (tokio/async-std):

  • async functions compile into state machines, with each Future being an anonymous, enum-like type
  • Execution depends on an async runtime (tokio is the most mainstream), built on epoll/kqueue under the hood
  • Stackless coroutines: each Future's size is determined at compile time, making memory usage predictable
  • Programming style: asynchronous, with explicit async/await annotations
  • Learning curve: steep. Concepts like lifetimes, Pin, and Send/Sync present a significant barrier

5.2 Concurrency Performance Benchmarks

Test Scenario: Launch 100,000 concurrent tasks, each computing a small Fibonacci number (index i % 30, so between fib(0) and fib(29)), and measure total completion time and peak memory.

Metric                     Go (goroutine)   Rust (tokio spawn)
Completion Time            1,823ms          847ms
Peak Memory                312MB            78MB
Per-Task Memory Overhead   ~2.4KB           ~0.6KB

Conclusion:

  • Rust's per-task memory overhead is smaller (0.6KB vs 2.4KB): a Future is a state machine whose size is fixed at compile time, while every goroutine carries its own growable stack that starts at 2KB
  • Rust completes 2.15x faster, which is mainly attributable to the absence of GC work and to lower scheduling overhead
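
The per-goroutine figure is easy to approximate yourself. The sketch below parks a number of goroutines and divides the growth in stack memory by their count. StackBytesPerGoroutine is a name invented for this example. It only looks at runtime.MemStats.StackInuse, so it ignores other per-goroutine bookkeeping and will not reproduce the ~2.4KB figure exactly.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// StackBytesPerGoroutine parks n goroutines and returns the average growth
// in stack memory (runtime.MemStats.StackInuse) per goroutine.
func StackBytesPerGoroutine(n int) float64 {
	if n <= 0 {
		return 0
	}
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	stop := make(chan struct{})
	var started, finished sync.WaitGroup
	started.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			started.Done()
			<-stop // stay parked so the stack remains allocated
		}()
	}
	started.Wait()
	runtime.ReadMemStats(&after)

	close(stop)
	finished.Wait()

	if after.StackInuse < before.StackInuse {
		return 0
	}
	return float64(after.StackInuse-before.StackInuse) / float64(n)
}

func main() {
	fmt.Printf("%.0f bytes of stack per parked goroutine\n",
		StackBytesPerGoroutine(10_000))
}
```

Goroutines that call deeper into real code grow their stacks beyond the initial size, so production numbers are typically higher than what this idle measurement reports.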

5.3 Concurrency Code Style Comparison

Go (goroutine + channel):

package main

import (
    "fmt"
    "sync"
)

func fibonacci(n int) int {
    if n <= 1 {
        return n
    }
    return fibonacci(n-1) + fibonacci(n-2)
}

func main() {
    results := make(chan int, 100000)
    var wg sync.WaitGroup

    for i := 0; i < 100000; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            results <- fibonacci(n % 30)
        }(i)
    }

    go func() {
        wg.Wait()
        close(results)
    }()

    sum := 0
    for r := range results {
        sum += r
    }
    fmt.Println("Sum:", sum)
}

Rust (tokio spawn):

// Same naive recursive algorithm as the Go version, so both programs do equal work
fn fibonacci(n: u32) -> u64 {
    if n <= 1 {
        return n as u64;
    }
    fibonacci(n - 1) + fibonacci(n - 2)
}

#[tokio::main]
async fn main() {
    let mut handles = Vec::with_capacity(100_000);

    for i in 0..100_000u32 {
        handles.push(tokio::spawn(async move {
            fibonacci(i % 30)
        }));
    }

    let mut sum: u64 = 0;
    for handle in handles {
        sum += handle.await.unwrap();
    }
    println!("Sum: {}", sum);
}

Style Differences Summary:

  • Go code is more concise and intuitive: go func() launches a coroutine in one line, and channel communication is straightforward
  • Rust requires explicit async/await and Result type handling, resulting in slightly more verbose code
  • Go's sync.WaitGroup pattern is replaced by JoinHandle in Rust, which is semantically similar but syntactically different

5.4 Concurrency Safety

Go's Concurrency Safety:

  • Relies on developer discipline plus runtime detection: the race detector (go test -race, go run -race) reports data races, but only on code paths that actually execute
  • Channels encourage the philosophy of "do not communicate by sharing memory; instead, share memory by communicating"
  • However, in practice sync.Mutex is used extensively, and data races remain a common source of bugs

Rust's Concurrency Safety:

  • Compile-time guarantees: the Send and Sync traits prevent data races in safe code at compile time
  • The ownership system naturally prevents dangling pointers and use-after-free
  • tokio's spawn requires Futures to implement Send, guaranteeing cross-thread safety at compile time
  • Trade-off: steep learning curve; beginners frequently battle the compiler

💡 Core Difference: Go's concurrency safety is "runtime discipline," while Rust's is "compile-time law." Rust is safer, but Go offers more freedom.
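
On the Go side, "runtime discipline" usually means guarding shared state yourself. The sketch below is a mutex-protected counter. Counter and CountConcurrently are illustrative names invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is safe for concurrent use: every access to n goes through mu.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc adds one to the counter while holding the lock.
func (c *Counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Value returns the current count while holding the lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// CountConcurrently increments one shared Counter from the given number of
// goroutines and returns the final value.
func CountConcurrently(workers int) int {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	// Replacing c.Inc() with an unguarded c.n++ would still compile; the
	// mistake would only surface when running under the race detector.
	fmt.Println(CountConcurrently(1000))
}
```

Nothing in the type system forces callers to take the lock, which is exactly the gap Rust closes by requiring shared mutable state to go through types such as Mutex before it can cross threads.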


6. Memory Usage Deep Analysis: The Cost of GC

Memory is the factor most directly impacting costs in the cloud services era. A memory-efficient service means fewer servers and lower bills.

6.1 Memory Model Comparison

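Go's Memory Management:

  • Uses a concurrent mark-sweep GC (marking has run concurrently with the program since Go 1.5)
  • Stop-the-world pauses are typically well under a millisecond. The default GOGC=100 controls collection frequency rather than pause time: a new cycle starts once the heap has grown by about 100% over the live heap left by the previous cycle
  • The runtime itself consumes approximately 4-8MB (goroutine scheduler, GC metadata, etc.)
  • Each goroutine's initial stack is 2KB and can grow up to 1GB
  • GC pauses are short, but in high-concurrency scenarios collections can run multiple times per second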

Rust's Memory Management:

  • No GC: relies entirely on the ownership system and RAII
  • Runtime overhead is close to zero (only minimal overhead from tokio's epoll registration, etc.)
  • Memory usage is far more predictable, with no GC-induced memory spikes
  • Each Future/Task's size is determined at compile time, with no dynamic stack growth
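
For the Go side of this comparison, the runtime exposes its own GC counters, so you can check how often collections run and how long they pause in your service. A minimal sketch follows. GCReport, ReadGCReport, Churn, and ForceCycle are helper names made up for this example, and the allocation loop is an artificial stand-in for real traffic.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// GCReport summarizes collector activity since program start.
type GCReport struct {
	Cycles     uint32
	TotalPause time.Duration
}

// ReadGCReport snapshots the runtime's GC counters.
func ReadGCReport() GCReport {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return GCReport{Cycles: m.NumGC, TotalPause: time.Duration(m.PauseTotalNs)}
}

// sink keeps each buffer reachable briefly so it is allocated on the heap.
var sink []byte

// Churn allocates short-lived 64KB buffers and returns the bytes allocated.
func Churn(rounds int) int {
	total := 0
	for i := 0; i < rounds; i++ {
		sink = make([]byte, 64<<10)
		total += len(sink)
	}
	return total
}

// ForceCycle runs one full collection and blocks until it has finished.
func ForceCycle() { runtime.GC() }

func main() {
	// 100 is the default GOGC value; SetGCPercent returns the previous setting.
	prev := debug.SetGCPercent(100)
	fmt.Println("previous GC percent:", prev)

	Churn(2000) // about 128MB allocated in total, nearly all of it garbage
	ForceCycle()

	r := ReadGCReport()
	fmt.Printf("GC cycles: %d, total pause: %v\n", r.Cycles, r.TotalPause)
	if r.Cycles > 0 {
		fmt.Printf("average pause: %v\n", r.TotalPause/time.Duration(r.Cycles))
	}
}
```

Raising the GC percent trades memory for fewer collections, lowering it does the opposite, and the same counters let you observe the effect of either change.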

6.2 Real-World Memory Comparison

Scenario: Running a moderately loaded Web API service (1000 QPS, including database queries and JSON serialization)

Metric                           Go (net/http)       Rust (actix-web)
Memory after startup             18MB                3.2MB
Steady-state memory (1000 QPS)   45MB                8.1MB
Peak memory (burst 5000 QPS)     128MB               14.3MB
Memory fluctuation range         ±35MB (GC cycles)   ±0.5MB

Key Findings:

  • The Rust service's steady-state memory is about 1/5.6 of Go's (8.1MB vs 45MB)
  • Go's memory fluctuates significantly, requiring more headroom for GC "bloat"
  • In high-concurrency burst scenarios, Go's peak memory is 9x that of Rust
  • If memory is the limiting resource, the same 2GB server can therefore host roughly 5-9x as many Rust service instances

6.3 Memory Leak Risks

  • Go: goroutine leaks are the most common issue. Forgetting to close channels or goroutines blocking on signals that never arrive will cause memory to grow continuously
  • Rust: ownership and RAII free memory deterministically when values go out of scope, which rules out most accidental leaks. Leaks are still possible, though: reference cycles built with Rc/Arc are never freed (use Weak to break cycles)
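
The goroutine leak described in the Go bullet has a standard remedy: give every goroutine a way to be told to stop. The sketch below uses context cancellation. The generate and SumFirst functions are illustrative names invented for this example.

```go
package main

import (
	"context"
	"fmt"
)

// generate sends 0, 1, 2, ... on the returned channel until ctx is cancelled.
// The second channel is closed when the producer goroutine has exited.
func generate(ctx context.Context) (<-chan int, <-chan struct{}) {
	out := make(chan int)
	done := make(chan struct{})
	go func() {
		defer close(done)
		for i := 0; ; i++ {
			select {
			case out <- i:
			case <-ctx.Done():
				// Without this case the goroutine would block forever on
				// the send once the consumer stops reading.
				return
			}
		}
	}()
	return out, done
}

// SumFirst consumes the first n values, cancels the producer, and waits for
// it to exit before returning the sum.
func SumFirst(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	out, done := generate(ctx)
	sum := 0
	for i := 0; i < n; i++ {
		sum += <-out
	}
	cancel()
	<-done // the producer has terminated, so nothing is leaked
	return sum
}

func main() {
	fmt.Println(SumFirst(5)) // 0+1+2+3+4 = 10
}
```

The done channel is only there to make the shutdown observable in this example. In a real service, passing the request's context down to every goroutine it starts is usually enough.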

💡 Cost Impact: If memory is what determines your server count, a service that requires 100 Go servers might need only 15-20 after switching to Rust. For large-scale microservice architectures, that difference can translate into infrastructure savings on the order of millions per year.


7. Technology Selection Recommendations: Go or Rust?

We've gone through the performance data, but language selection can't be based on benchmarks alone. Engineering efficiency, team capabilities, and ecosystem maturity are all critical factors. Here's a practical decision-making framework.

7.1 When to Choose Go

Recommend Go when:

  • Rapidly developed Web API services: frameworks like gin/echo offer high development efficiency, great for CRUD services dominated by business logic
  • Microservice architectures: strong standard library, simple deployment (single binary), goroutines naturally suited for high-concurrency connections
  • DevOps/Cloud-native tools: Docker, Kubernetes, and Terraform are all written in Go, so the ecosystem fits naturally
  • Limited team experience: Go's learning curve is gentle; a newcomer can be productive on business code in a week
  • Need rapid iteration: compile times measured in seconds keep the edit-build-run loop short, which suits agile development

Typical Users: ByteDance (TikTok backend), Google (extensive internal services), Uber, Dropbox

7.2 When to Choose Rust

Recommend Rust when:

  • Systems demanding extreme performance: search engines, database engines, message queues, game servers
  • Resource-constrained environments: IoT devices, edge computing, embedded systems (memory can be kept to a few MB)
  • High-reliability systems: financial trading systems, blockchain nodes (compile-time safety guarantees reduce production incidents)
  • Compute-intensive tasks: image processing, audio/video codecs, cryptography, scientific computing
  • Replacing C/C++ scenarios: systems programming, driver development, high-performance networking libraries

Typical Users: Cloudflare (edge computing), Discord (migrated its Read States service from Go to Rust), Dropbox (file sync engine), Figma (multiplayer sync server)

7.3 Decision Quick Reference

Choose Go if:

  ✅ Your service is I/O-bound (heavy database queries, external API calls)
  ✅ Team lacks systems programming experience
  ✅ Need rapid shipping, where time matters more than performance
  ✅ Service runs in containers/K8s, memory is not a bottleneck

Choose Rust if:

  ✅ Your service is CPU-bound (heavy computation, encryption, compression)
  ✅ Tail latency (P99) is critical to the business
  ✅ Memory costs are a major expense, requiring extreme optimization
  ✅ Team has C/C++ background and is willing to invest learning time
  ✅ Project has a long lifecycle, worth investing in performance

7.4 Hybrid Approach: Why Choose One or the Other?

In practice, many teams adopt a Go + Rust hybrid architecture:

  • Go handles the business logic layer: Rapid development, flexible iteration
  • Rust handles performance bottleneck modules: Compute-intensive tasks written in Rust as a C ABI library, called by Go through cgo

package main

// #cgo LDFLAGS: -L./lib -lrust_engine
// #include "rust_engine.h"
import "C"

import "unsafe"

func ProcessData(input []byte) []byte {
    if len(input) == 0 {
        return nil // &input[0] would panic on an empty slice
    }
    // Call the high-performance compute module compiled from Rust
    result := C.process_data(
        (*C.char)(unsafe.Pointer(&input[0])),
        C.size_t(len(input)),
    )
    // The buffer was allocated on the Rust side, so release it through the library
    defer C.free_result(result)
    // C.GoBytes copies the data into Go-managed memory before the buffer is freed
    return C.GoBytes(unsafe.Pointer(result.data), C.int(result.len))
}

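This approach gives you Go's development speed and Rust's runtime performance, provided each cgo call does enough work to amortize the cost of crossing the language boundary. Discord made a comparable move at the service level: it rewrote its latency-sensitive Read States service from Go to Rust to get rid of GC-induced latency spikes.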


Summary: All Gaps in One Table

Dimension                 Go                       Rust                            Winner
CPU Compute               Baseline                 1.75-10.8x faster               Rust wins
JSON Serialization        Baseline                 2.2-6.5x faster                 Rust wins
HTTP RPS                  Baseline                 1.8-3.0x faster                 Rust wins
P99 Tail Latency          Baseline                 2-5x lower                      Rust wins
Concurrency Task Memory   Baseline                 4x more efficient               Rust wins
Web Service Memory        Baseline                 5-9x more efficient             Rust wins
Development Efficiency    Fast                     2-3x slower                     Go wins
Learning Curve            1 week to start          3-6 months to master            Go wins
Compile Speed             Seconds                  Minutes                         Go wins
Ecosystem Richness        Strong in cloud-native   Strong in systems programming   Each has strengths
Deployment Complexity     Simple (single binary)   Simple (single binary)          Tie

Final Recommendation:

  • Pursue development speed and team efficiency → Choose Go
  • Pursue extreme performance and resource efficiency → Choose Rust
  • Want both → Go for the business layer + Rust for performance-critical modules

There's no "best" language — only the best language for your scenario. I hope this data-driven analysis helps you make better technology decisions.


References: