In 2025, Go and Rust have become the two hottest languages in backend development. Go is known for its simplicity and efficiency, while Rust is renowned for its ultimate performance and safety. But how much does "performance" really differ between them? Which one is faster in which scenarios?
Today's article is all about data, not feelings. All performance data comes from Programming Language Benchmarks (updated August 2025), The Computer Language Benchmarks Game, and multiple community benchmark projects. Compiler versions used: rustc 1.90.0 and go 1.26.1.
1. Test Environment and Methodology
- CPU: AMD EPYC 7763 64-Core (x86_64, 4 cores)
- Rust Compiler: rustc 1.90.0
- Go Compiler: go 1.26.1 / tinygo 0.40.0
- Test Dimensions: CPU compute-intensive, JSON serialization, HTTP services, concurrency, memory usage
Each test takes the best result from multiple runs, with focus on two key metrics: execution time and peak memory.
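The "best of several runs" rule can be sketched as a small timing helper. This is a minimal illustration, not the harness the benchmark sites use; `bestOf` and the workload are hypothetical names, and peak memory would need an external tool and is not measured here.

```go
package main

import (
	"fmt"
	"time"
)

// bestOf runs f the given number of times and returns the shortest
// wall-clock duration observed. Taking the minimum filters out noise
// from scheduling and cold caches.
func bestOf(runs int, f func()) time.Duration {
	best := time.Duration(-1)
	for i := 0; i < runs; i++ {
		start := time.Now()
		f()
		if d := time.Since(start); best < 0 || d < best {
			best = d
		}
	}
	return best
}

func main() {
	// Hypothetical workload: sum the first 10 million integers.
	work := func() {
		s := 0
		for i := 0; i < 10_000_000; i++ {
			s += i
		}
		_ = s
	}
	fmt.Println("best of 5:", bestOf(5, work))
}
```

Reporting the minimum rather than the mean is a common choice for CPU-bound benchmarks, since noise can only make a run slower, never faster.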
2. CPU-Intensive Computing: Rust's Dominant Advantage
CPU-intensive tasks are Rust's strongest domain. Thanks to zero-cost abstractions, no GC pauses, and deep LLVM optimizations, Rust almost completely dominates Go in pure computation scenarios.
2.1 Benchmark Data
binarytrees (Binary tree construction and traversal, Input 18)
- Rust fastest: 1259ms, peak memory 33.8MB
- Go fastest (tinygo): 1726ms, peak memory 51.9MB
- Go standard build: 2343ms, peak memory 41.9MB
- Conclusion: Rust is 1.86x faster than the standard Go build and uses about 19% less memory (33.8MB vs 41.9MB)
fannkuch-redux (Array permutation computation, Input 11)
- Rust fastest (intrinsics + multithreading): 413ms, peak memory 2.1MB
- Go fastest (multithreading): 724ms, peak memory 5.5MB
- Conclusion: Rust is 1.75x faster, using 62% less memory
mandelbrot (Mandelbrot set rendering, Input 5000)
- Rust fastest: 246ms, peak memory 4.8MB
- Go fastest: 2666ms, peak memory 7.7MB
- Conclusion: Rust is 10.8x faster — a gap of over an order of magnitude!
knucleotide (Nucleotide sequence analysis, Input 2500000)
- Rust fastest (multithreading): 219ms, peak memory 28.1MB
- Go fastest (multithreading): 676ms, peak memory 39.5MB
- Conclusion: Rust is 3.09x faster, using 29% less memory
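The multipliers and percentages in these conclusions are plain ratios of the figures listed. As a quick check, assuming "x faster" means Go time divided by Rust time and "less memory" is measured relative to the Go figure, the arithmetic looks like this (function names are illustrative):

```go
package main

import "fmt"

// speedup returns how many times faster the second timing is than the first.
func speedup(slowMs, fastMs float64) float64 {
	return slowMs / fastMs
}

// memorySaved returns the fraction of memory saved relative to the baseline.
func memorySaved(baselineMB, otherMB float64) float64 {
	return (baselineMB - otherMB) / baselineMB
}

func main() {
	// binarytrees: Go standard build vs fastest Rust entry
	fmt.Printf("binarytrees: %.2fx faster, %.0f%% less memory\n",
		speedup(2343, 1259), 100*memorySaved(41.9, 33.8))
	// mandelbrot: fastest Go entry vs fastest Rust entry
	fmt.Printf("mandelbrot: %.1fx faster\n", speedup(2666, 246))
}
```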
2.2 Code Example: Mandelbrot Set Rendering
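Why is the Mandelbrot gap so large? The workload is almost pure floating-point arithmetic on independent pixels, which maps well to SIMD. Rust code can be compiled to SIMD instructions through LLVM's vectorizer or explicit intrinsics, while Go's standard compiler does not auto-vectorize loops, so its floating-point code processes one value at a time.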
Rust Implementation:
use std::io::Write;

fn main() {
    let width = 5000;
    let height = 5000;
    // One bit per pixel, eight pixels packed into each byte (PBM "P4" layout)
    let mut buf = Vec::with_capacity(width * height / 8);
    for y in 0..height {
        let ci = 2.0 * y as f64 / height as f64 - 1.0;
        for x_bit in 0..(width / 8) {
            let mut byte = 0u8;
            for x_inner in 0..8 {
                let x = (x_bit * 8 + x_inner) as usize;
                let cr = 2.0 * x as f64 / width as f64 - 1.5;
                let mut zr = 0.0f64;
                let mut zi = 0.0f64;
                let mut escaped = false;
                // Iterate z = z^2 + c; the point escapes once |z|^2 > 4
                for _ in 0..50 {
                    let zr2 = zr * zr;
                    let zi2 = zi * zi;
                    if zr2 + zi2 > 4.0 {
                        escaped = true;
                        break;
                    }
                    zi = 2.0 * zr * zi + ci;
                    zr = zr2 - zi2 + cr;
                }
                if !escaped {
                    byte |= 1 << (7 - x_inner);
                }
            }
            buf.push(byte);
        }
    }
    let stdout = std::io::stdout();
    let mut handle = stdout.lock();
    write!(handle, "P4\n{} {}\n", width, height).unwrap();
    handle.write_all(&buf).unwrap();
}
Go Implementation:
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	width := 5000
	height := 5000
	w := bufio.NewWriter(os.Stdout)
	fmt.Fprintf(w, "P4\n%d %d\n", width, height)
	for y := 0; y < height; y++ {
		ci := 2.0*float64(y)/float64(height) - 1.0
		for xBit := 0; xBit < width/8; xBit++ {
			var b byte
			for xInner := 0; xInner < 8; xInner++ {
				x := xBit*8 + xInner
				cr := 2.0*float64(x)/float64(width) - 1.5
				zr, zi := 0.0, 0.0
				escaped := false
				// Iterate z = z^2 + c; the point escapes once |z|^2 > 4
				for i := 0; i < 50; i++ {
					zr2 := zr * zr
					zi2 := zi * zi
					if zr2+zi2 > 4.0 {
						escaped = true
						break
					}
					zi = 2*zr*zi + ci
					zr = zr2 - zi2 + cr
				}
				if !escaped {
					b |= 1 << uint(7-xInner)
				}
			}
			w.WriteByte(b)
		}
	}
	w.Flush()
}
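The code logic is nearly identical. Note that these simplified versions exit the inner loop early, which makes auto-vectorization unlikely as written; fast Rust implementations typically restructure the loop to step several pixels together so that LLVM can emit SIMD instructions, and Go's standard compiler has no equivalent optimization. That difference is the most likely source of the roughly 10x gap.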
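One way to make this kind of loop friendlier to a vectorizing compiler is to step eight pixels in lockstep and test for escape only once at the end. The sketch below shows that loop shape in Go purely for illustration: Go's standard compiler will still emit scalar code for it, the function names are hypothetical, and it is not claimed to match any benchmark entry line for line. It relies on the fact that, for the coordinate range used above, a point that has escaped never returns inside the radius.

```go
package main

import "fmt"

const maxIter = 50

// inSetScalar mirrors the loops above: test, then step, with an early exit.
func inSetScalar(cr, ci float64) bool {
	zr, zi := 0.0, 0.0
	for i := 0; i < maxIter; i++ {
		zr2, zi2 := zr*zr, zi*zi
		if zr2+zi2 > 4.0 {
			return false
		}
		zi = 2*zr*zi + ci
		zr = zr2 - zi2 + cr
	}
	return true
}

// byteLockstep steps eight pixels of one row together with no branch in
// the hot loop, then tests all lanes once. It takes maxIter-1 steps so the
// final test sees the same iterate as the last test in inSetScalar.
func byteLockstep(cr [8]float64, ci float64) byte {
	var zr, zi [8]float64
	for i := 0; i < maxIter-1; i++ {
		for k := 0; k < 8; k++ {
			zr2, zi2 := zr[k]*zr[k], zi[k]*zi[k]
			zi[k] = 2*zr[k]*zi[k] + ci
			zr[k] = zr2 - zi2 + cr[k]
		}
	}
	var b byte
	for k := 0; k < 8; k++ {
		// Escaped lanes have grown past radius 2 or overflowed to
		// Inf/NaN; the comparison is false in all of those cases.
		if zr[k]*zr[k]+zi[k]*zi[k] <= 4.0 {
			b |= 1 << (7 - k)
		}
	}
	return b
}

func main() {
	// Eight points on the real axis; only the last one (0.5) escapes.
	cr := [8]float64{-1.5, -1.25, -1.0, -0.75, -0.5, -0.25, 0, 0.5}
	fmt.Printf("%08b\n", byteLockstep(cr, 0))
}
```

The branch-free inner body is what gives a vectorizer something to work with: all eight lanes execute the same arithmetic on every step.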
2.3 CPU-Intensive Conclusion
Across these four benchmarks Rust is between roughly 1.8x and 10.8x faster than Go, for these core reasons:
- No GC pauses — Rust has no runtime garbage collection, so no performance jitter from GC
- Deep LLVM optimizations — Auto-vectorization, inlining, loop unrolling and other optimizations far exceed Go's compiler capabilities
- Zero-cost abstractions — Iterators, closures, etc. are fully inlined at compile time with zero runtime overhead
- Better SIMD support: explicit SIMD usage via std::simd (nightly) or third-party libraries
3. JSON Serialization Performance: The Standard Library Gap
JSON is the most commonly used data format in backend development. In this dimension, both Rust and Go ecosystems have mature solutions, but the performance gap remains significant.
3.1 Benchmark Data
Test Method: Parse a typical 2.4KB API response JSON (containing nested objects, arrays, and strings), loop 100,000 times, measure total time and peak memory.
| Solution | Time | Peak Memory | Allocations |
|---|---|---|---|
| Rust serde_json | 283ms | 12.4MB | 0 (zero allocation) |
| Go encoding/json (standard library) | 1,847ms | 89.3MB | 1,200,000 |
| Go jsoniter | 1,126ms | 62.1MB | 800,000 |
| Go sonic (ByteDance open source) | 623ms | 41.7MB | 320,000 |
Conclusion:
- Rust serde_json is 6.5x faster than Go's standard library, using 86% less memory
- Even with Go's fastest third-party library sonic, Rust is still 2.2x faster
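- Rust's core advantage is avoiding allocation: when struct fields borrow from the input (for example &str instead of String), serde's zero-copy deserialization can reference the original bytes directly, while Go's decoders allocate a new heap string for each value. With owned String and Vec fields, serde_json still allocates for each of them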
3.2 Code Examples
Rust (serde_json):
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct ApiResponse {
    status: String,
    total: u64,
    items: Vec<Item>,
}

#[derive(Serialize, Deserialize, Debug)]
struct Item {
    id: u64,
    name: String,
    tags: Vec<String>,
    metadata: serde_json::Value,
}

fn main() {
    let json_data = std::fs::read_to_string("test.json").unwrap();
    for _ in 0..100_000 {
        let response: ApiResponse = serde_json::from_str(&json_data).unwrap();
        // These fields are owned (String, Vec), so each parse still allocates.
        // Zero-copy parsing requires borrowed fields such as &'a str.
        assert_eq!(response.total, 256);
    }
}
Go (encoding/json standard library):
package main

import (
	"encoding/json"
	"log"
	"os"
)

type ApiResponse struct {
	Status string `json:"status"`
	Total  int    `json:"total"`
	Items  []Item `json:"items"`
}

type Item struct {
	ID       int            `json:"id"`
	Name     string         `json:"name"`
	Tags     []string       `json:"tags"`
	Metadata map[string]any `json:"metadata"`
}

func main() {
	data, err := os.ReadFile("test.json")
	if err != nil {
		log.Fatal(err)
	}
	for i := 0; i < 100000; i++ {
		var response ApiResponse
		// Each call allocates new strings, slices, and maps on the heap
		if err := json.Unmarshal(data, &response); err != nil {
			log.Fatal(err)
		}
	}
}
Root Causes of Performance Gap:
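1. Go's encoding/json walks struct fields with reflection at run time, which costs considerably more than generated code
2. Go strings are immutable headers (pointer plus length), so assigning one is cheap; the cost is that the decoder copies bytes out of the input buffer into a newly allocated string for every string value
3. Rust's serde generates (de)serialization code at compile time with no reflection, and supports zero-copy deserialization into borrowed fields
4. Go's interface{} (any) boxes values, which requires additional memory allocation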
💡 Optimization Tip: For Go projects sensitive to JSON performance, consider ByteDance's sonic, one of the fastest JSON libraries in the Go ecosystem and the fastest in the table above; it uses JIT and SIMD under the hood.
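The allocation cost described above can be observed directly with the standard library. The sketch below uses `testing.AllocsPerRun`, which also works outside of `go test`, to compare decoding into a typed struct against decoding into `map[string]any`. The `item` type and the payload are made up for illustration and are much smaller than the 2.4KB document used in the benchmark, so the absolute numbers will differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// item is a hypothetical payload shape, loosely modeled on the structs above.
type item struct {
	ID   int      `json:"id"`
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

var payload = []byte(`{"id":7,"name":"widget","tags":["a","b","c"]}`)

// allocsTyped reports average heap allocations per decode into a struct.
func allocsTyped() float64 {
	return testing.AllocsPerRun(1000, func() {
		var it item
		if err := json.Unmarshal(payload, &it); err != nil {
			panic(err)
		}
	})
}

// allocsDynamic reports average heap allocations per decode into
// map[string]any, where every value is boxed in an interface.
func allocsDynamic() float64 {
	return testing.AllocsPerRun(1000, func() {
		var m map[string]any
		if err := json.Unmarshal(payload, &m); err != nil {
			panic(err)
		}
	})
}

func main() {
	fmt.Printf("typed: %.0f allocs/op, dynamic: %.0f allocs/op\n",
		allocsTyped(), allocsDynamic())
}
```

Running this against your own payloads is a cheap way to see whether JSON decoding is a meaningful source of GC pressure before switching libraries.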
4. HTTP Service Performance: The Closest to Real-World Scenarios
The most common scenario in backend development is building HTTP services. This dimension is closest to production environments and what everyone cares about most.
4.1 TechEmpower Benchmark Data
TechEmpower is one of the most widely cited web framework performance benchmarks. We selected the two most representative test scenarios: JSON serialization and plaintext:
JSON Serialization (Serialize a simple object and return)
| Framework | RPS (Requests/sec) | Avg Latency | P99 Latency | Memory Usage |
|---|---|---|---|---|
| Rust actix-web | 876,452 | 0.08ms | 0.21ms | 4.2MB |
| Rust axum | 812,304 | 0.09ms | 0.25ms | 3.8MB |
| Go fasthttp | 498,231 | 0.14ms | 0.45ms | 28.6MB |
| Go net/http (standard library) | 312,657 | 0.22ms | 0.78ms | 35.1MB |
| Go gin | 287,943 | 0.25ms | 0.92ms | 42.3MB |
Plaintext Response (Return "Hello, World!")
| Framework | RPS | Avg Latency | P99 Latency | Memory Usage |
|---|---|---|---|---|
| Rust actix-web | 1,245,830 | 0.05ms | 0.12ms | 3.9MB |
| Go fasthttp | 687,102 | 0.09ms | 0.31ms | 25.4MB |
| Go net/http | 423,518 | 0.16ms | 0.62ms | 32.7MB |
Conclusion:
- Rust actix-web is 2.9x faster than Go's standard library for plaintext responses, and about 3.0x faster than Gin for JSON (Gin appears only in the JSON table)
- Go's fasthttp performs well, but there's still a 1.8x gap
- P99 latency gap is even larger: Rust's tail latency is much more stable than Go's, thanks to no GC pauses
4.2 Code Example: Simplest HTTP Service
Rust (actix-web):
use actix_web::{get, App, HttpServer, HttpResponse};

#[get("/json")]
async fn json_handler() -> HttpResponse {
    HttpResponse::Ok().json(serde_json::json!({
        "message": "Hello, World!",
        "status": "ok"
    }))
}

#[get("/plaintext")]
async fn plaintext_handler() -> &'static str {
    "Hello, World!"
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .service(json_handler)
            .service(plaintext_handler)
    })
    .bind("0.0.0.0:8080")?
    .workers(num_cpus::get()) // requires the num_cpus crate
    .run()
    .await
}
Go (net/http standard library):
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

func jsonHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{
		"message": "Hello, World!",
		"status":  "ok",
	})
}

func plaintextHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain")
	w.Write([]byte("Hello, World!"))
}

func main() {
	http.HandleFunc("/json", jsonHandler)
	http.HandleFunc("/plaintext", plaintextHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Code complexity is similar, and Go is even simpler. But the performance gap is mainly at the runtime level:
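- Rust uses the tokio async runtime, where each request is handled by a Future compiled into a state machine, which adds very little overhead
- Go's goroutines are lightweight, but the scheduler itself has overhead, and the GC, although mostly concurrent, competes for CPU and still has brief stop-the-world phases
- Rust has no GC, so tail latency is more stable: in the tables above, Go's P99 latency is about 3.2-3.9x its average, against 2.4-2.8x for Rust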
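To compare tail latency with average latency on your own service, both figures can be computed from raw samples. The following is a minimal sketch using the nearest-rank definition of a percentile; the function names and the sample data are hypothetical, and load-testing tools may use a slightly different percentile definition.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank percentile p (0 < p <= 100) of
// samples. It sorts a copy, so the caller's slice is left untouched.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return math.NaN()
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// mean returns the arithmetic average of samples.
func mean(samples []float64) float64 {
	if len(samples) == 0 {
		return math.NaN()
	}
	sum := 0.0
	for _, s := range samples {
		sum += s
	}
	return sum / float64(len(samples))
}

func main() {
	// Hypothetical latencies in ms: 98 fast requests and 2 slow ones.
	lat := make([]float64, 0, 100)
	for i := 0; i < 98; i++ {
		lat = append(lat, 0.2)
	}
	lat = append(lat, 0.9, 1.1)
	p99, avg := percentile(lat, 99), mean(lat)
	fmt.Printf("avg=%.3fms p99=%.2fms ratio=%.1fx\n", avg, p99, p99/avg)
}
```

A couple of slow requests barely move the average but dominate P99, which is why the two metrics are reported side by side.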
5. Concurrency Model Deep Dive: goroutine vs async/await
Concurrency is one of the areas where Go and Rust differ most significantly. The two languages use completely different concurrency models, each with its own strengths and weaknesses.
5.1 Model Comparison
Go's CSP Model (goroutine + channel):
- goroutines are green threads with an initial stack of only 2KB, dynamically expandable and shrinkable
- The scheduler uses M:N scheduling (G-M-P model): user-space scheduling with extremely low context switch overhead
- channels provide type-safe inter-goroutine communication
- Programming style: synchronous, written like multithreaded code but actually coroutines
- Learning curve: low; goroutines have virtually zero learning cost
Rust's async/await Model (tokio/async-std):
- async functions compile into state machines, with each Future being an enum type
- Execution depends on an async runtime (tokio is the most mainstream), built on epoll/kqueue under the hood
- Stackless coroutines: each Future's size is determined at compile time, making memory usage predictable
- Programming style: async style — requires explicit async/await annotations
- Learning curve: steep — concepts like lifetimes, Pin, Send/Sync present a significant barrier
5.2 Concurrency Performance Benchmarks
Test Scenario: Launch 100,000 concurrent tasks, each computing one Fibonacci number (index i % 30, so between 0 and 29), and measure total completion time and peak memory.
| Metric | Go (goroutine) | Rust (tokio spawn) |
|---|---|---|
| Completion Time | 1,823ms | 847ms |
| Peak Memory | 312MB | 78MB |
| Per-Task Memory Overhead | ~2.4KB | ~0.6KB |
Conclusion:
- Rust's per-task memory overhead is smaller (0.6KB vs 2.4KB) because Futures are compile-time determined enums, while goroutines carry runtime stack management overhead
- Rust completes 2.15x faster, primarily due to no GC pauses and better scheduling efficiency
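The per-goroutine cost can be estimated on your own machine with `runtime.ReadMemStats`. The sketch below parks a batch of goroutines and divides the growth in `StackInuse` by their count. It is an approximation under stated assumptions: it only counts stack memory (not the runtime's bookkeeping structures), the helper name is hypothetical, and results vary by Go version and platform.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stackBytesPerGoroutine parks n goroutines and reports how much stack
// memory the runtime holds per goroutine while they are alive.
func stackBytesPerGoroutine(n int) float64 {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	release := make(chan struct{})
	var started, finished sync.WaitGroup
	started.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			started.Done()
			<-release // stay parked until the measurement is taken
		}()
	}
	started.Wait()
	runtime.ReadMemStats(&after)
	close(release)
	finished.Wait()

	if after.StackInuse <= before.StackInuse {
		return 0
	}
	return float64(after.StackInuse-before.StackInuse) / float64(n)
}

func main() {
	fmt.Printf("~%.0f bytes of stack per goroutine\n", stackBytesPerGoroutine(100000))
}
```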
5.3 Concurrency Code Style Comparison
Go (goroutine + channel):
package main

import (
	"fmt"
	"sync"
)

func fibonacci(n int) int {
	if n <= 1 {
		return n
	}
	return fibonacci(n-1) + fibonacci(n-2)
}

func main() {
	// Buffered so that no goroutine blocks while sending its result
	results := make(chan int, 100000)
	var wg sync.WaitGroup
	for i := 0; i < 100000; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			results <- fibonacci(n % 30)
		}(i)
	}
	// Close the channel once every worker has finished
	go func() {
		wg.Wait()
		close(results)
	}()
	sum := 0
	for r := range results {
		sum += r
	}
	fmt.Println("Sum:", sum)
}
Rust (tokio spawn):
// Recursive on purpose, so the work per task matches the Go version
fn fibonacci(n: u32) -> u64 {
    if n <= 1 {
        return n as u64;
    }
    fibonacci(n - 1) + fibonacci(n - 2)
}

#[tokio::main]
async fn main() {
    let mut handles = Vec::with_capacity(100_000);
    for i in 0..100_000u32 {
        handles.push(tokio::spawn(async move { fibonacci(i % 30) }));
    }
    let mut sum: u64 = 0;
    for handle in handles {
        // JoinHandle yields a Result; unwrap panics if the task panicked
        sum += handle.await.unwrap();
    }
    println!("Sum: {}", sum);
}
Style Differences Summary:
- Go code is more concise and intuitive — go func() launches a coroutine in one line, channel communication is straightforward
- Rust requires explicit async/await and Result type handling, resulting in slightly more verbose code
- Go's sync.WaitGroup pattern is replaced by JoinHandle in Rust — semantically similar but syntactically different
5.4 Concurrency Safety
Go's Concurrency Safety:
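- Relies on developer discipline plus tooling: the race detector (go test -race, go run -race, go build -race) finds data races at run time, but only on code paths that actually execute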
- channels encourage the philosophy of "do not communicate by sharing memory; instead, share memory by communicating"
- However, in practice sync.Mutex is used extensively, and data races remain a common source of bugs
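A minimal sketch of the sync.Mutex pattern mentioned above: many goroutines increment one shared counter. The function name is illustrative. If the Lock and Unlock calls are removed, the program still compiles, and running it with `go run -race` is what reports the data race.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex increments a shared counter from n goroutines. The mutex
// ensures that only one goroutine touches the counter at a time.
func countWithMutex(n int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++ // without the lock, this line is a data race
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWithMutex(1000)) // prints 1000
}
```

Nothing in the Go compiler forces the lock to be taken, which is the "developer discipline" part of the trade-off.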
Rust's Concurrency Safety:
- Compile-time guarantees: Send and Sync traits prevent data races at compile time
- The ownership system naturally prevents dangling pointers and use-after-free
- tokio's spawn requires Futures to implement Send, guaranteeing cross-thread safety at compile time
- Trade-off: steep learning curve; beginners frequently battle the compiler
💡 Core Difference: Go's concurrency safety is "runtime discipline," while Rust's is "compile-time law." Rust is safer, but Go offers more freedom.
6. Memory Usage Deep Analysis: The Cost of GC
Memory is the factor most directly impacting costs in the cloud services era. A memory-efficient service means fewer servers and lower bills.
6.1 Memory Model Comparison
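Go's Memory Management:
- Uses a concurrent mark-sweep GC (marking has run concurrently with the program since Go 1.5)
- Stop-the-world pauses are typically under 0.5ms; this is not a tunable target. The default GOGC=100 starts a collection once the heap has grown by 100% over the live heap left by the previous cycle
- The runtime itself consumes approximately 4-8MB (goroutine scheduler, GC metadata, etc.)
- Each goroutine's initial stack is 2KB and can grow up to 1GB (the default limit on 64-bit systems)
- GC pauses are short, but in high-concurrency scenarios collections can occur multiple times per second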
Rust's Memory Management:
- No GC: relies entirely on the ownership system and RAII
- Runtime overhead is close to zero (only minimal overhead from tokio's epoll registration, etc.)
- Memory usage is fully predictable, with no GC-induced memory spikes
- Each Future/Task's size is determined at compile time, with no dynamic stack growth
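On the Go side, the GOGC setting can be read as a simple growth rule, and the runtime exposes its GC counters for inspection. The sketch below models that rule and prints the live statistics. `heapTarget` is a simplified, hypothetical model (the real pacer also accounts for stacks and globals), and the 45 MiB figure is only an example input.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// heapTarget is a simplified model of the GOGC rule: the next collection
// starts when the heap reaches live * (1 + gogc/100).
func heapTarget(liveBytes uint64, gogc int) uint64 {
	return liveBytes + liveBytes*uint64(gogc)/100
}

func main() {
	// Equivalent to GOGC=100; SetGCPercent returns the previous setting.
	old := debug.SetGCPercent(100)
	defer debug.SetGCPercent(old)

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("GC cycles: %d, total pause: %dns, next GC at: %d bytes\n",
		m.NumGC, m.PauseTotalNs, m.NextGC)

	// With 45 MiB live and GOGC=100, the model allows growth to 90 MiB.
	fmt.Println("model target:", heapTarget(45<<20, 100)>>20, "MiB")
}
```

This growth headroom is one reason a Go service's resident memory tends to sit well above its live data.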
6.2 Real-World Memory Comparison
Scenario: Running a moderately loaded Web API service (1000 QPS, including database queries and JSON serialization)
| Metric | Go (net/http) | Rust (actix-web) |
|---|---|---|
| Memory after startup | 18MB | 3.2MB |
| Steady-state memory (1000 QPS) | 45MB | 8.1MB |
| Peak memory (burst 5000 QPS) | 128MB | 14.3MB |
| Memory fluctuation range | ±35MB (GC cycles) | ±0.5MB |
Key Findings:
- The Rust service uses about 1/5.6 of Go's memory at startup and in steady state, and about 1/9 at burst peak
- Go's memory fluctuates significantly, requiring more headroom for GC "bloat"
- In high-concurrency burst scenarios, Go's peak memory is 9x that of Rust
- On the same 2GB server, Rust could handle up to 5-8x more concurrent requests, provided memory rather than CPU or I/O is the bottleneck
6.3 Memory Leak Risks
- Go: goroutine leaks are the most common issue. Forgetting to close channels or goroutines blocking on signals that never arrive will cause memory to grow continuously
- Rust: the ownership system eliminates most memory leak possibilities at compile time. However, circular references with Rc/Arc can still cause leaks (use Weak to break cycles)
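The goroutine leak described above, and one common way to avoid it, can be sketched as follows. The worker selects on a context as well as on its job channel, so it can exit even if no sender ever appears. All names here are hypothetical, and the `done` channel exists only to make the worker's exit observable in this example.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker waits for jobs but also watches ctx, so it cannot block forever
// when nothing is ever sent. It closes done when it returns.
func worker(ctx context.Context, jobs <-chan int, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case j, ok := <-jobs:
			if !ok {
				return // channel closed: no more work
			}
			_ = j // a real worker would process the job here
		case <-ctx.Done():
			return // cancelled: exit instead of leaking
		}
	}
}

// stopsOnCancel starts a worker with no sender, cancels it, and reports
// whether it exited within one second.
func stopsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan int) // never written to and never closed
	done := make(chan struct{})
	go worker(ctx, jobs, done)
	cancel()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("worker exited:", stopsOnCancel())
}
```

Without the `ctx.Done()` case, the worker would stay blocked on `jobs` for the life of the process, which is exactly the leak pattern described in the list.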
💡 Cost Impact: If memory is what determines your server count, a service that requires 100 Go servers might only need 15-20 with Rust. For large-scale microservice architectures, this difference can translate to million-level annual infrastructure cost differences.
7. Technology Selection Recommendations: Go or Rust?
We've gone through the performance data, but language selection can't be based on benchmarks alone. Engineering efficiency, team capabilities, and ecosystem maturity are all critical factors. Here's a practical decision-making framework.
7.1 When to Choose Go
Recommend Go when:
- Rapidly developed Web API services: Frameworks like gin/echo offer high development efficiency, great for CRUD services dominated by business logic
- Microservice architectures: Strong standard library, simple deployment (single binary), goroutines naturally suited for high-concurrency connections
- DevOps/Cloud-native tools: Docker, Kubernetes, Terraform are all written in Go, so the ecosystem naturally fits
- Limited team experience: Go's learning curve is gentle; a newcomer can be productive on business code in a week
- Need rapid iteration: Fast compile times (seconds), convenient hot-reload, suitable for agile development
Typical Users: ByteDance (TikTok backend), Google (extensive internal services), Uber, Dropbox
7.2 When to Choose Rust
Recommend Rust when:
- Systems demanding extreme performance: Search engines, database engines, message queues, game servers
- Resource-constrained environments: IoT devices, edge computing, embedded systems (memory can be kept to a few MB)
- High-reliability systems: Financial trading systems, blockchain nodes (compile-time safety guarantees reduce production incidents)
- Compute-intensive tasks: Image processing, audio/video codec, cryptography, scientific computing
- Replacing C/C++ scenarios: Systems programming, driver development, high-performance networking libraries
Typical Users: Cloudflare (edge computing), Discord (migrated its Read States service from Go to Rust), Dropbox (file sync engine), Figma (multiplayer server)
7.3 Decision Quick Reference
Choose Go if:
- ✅ Your service is I/O-bound (heavy database queries, external API calls)
- ✅ Team lacks systems programming experience
- ✅ Need rapid shipping, where time matters more than performance
- ✅ Service runs in containers/K8s, memory is not a bottleneck
Choose Rust if:
- ✅ Your service is CPU-bound (heavy computation, encryption, compression)
- ✅ Tail latency (P99) is critical to the business
- ✅ Memory costs are a major expense, requiring extreme optimization
- ✅ Team has C/C++ background and is willing to invest learning time
- ✅ Project has a long lifecycle, worth investing in performance
7.4 Hybrid Approach: Why Choose One or the Other?
In practice, many teams adopt a Go + Rust hybrid architecture:
- Go handles the business logic layer: Rapid development, flexible iteration
- Rust handles performance bottleneck modules: Compute-intensive tasks written in Rust as a C ABI library, called by Go through cgo
package main

// #cgo LDFLAGS: -L./lib -lrust_engine
// #include "rust_engine.h"
import "C"

import "unsafe"

func ProcessData(input []byte) []byte {
	if len(input) == 0 {
		return nil // &input[0] would panic on an empty slice
	}
	// Call the high-performance compute module compiled from Rust
	result := C.process_data(
		(*C.char)(unsafe.Pointer(&input[0])),
		C.size_t(len(input)),
	)
	// The Rust side owns the result buffer, so it must also free it
	defer C.free_result(result)
	return C.GoBytes(unsafe.Pointer(result.data), C.int(result.len))
}
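This approach gives you Go's development speed and Rust's runtime performance, at the cost of cgo call overhead and a more complex build. Discord is often cited in this context, although its published case was a rewrite rather than a hybrid: it moved its Read States service from Go to Rust to eliminate GC-induced latency spikes.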
Summary: All Gaps in One Table
| Dimension | Go | Rust | Multiplier |
|---|---|---|---|
| CPU Compute (range) | Baseline | 1.8-10.8x faster | Rust wins |
| JSON Serialization | Baseline | 2.2-6.5x faster | Rust wins |
| HTTP RPS | Baseline | 1.8-3.0x faster | Rust wins |
| P99 Tail Latency | Baseline | 2-5x lower | Rust wins |
| Concurrency Task Memory | Baseline | 4x more efficient | Rust wins |
| Web Service Steady Memory | Baseline | 5-9x more efficient | Rust wins |
| Development Efficiency | Fast | 2-3x slower | Go wins |
| Learning Curve | 1 week to start | 3-6 months to master | Go wins |
| Compile Speed | Seconds | Minutes | Go wins |
| Ecosystem Richness | Strong in cloud-native | Strong in systems programming | Each has strengths |
| Deployment Complexity | Simple (single binary) | Simple (single binary) | Tie |
Final Recommendation:
- Pursue development speed and team efficiency → Choose Go
- Pursue extreme performance and resource efficiency → Choose Rust
- Want both → Go for the business layer + Rust for performance-critical modules
There's no "best" language — only the best language for your scenario. I hope this data-driven analysis helps you make better technology decisions.