Serialization at Scale: Techniques for High-Throughput Systems

In the world of high-performance systems, data serialization is a critical yet often overlooked aspect. Whether you’re building a real-time multiplayer game, a video streaming service, or a distributed database, how you serialize and deserialize data can make or break your system’s performance. Today, we’ll explore this concept through the lens of a real-time multiplayer game, where every millisecond counts.

The Challenge: Real-Time Multiplayer Games

Imagine a fast-paced multiplayer game where player states—position, health, actions, and more—are updated 60 times per second. These updates must be broadcast to all connected clients in near real-time. To handle this, the server often uses an asynchronous architecture, buffering updates in a queue before sending them to clients.

But here’s the catch: if the server uses inefficient serialization methods, it can lead to latency spikes, high CPU usage, and memory bloat. Let’s break down the problem and explore a solution.

The Problem: Inefficient Serialization

Traditional Approach (JSON)

In a typical setup, the server might process player updates like this:


Deserialize the incoming request metadata for player and game validation
Serialize the player state to JSON bytes for temporary storage in a queue
Deserialize the JSON back to a struct when reading from the queue
Process the data (e.g., validate positions, apply game logic)
Reserialize the processed data to JSON for broadcasting to clients

Issues with This Approach

4 conversions: Deserialize → Serialize → Deserialize → Reserialize
JSON overhead: Field names like "position" are repeated in every packet, wasting bandwidth
High CPU usage: Parsing JSON and creating temporary objects strains the CPU
Garbage collection pressure: Frequent allocations and deallocations slow down the system
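
To make the overhead concrete, here is a minimal sketch of the four-conversion flow using Go's encoding/json. The playerState type, its field names, and the jsonPipeline function are assumptions made up for illustration; they are not taken from a real game server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// playerState is a hypothetical JSON-facing version of a player update.
type playerState struct {
	PlayerID uint32   `json:"player_id"`
	Position [3]int16 `json:"position"`
	Health   uint8    `json:"health"`
	Action   uint8    `json:"action"`
}

// jsonPipeline mimics the traditional flow: decode the request, re-encode it
// for the queue, decode it again for processing, then encode it for broadcast.
func jsonPipeline(request []byte) ([]byte, error) {
	var incoming playerState
	if err := json.Unmarshal(request, &incoming); err != nil { // conversion 1
		return nil, err
	}
	queued, err := json.Marshal(incoming) // conversion 2
	if err != nil {
		return nil, err
	}
	var dequeued playerState
	if err := json.Unmarshal(queued, &dequeued); err != nil { // conversion 3
		return nil, err
	}
	// Game logic would run here on the dequeued struct.
	return json.Marshal(dequeued) // conversion 4
}

func main() {
	request := []byte(`{"player_id":42,"position":[100,-20,7],"health":95,"action":1}`)
	out, err := jsonPipeline(request)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes: %s\n", len(out), out)
}
```

Every one of those calls walks the whole payload and allocates, and the field names travel with each update, so even this small sample is several times larger than a 12-byte fixed layout.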

The Solution: Custom Binary Serialization

To overcome these challenges, we can design a compact binary format that minimizes conversions and memory overhead. Instead of using JSON, we’ll encode player states directly into bytes, avoiding intermediate data structures.

Step 1: Define the Data Structure

Let’s start by defining the structure of a player update:

type PlayerUpdate struct {
  PlayerID  uint32    // 4 bytes
  Position  [3]int16  // 6 bytes (x, y, z as 16-bit integers)
  Health    uint8     // 1 byte
  Action    uint8     // 1 byte (e.g., 0=idle, 1=attacking)
}
// Total size: 12 bytes per update.

This structure is fixed in size, making it easy to encode and decode.

Step 2: Serialize to a Binary Format

Instead of JSON, we’ll write raw bytes with a strict layout:

[ PlayerID (4 bytes) | Position (6 bytes) | Health (1 byte) | Action (1 byte) ]

Here’s how we encode the data in Go, using the standard encoding/binary package:

func (p *PlayerUpdate) Encode() []byte {
  buf := make([]byte, 12)
  binary.BigEndian.PutUint32(buf[0:4], p.PlayerID)
  binary.BigEndian.PutUint16(buf[4:6], uint16(p.Position[0]))
  binary.BigEndian.PutUint16(buf[6:8], uint16(p.Position[1]))
  binary.BigEndian.PutUint16(buf[8:10], uint16(p.Position[2]))
  buf[10] = p.Health
  buf[11] = p.Action
  return buf
}
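
A quick way to sanity-check a layout like this is a round trip. The sketch below repeats the struct and Encode method so it runs on its own, and adds a DecodeUpdate helper that is hypothetical (it is not part of the original code). It also shows that negative coordinates survive the int16 to uint16 cast.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const updateSize = 12 // 4 + 6 + 1 + 1 bytes on the wire

type PlayerUpdate struct {
	PlayerID uint32
	Position [3]int16
	Health   uint8
	Action   uint8
}

// Encode writes the fields in the fixed big-endian layout.
func (p *PlayerUpdate) Encode() []byte {
	buf := make([]byte, updateSize)
	binary.BigEndian.PutUint32(buf[0:4], p.PlayerID)
	binary.BigEndian.PutUint16(buf[4:6], uint16(p.Position[0]))
	binary.BigEndian.PutUint16(buf[6:8], uint16(p.Position[1]))
	binary.BigEndian.PutUint16(buf[8:10], uint16(p.Position[2]))
	buf[10] = p.Health
	buf[11] = p.Action
	return buf
}

// DecodeUpdate is a hypothetical inverse of Encode, added for illustration.
func DecodeUpdate(buf []byte) (PlayerUpdate, error) {
	if len(buf) < updateSize {
		return PlayerUpdate{}, errors.New("buffer shorter than one update")
	}
	var p PlayerUpdate
	p.PlayerID = binary.BigEndian.Uint32(buf[0:4])
	for i := range p.Position {
		// Casting back to int16 restores the sign of negative coordinates.
		p.Position[i] = int16(binary.BigEndian.Uint16(buf[4+2*i:]))
	}
	p.Health = buf[10]
	p.Action = buf[11]
	return p, nil
}

func main() {
	in := PlayerUpdate{PlayerID: 7, Position: [3]int16{120, -45, 3}, Health: 80, Action: 1}
	out, err := DecodeUpdate(in.Encode())
	fmt.Println(out == in, err) // prints "true <nil>"
}
```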

Step 3: Asynchronous Flow with Memory Reuse

Instead of storing JSON in the queue, we’ll store the raw bytes. Here’s the optimized flow:

Incoming Update → Encode to Binary → Save Bytes to Queue → Read Bytes → Process → Send Bytes to Clients
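
The "Read Bytes → Process" part of this flow can work on the queued bytes in place. The following is a minimal sketch under the 12-byte layout from Step 2; the PeekPlayerID and ClampHealth helpers, and the health cap itself, are assumptions invented for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Byte offsets within the 12-byte update layout from Step 2.
const (
	offPlayerID = 0
	offHealth   = 10
	updateSize  = 12
)

// PeekPlayerID reads the ID from the first four bytes without building a
// PlayerUpdate. The second result is false if the record is too short.
func PeekPlayerID(update []byte) (uint32, bool) {
	if len(update) < updateSize {
		return 0, false
	}
	return binary.BigEndian.Uint32(update[offPlayerID : offPlayerID+4]), true
}

// ClampHealth caps the health byte in place. The cap is a made-up game rule
// used only to show a field being validated without decoding the record.
func ClampHealth(update []byte, limit uint8) bool {
	if len(update) < updateSize {
		return false
	}
	if update[offHealth] > limit {
		update[offHealth] = limit
	}
	return true
}

func main() {
	// PlayerID=42, Position=(100, -20, 7), Health=250, Action=1.
	update := []byte{0, 0, 0, 42, 0, 100, 0xFF, 0xEC, 0, 7, 250, 1}
	id, ok := PeekPlayerID(update)
	ClampHealth(update, 100)
	fmt.Println(id, ok, update[10]) // prints "42 true 100"
}
```

Because the record is modified in place, the same bytes can go straight to the send path with no struct and no second encode.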

Key Optimizations

Zero Deserialization: The server processes data directly from the byte stream (e.g., extract PlayerID without parsing the entire struct).
Memory Pool: Reuse byte buffers to avoid allocations.

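var bufferPool = sync.Pool{
  // Pool pointers to slices: putting a bare []byte in an interface allocates.
  New: func() interface{} { b := make([]byte, 12); return &b },
}

// EncodeWithPool encodes p into a pooled buffer. The caller owns the buffer
// until it calls bufferPool.Put(bp), which must happen only after the bytes
// have been sent; returning it earlier lets another goroutine overwrite them.
func EncodeWithPool(p *PlayerUpdate) *[]byte {
  bp := bufferPool.Get().(*[]byte)
  buf := *bp
  binary.BigEndian.PutUint32(buf[0:4], p.PlayerID)
  binary.BigEndian.PutUint16(buf[4:6], uint16(p.Position[0]))
  binary.BigEndian.PutUint16(buf[6:8], uint16(p.Position[1]))
  binary.BigEndian.PutUint16(buf[8:10], uint16(p.Position[2]))
  buf[10] = p.Health
  buf[11] = p.Action
  return bp
}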

Batch Updates: Encode multiple player updates into a single packet with headers:

[Header: Packet Size (4B) | Player Count (2B)] → [Player 1 (12B)] → [Player 2 (12B)] → ...
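
One possible encoder for this batch layout is sketched below. The text does not say whether Packet Size covers the header, so this sketch assumes it is the total packet length including the 6-byte header; the EncodeBatch name is also an assumption.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerSize = 6  // Packet Size (4B) + Player Count (2B)
	updateSize = 12 // one encoded PlayerUpdate
)

type PlayerUpdate struct {
	PlayerID uint32
	Position [3]int16
	Health   uint8
	Action   uint8
}

// EncodeBatch packs all updates into one packet behind a 6-byte header.
// Assumption: Packet Size is the total length in bytes, header included.
func EncodeBatch(updates []PlayerUpdate) ([]byte, error) {
	if len(updates) > 0xFFFF {
		return nil, errors.New("too many updates for a 2-byte player count")
	}
	total := headerSize + len(updates)*updateSize
	buf := make([]byte, total)
	binary.BigEndian.PutUint32(buf[0:4], uint32(total))
	binary.BigEndian.PutUint16(buf[4:6], uint16(len(updates)))
	for i, u := range updates {
		rec := buf[headerSize+i*updateSize:] // start of the i-th record
		binary.BigEndian.PutUint32(rec[0:4], u.PlayerID)
		for j, c := range u.Position {
			binary.BigEndian.PutUint16(rec[4+2*j:], uint16(c))
		}
		rec[10] = u.Health
		rec[11] = u.Action
	}
	return buf, nil
}

func main() {
	packet, err := EncodeBatch([]PlayerUpdate{
		{PlayerID: 1, Position: [3]int16{10, 20, 30}, Health: 100, Action: 0},
		{PlayerID: 2, Position: [3]int16{-5, 0, 5}, Health: 60, Action: 1},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(packet)) // prints "30": 6-byte header + 2 * 12 bytes
}
```

A single allocation and a single write per tick replace one of each per player, which is where the amortization comes from.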

Step 4: Deserialization (Client Side)

Clients receive the binary stream and decode it efficiently:

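// Requires the encoding/binary and errors packages.
func DecodeBatch(data []byte) ([]PlayerUpdate, error) {
  if len(data) < 6 {
    return nil, errors.New("batch shorter than the 6-byte header")
  }
  count := int(binary.BigEndian.Uint16(data[4:6]))
  if len(data) < 6+count*12 {
    return nil, errors.New("batch truncated")
  }
  updates := make([]PlayerUpdate, count)
  for i := 0; i < count; i++ {
    rec := data[6+i*12:] // start of the i-th 12-byte record
    updates[i].PlayerID = binary.BigEndian.Uint32(rec[0:4])
    for j := range updates[i].Position {
      updates[i].Position[j] = int16(binary.BigEndian.Uint16(rec[4+2*j:]))
    }
    updates[i].Health = rec[10]
    updates[i].Action = rec[11]
  }
  return updates, nil
}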

Benefits of Binary Serialization

Efficiency: Binary formats are more compact and faster to encode/decode than text-based formats like JSON
Lower Latency: Faster processing means updates reach clients quicker
Reduced Disk I/O: Binary data takes up less space, reducing the load on storage systems

Performance Gains

Metric	JSON Approach	Custom Binary
CPU Usage	High (parsing JSON)	Low (direct byte access)
Memory/Update	~50-100 bytes	12 bytes
Latency	10-20ms per batch	1-2ms per batch
GC Pressure	High (many allocations)	Near-zero (pooling)
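
Figures like these depend heavily on hardware, payload shape, and workload, so it is worth measuring on your own system. As one starting point, the sketch below uses testing.AllocsPerRun from Go's standard library to compare heap allocations per encode. The encodeInto and measureAllocs names are assumptions for illustration, and the sketch measures allocations only, not latency.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
	"testing"
)

type PlayerUpdate struct {
	PlayerID uint32
	Position [3]int16
	Health   uint8
	Action   uint8
}

// encodeInto writes the 12-byte layout into a caller-supplied buffer,
// so the encode itself does not need to allocate.
func encodeInto(buf []byte, p *PlayerUpdate) {
	binary.BigEndian.PutUint32(buf[0:4], p.PlayerID)
	binary.BigEndian.PutUint16(buf[4:6], uint16(p.Position[0]))
	binary.BigEndian.PutUint16(buf[6:8], uint16(p.Position[1]))
	binary.BigEndian.PutUint16(buf[8:10], uint16(p.Position[2]))
	buf[10] = p.Health
	buf[11] = p.Action
}

// measureAllocs returns the average heap allocations per encode for the
// JSON path and for the binary path with a reused buffer.
func measureAllocs() (jsonAllocs, binaryAllocs float64) {
	p := &PlayerUpdate{PlayerID: 42, Position: [3]int16{100, -20, 7}, Health: 95, Action: 1}
	buf := make([]byte, 12) // allocated once, reused for every encode

	jsonAllocs = testing.AllocsPerRun(1000, func() {
		if _, err := json.Marshal(p); err != nil {
			panic(err)
		}
	})
	binaryAllocs = testing.AllocsPerRun(1000, func() {
		encodeInto(buf, p)
	})
	return jsonAllocs, binaryAllocs
}

func main() {
	j, b := measureAllocs()
	fmt.Printf("allocs per encode: json=%.0f binary=%.0f\n", j, b)
}
```

For timing, the same two closures can be dropped into ordinary Go benchmarks and run with go test -bench.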

Key Takeaways

Avoid Intermediate Formats: Skip JSON for internal queues. Use raw bytes with a strict layout
Memory Pools: Reuse buffers to reduce allocations and GC pressure
Batch Processing: Encode multiple updates into a single packet to amortize overhead
Direct Byte Access: Extract fields without full deserialization (e.g., read PlayerID directly from bytes 0-3)

Conclusion

Efficient serialization is the backbone of high-performance systems. By moving away from text-based formats like JSON and embracing custom binary protocols, you can drastically reduce latency, CPU usage, and memory overhead. Whether you’re building a real-time game, a video streaming service, or a distributed database, these principles will help you scale efficiently.