Protocol Buffers scalar field types
Protocol Buffers (protobuf) encodes structured data into a compact binary format defined by a .proto schema. The building blocks are scalar field types. Choosing the right one affects message size and CPU cost because each type maps to a specific wire type and encoding strategy. This reference lists every proto3 scalar with its wire type, default value, JSON representation, and language-native mapping.
How it works
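On the wire each field is a tag, the varint-encoded value (field_number << 3) | wire_type, followed by the field's value. Proto3 scalars and embedded messages use four wire types (the group wire types 3 and 4 are deprecated):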
- VARINT (0): variable-length integers for int32, int64, uint32, uint64, sint32, sint64, bool, and enums.
- I64 (1): fixed 8 bytes for fixed64, sfixed64, double.
- LEN (2): a length prefix then bytes for string, bytes, and embedded messages.
- I32 (5): fixed 4 bytes for fixed32, sfixed32, float.
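The tag arithmetic can be sketched in a few lines of Python. The helper names make_tag and split_tag are illustrative, not part of any protobuf library, and the sketch works on the tag's integer value only. On the wire the tag is itself a varint, which fits in one byte for field numbers 1 through 15.

```python
# Wire type numbers for the four types used by proto3 scalars and messages.
VARINT, I64, LEN, I32 = 0, 1, 2, 5


def make_tag(field_number: int, wire_type: int) -> int:
    """Combine a field number and a wire type into a tag value."""
    return (field_number << 3) | wire_type


def split_tag(tag: int) -> tuple:
    """Recover (field_number, wire_type) from a tag value."""
    return tag >> 3, tag & 0x07


print(hex(make_tag(1, VARINT)))  # 0x8
print(hex(make_tag(3, LEN)))     # 0x1a
print(split_tag(0x21))           # (4, 1)
```

A decoder reads the low three bits first, because the wire type tells it how many bytes to consume even when the field number is unknown to the schema.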
Two encodings matter for signed integers. Plain int32/int64 varints are wasteful for negatives: a negative value is sign-extended to 64 bits, so its high bits are all set and it always occupies the full 10 bytes. The sint32/sint64 types apply ZigZag encoding, which maps 0, -1, 1, -2, 2 to 0, 1, 2, 3, 4, so small magnitudes stay short regardless of sign.
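The difference is easy to see with a hand-rolled encoder. This is a minimal sketch for illustration, not the protobuf library's implementation, and the function names are assumptions chosen for this example.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low 7 bits first."""
    if value < 0:
        raise ValueError("varints carry unsigned values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int64(value: int) -> bytes:
    """int32/int64 style: a negative is sent as its 64-bit two's complement."""
    return encode_varint(value & MASK64)


def zigzag64(value: int) -> int:
    """Interleave negatives and positives: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return ((value << 1) ^ (value >> 63)) & MASK64


def encode_sint64(value: int) -> bytes:
    """sint64 style: ZigZag first, then varint."""
    return encode_varint(zigzag64(value))


print(len(encode_int64(-1)))   # 10
print(len(encode_sint64(-1)))  # 1
print(encode_varint(300).hex())  # ac02
```

The right shift by 63 relies on Python's arithmetic shift, which yields -1 for negatives and 0 otherwise, matching the behavior of a signed 64-bit shift in C-like languages.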
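In proto3 every scalar has an implicit default of its zero value (0, false, or the empty string or bytes). A field without explicit presence that equals its default is omitted from the serialized bytes, so a receiver cannot distinguish "set to zero" from "never set"; a field declared optional tracks presence and is serialized whenever it is set. The canonical JSON mapping renders 64-bit integers as strings because many JSON parsers store numbers as doubles, which represent integers exactly only up to 2^53.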
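The precision concern behind the string rendering can be reproduced with the standard library alone. The field name count below is hypothetical, and the snippet imitates the JSON shape by hand rather than calling a protobuf JSON serializer.

```python
import json

# A value that fits easily in int64/uint64 but not in a double's 53-bit mantissa.
big = 2**53 + 1  # 9007199254740993

# A consumer that stores JSON numbers as IEEE 754 doubles rounds the value.
as_double = float(big)
print(int(as_double))  # 9007199254740992

# Carried as a decimal string, the value survives the round trip.
payload = json.dumps({"count": str(big)})
print(payload)  # {"count": "9007199254740993"}
assert int(json.loads(payload)["count"]) == big
```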
Tips and example
Pick types by your data’s distribution:
message Sample {
  sint64 temperature_delta = 1;  // often negative, small
  fixed32 crc = 2;               // large, fixed cost wins
  string label = 3;              // length-delimited
  double reading = 4;            // I64 wire type
}
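A common mistake is using int32 for fields that are frequently negative: every negative value costs 10 bytes, while sint32 encodes small negatives in one or two. Make that choice when you first define the field, because int32 and sint32 are not wire-compatible and changing the type later misreads existing data. Conversely, do not use fixed64 for small counters; a varint uint64 takes one byte for values below 128, against a fixed 8. Filter the table below to confirm a type’s wire type and default before committing your schema.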
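To compare candidates for a given value range before committing, a rough size calculator is enough. The sketch below counts payload bytes only (the tag is extra and identical across the three choices), and its function names are illustrative assumptions rather than a library API.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
FIXED64_SIZE = 8  # fixed64/sfixed64 always use 8 bytes


def varint_size(value: int) -> int:
    """Bytes needed to varint-encode a non-negative integer below 2**64."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def int64_size(value: int) -> int:
    """Payload size as int64: negatives become 64-bit two's complement."""
    return varint_size(value & MASK64)


def sint64_size(value: int) -> int:
    """Payload size as sint64: ZigZag-map the value, then varint-encode."""
    return varint_size(((value << 1) ^ (value >> 63)) & MASK64)


# Print value, int64 size, sint64 size, fixed64 size for a few samples.
for v in (3, -3, 1_000_000, -1_000_000, 2**60):
    print(v, int64_size(v), sint64_size(v), FIXED64_SIZE)
```

Running it over a sample of real values shows where each type wins: varints for small magnitudes, ZigZag when negatives are common, and fixed width once values routinely exceed 2^56.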