Reflection & Tooling
Descriptors make schemas machine-readable. This page covers runtime reflection, compiler plugins, custom options, and validation built on top of the descriptor layer.
Reflection
Dynamic schema inspection and runtime descriptors.
Custom Plugins
Extending protoc to generate custom code and docs.
Proto Extensions
Adding fields to messages from outside their definition file.
Standard Options
Built-in metadata for generated code and runtime behavior.
Custom Options
Attaching domain metadata to schema elements.
Breakage Levels
How compatibility tools classify schema changes.
Deprecation
Safely removing fields without reusing wire IDs.
Validation Lab
Live playground for protovalidate business rules.
Schemas Describing Schemas
When you run the Protobuf compiler (protoc), it doesn't just generate code. It can also output a binary representation of your schema called a FileDescriptorSet.
Fascinatingly, this FileDescriptorSet is itself a Protobuf message! Google defines a schema (descriptor.proto) that describes how to represent .proto files. This means you can use Protobuf tools to read and analyze Protobuf schemas dynamically at runtime.
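Because a FileDescriptorSet is an ordinary Protobuf message, even a hand-written wire-format reader can inspect it. The sketch below uses only the Python standard library and assumes the field numbers from descriptor.proto noted in the comments. The input bytes are a hand-built stand-in for real protoc output, and the helper names (read_varint, length_delimited_fields, ld, summarize) are illustrative. A real tool would use a Protobuf runtime instead.

```python
# Minimal protobuf wire-format reader (standard library only).
# Field numbers assumed from descriptor.proto:
#   FileDescriptorSet.file = 1
#   FileDescriptorProto.name = 1, .package = 2, .message_type = 4
#   DescriptorProto.name = 1


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def length_delimited_fields(buf: bytes) -> list[tuple[int, bytes]]:
    """Return (field_number, payload) for every length-delimited field.

    Only wire types 0 (varint) and 2 (length-delimited) are handled,
    which is enough for this demonstration.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            _, pos = read_varint(buf, pos)  # skip varint values
        elif wire_type == 2:
            size, pos = read_varint(buf, pos)
            fields.append((number, buf[pos:pos + size]))
            pos += size
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return fields


def ld(number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (small numbers and payloads only)."""
    assert number < 16 and len(payload) < 128
    return bytes([(number << 3) | 2, len(payload)]) + payload


def summarize(descriptor_set: bytes) -> list[dict]:
    """List each file's name, package, and top-level message names."""
    summary = []
    for number, file_bytes in length_delimited_fields(descriptor_set):
        if number != 1:  # FileDescriptorSet.file
            continue
        info = {"name": "", "package": "", "messages": []}
        for f_number, payload in length_delimited_fields(file_bytes):
            if f_number == 1:
                info["name"] = payload.decode("utf-8")
            elif f_number == 2:
                info["package"] = payload.decode("utf-8")
            elif f_number == 4:  # nested DescriptorProto
                for m_number, m_payload in length_delimited_fields(payload):
                    if m_number == 1:
                        info["messages"].append(m_payload.decode("utf-8"))
        summary.append(info)
    return summary


# Hand-built stand-in for the descriptor of a file with one message.
message = ld(1, b"User")
file_proto = ld(1, b"demo.proto") + ld(2, b"demo.v1") + ld(4, message)
descriptor_set = ld(1, file_proto)
print(summarize(descriptor_set))
```

The reader needs no generated code: field numbers and wire types are enough to walk the structure, and dynamic tools rely on the same property.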
Why is this useful?
Dynamic Decoding
Tools like this web explorer use descriptors to decode arbitrary binary data without generating static code.
Validation
Complex rule engines (like protovalidate) use descriptors to apply constraints dynamically.
Code Generation
Protoc plugins (the tools that generate your code) receive these descriptors as input. This is how custom code generators are built.
Try editing the schema below to see how the generated FileDescriptorSet changes in real time.
edition = "2023";
package demo.v1;
import "buf/validate/validate.proto";
message User {
  string id = 1 [json_name = "uid"];
  string name = 2;
  uint32 age = 3 [(buf.validate.field).uint32.lt = 150];
  Role role = 4;
  enum Role {
    ROLE_UNSPECIFIED = 0;
    ROLE_USER = 1;
    ROLE_ADMIN = 2;
  }
}
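The compiler (protoc, or buf generate) does not know how to generate code for most targets, such as Go or TypeScript. protoc ships built-in generators for a few languages, including C++ and Java, but everything else is delegated: the compiler parses the .proto files and hands the resulting descriptors to a plugin.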
This architecture allows anyone to write a plugin to generate a wide range of outputs, such as documentation, client libraries, or even SQL schemas, from a Protobuf definition. For more information, see the plugin.proto file itself.
# Example: Running a custom plugin
$ protoc --plugin=protoc-gen-custom=./my-plugin \
    --custom_opt=log_level=debug,other_flag=true \
    --custom_out=./generated \
    schema.proto
The compiler starts the plugin program as a subprocess.
- stdin: The compiler passes a binary serialized CodeGeneratorRequest message.
- stdout: The plugin must return a binary serialized CodeGeneratorResponse message. The plugin must not modify the filesystem directly; it returns the files to be written in this response.
- stderr: Used strictly for logging and errors. Any logging should be disabled by default and controlled by a CLI flag to keep the output clean.
Flags & Parameters: Any options passed via --<plugin>_opt are provided to the plugin in the parameter field of the CodeGeneratorRequest as a single comma-separated string. The plugin is responsible for parsing and splitting this string.
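As a minimal sketch of that parsing step, the function below splits the parameter string from the command above into a dictionary. The function name and the choice to map bare flags to an empty string are assumptions made for illustration. Plugins are free to define their own conventions, and this version does not handle values that themselves contain commas.

```python
def parse_plugin_parameter(parameter: str) -> dict[str, str]:
    """Split a comma-separated parameter string into a key/value dict.

    A bare flag without "=" maps to an empty string.
    """
    options: dict[str, str] = {}
    for item in parameter.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty string or a trailing comma
        key, _, value = item.partition("=")
        options[key] = value
    return options


options = parse_plugin_parameter("log_level=debug,other_flag=true")
print(options)
```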
What to generate: The compiler passes many files (including dependencies), but the plugin must only generate code for the files listed in the file_to_generate field of the request.
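The selection rule can be sketched as follows. FileStub is a hypothetical stand-in for FileDescriptorProto that keeps only the name and imports, and files_to_emit is an illustrative helper. In a real plugin both lists would come from the decoded CodeGeneratorRequest.

```python
from dataclasses import dataclass, field


@dataclass
class FileStub:
    """Hypothetical stand-in for a FileDescriptorProto (name and imports only)."""
    name: str
    dependency: list[str] = field(default_factory=list)


def files_to_emit(file_to_generate: list[str],
                  proto_file: list[FileStub]) -> list[FileStub]:
    """Pick only the files the compiler asked for.

    Dependencies stay available for type lookups but produce no output.
    """
    by_name = {f.name: f for f in proto_file}
    return [by_name[name] for name in file_to_generate]


# The request carries the target file plus everything it imports.
request_files = [
    FileStub("google/protobuf/descriptor.proto"),
    FileStub("options.proto", ["google/protobuf/descriptor.proto"]),
    FileStub("service.proto", ["options.proto"]),
]
targets = files_to_emit(["service.proto"], request_files)
print([f.name for f in targets])
```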
Required Features: In the CodeGeneratorResponse, you are strongly encouraged to declare the features your plugin supports. Setting supported_features along with minimum_edition and maximum_edition is effectively required: without them, users cannot compile schemas that use Protobuf Editions with your plugin.
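The supported_features field is a bitmask, so a plugin combines feature flags with bitwise OR. The sketch below models the three response fields as a plain dictionary rather than a real CodeGeneratorResponse. The numeric constants are copied from plugin.proto and descriptor.proto and should be checked against the version you build with. The helper name and the choice of proto2 as the minimum edition are assumptions.

```python
# Values of CodeGeneratorResponse.Feature in plugin.proto.
FEATURE_PROTO3_OPTIONAL = 1
FEATURE_SUPPORTS_EDITIONS = 2

# Values of the Edition enum in descriptor.proto.
EDITION_PROTO2 = 998
EDITION_2023 = 1000


def declare_capabilities(proto3_optional: bool, editions: bool) -> dict[str, int]:
    """Build the capability fields a plugin would copy into its response."""
    features = 0
    if proto3_optional:
        features |= FEATURE_PROTO3_OPTIONAL
    if editions:
        features |= FEATURE_SUPPORTS_EDITIONS
    capabilities = {"supported_features": features}
    if editions:
        # Assumption: this plugin handles everything from proto2 to 2023.
        capabilities["minimum_edition"] = EDITION_PROTO2
        capabilities["maximum_edition"] = EDITION_2023
    return capabilities


capabilities = declare_capabilities(proto3_optional=True, editions=True)
print(capabilities)
```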
Protobuf extensions allow you to declare that a message has a range of field numbers reserved for external usage. Third parties can then define new fields for that message without modifying the original file.
How extension support differs across versions:
- proto2: Allows extensions on any message (both user-defined messages and standard options).
- proto3: Restricts extensions exclusively to option messages (specifically to define custom options; more on that later).
- Editions: Restores the ability to extend any message (bringing back general-purpose extensions) while keeping option definitions standard and native.
To use extensions in proto2 or Editions, you must define an extension range in the base message using the extensions keyword. External files can then declare fields targeting that range.
edition = "2023";
message UserProfile {
  string username = 1;
  // Declare range of tags reserved for third-party extensions
  extensions 100 to 199;
}
edition = "2023";
import "base.proto";
// Extend the custom UserProfile message directly
extend UserProfile {
  string stripe_customer_id = 100;
}
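The range declared with extensions 100 to 199 is inclusive at both ends in .proto syntax, so an extension field is only valid if its number falls inside it. A minimal sketch of that check follows. The function name and the list-of-tuples representation are illustrative assumptions, not part of any Protobuf API.

```python
def in_extension_range(number: int, ranges: list[tuple[int, int]]) -> bool:
    """True if number falls inside any declared range (both ends inclusive)."""
    return any(low <= number <= high for low, high in ranges)


# From "extensions 100 to 199;" in UserProfile.
user_profile_ranges = [(100, 199)]

print(in_extension_range(100, user_profile_ranges))  # stripe_customer_id
print(in_extension_range(200, user_profile_ranges))  # outside the range
```

The compiler performs this validation itself, so the sketch only mirrors what it enforces.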
Protobuf options control how code is generated and how data is mapped. They are grouped by the schema element they apply to, such as a file, message, field, or service.
- option go_package: Defines the Go import path.
- option java_package: Defines the Java package.
- option optimize_for = SPEED;: Generates highly optimized (but larger) code. Alternatives: CODE_SIZE, LITE_RUNTIME.
- [deprecated = true]: Marks a field as deprecated.
- [json_name = "custom"]: Sets a custom JSON key.
edition = "2023";
option go_package = "github.com/example/v1";
option java_multiple_files = true;
option optimize_for = SPEED;
message User {
  string user_id = 1 [json_name = "uid"];
  string old_field = 2 [deprecated = true];
}
You can define custom "options" (annotations) to attach metadata to your schema. Common use cases include defining data validation rules (e.g., protovalidate), field-level data classification (e.g., tagging PII), and service-level access control (e.g., defining required roles for RBAC).
These annotations are preserved in the binary descriptors, which makes them accessible to anything that processes your schema. This includes protoc plugins that generate custom code, systems that configure themselves during startup, or dynamic tools that load and inspect schemas on demand via reflection.
Under the hood, custom options are defined by using the extend keyword to target the built-in option descriptor messages (like FieldOptions or MethodOptions).
Available Scopes
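Metadata can be attached to files, messages, fields, oneofs, enums, enum values, services, and methods by extending the corresponding option message in descriptor.proto (FileOptions, MessageOptions, FieldOptions, and so on).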
For more information, see the Editions Custom Options Guide.
edition = "2023";
import "google/protobuf/descriptor.proto";
extend google.protobuf.FieldOptions {
  bool is_pii = 50001;
}
extend google.protobuf.MethodOptions {
  string required_role = 50002;
}
edition = "2023";
import "options.proto";
service UserService {
  rpc GetSensitiveData(GetRequest) returns (GetResponse) {
    option (required_role) = "ADMIN";
  }
}
message Profile {
  string ssn = 1 [(is_pii) = true];
}
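Since a custom option is an extension field on an option message, it is stored in the descriptor like any other field: a tag built from the field number, followed by the value. The sketch below encodes (is_pii) = true as it would appear inside a serialized FieldOptions message and then recovers the field number from the bytes. The helper names are illustrative, and only the varint wire type is covered.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_bool_option(field_number: int, value: bool) -> bytes:
    """Encode a bool field: tag (number << 3 | wire type 0), then 0 or 1."""
    return encode_varint(field_number << 3) + encode_varint(int(value))


IS_PII = 50001  # field number from the extend block above

payload = encode_bool_option(IS_PII, True)
tag, next_pos = decode_varint(payload)
print(payload.hex(), tag >> 3, payload[next_pos])
```

A reader that does not know the is_pii extension still sees field 50001 as an unknown field, so tools that load the extension definition later can interpret it.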
Not all breaking changes are equal. Tools like Buf categorize breaking changes into four distinct levels of severity.
- WIRE: The most severe level. This includes changing a field number or switching to an incompatible type (e.g., string to int32). Data written with the old schema is misread or rejected by readers using the new one; you should never do this.
- WIRE_JSON: Breakage in the JSON representation. Renaming a field is safe on the binary wire, but clients expecting the old JSON key will fail. You can mitigate this using the [json_name="old_name"] annotation.
- PACKAGE: Source code breakage at the package level. Changing a type in a wire-compatible way (e.g., int32 to int64) transmits safely, but when developers regenerate their code, their builds will fail until they update their types.
- FILE: The strictest level. This ensures source code compatibility down to the individual file level. Moving a message to another file might break generated code that relies on specific file imports.
edition = "2023";
package api.v1;
message User {
  string id = 1;
  int32 age = 2;
  string display_name = 3;
}
edition = "2023";
package api.v1;
message User {
  // [WIRE] breakage: type changed from string
  int32 id = 1;
  // [PACKAGE] breakage: source code type change
  int64 age = 2;
  // [WIRE_JSON] breakage: JSON key changed
  string full_name = 3;
}
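To make the comparison concrete, here is a toy classifier that reproduces the three annotations in the schema above. It is a simplified model of the level descriptions on this page, not Buf's actual rule set. The dictionaries, the one-entry compatibility table, and the function name are all assumptions for illustration.

```python
# Toy model: each schema maps field number -> (name, type).
OLD = {1: ("id", "string"), 2: ("age", "int32"), 3: ("display_name", "string")}
NEW = {1: ("id", "int32"), 2: ("age", "int64"), 3: ("full_name", "string")}

# Assumption: a deliberately tiny table of type pairs treated as
# wire-compatible. Real tools use a much more complete set of rules.
WIRE_COMPATIBLE = {frozenset({"int32", "int64"})}


def classify(old: dict, new: dict) -> list[tuple[int, str]]:
    """Return (field_number, level) for each detected change."""
    findings = []
    for number, (old_name, old_type) in sorted(old.items()):
        if number not in new:
            continue  # deletions are out of scope for this sketch
        new_name, new_type = new[number]
        if old_type != new_type:
            compatible = frozenset({old_type, new_type}) in WIRE_COMPATIBLE
            findings.append((number, "PACKAGE" if compatible else "WIRE"))
        if old_name != new_name:
            findings.append((number, "WIRE_JSON"))
    return findings


findings = classify(OLD, NEW)
print(findings)
```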
Protobuf identifies data on the wire using field numbers rather than names, so deleting a field requires careful handling. If a schema has been used in production, older clients or databases may still hold data serialized with those field numbers. You cannot simply remove a field and reuse its number without risking collisions. Instead, you must manage its lifecycle:
- Deprecate: Add [deprecated = true]. This warns developers in their IDEs (via generated code annotations like @Deprecated) not to use it for new features.
- Stop Using: Wait until metrics show zero traffic using the field.
- Reserve: Remove the field entirely and add its number/name to a reserved block. This prevents future developers from accidentally reusing the number and corrupting old data that might still be in a database.
message Product { int32 price_cents = 1; }
message Product { int32 price_cents = 1 [deprecated = true]; int64 price_micros = 2; }
message Product { reserved 1; reserved "price_cents"; int64 price_micros = 2; }
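The protection that reserved provides can be sketched as a simple check: no current field may reuse a reserved number or name. The compiler enforces this itself, so the function below only mirrors the idea. Its name, its signature, and the discount_cents field are hypothetical.

```python
def reserved_violations(reserved_numbers: set[int],
                        reserved_names: set[str],
                        fields: dict[int, str]) -> list[str]:
    """Report fields that reuse a reserved number or name."""
    problems = []
    for number, name in sorted(fields.items()):
        if number in reserved_numbers:
            problems.append(f"field '{name}' reuses reserved number {number}")
        if name in reserved_names:
            problems.append(f"field '{name}' reuses reserved name")
    return problems


# The final Product message: number 1 and "price_cents" are reserved.
ok = reserved_violations({1}, {"price_cents"}, {2: "price_micros"})

# A hypothetical later edit that tries to recycle field number 1.
bad = reserved_violations({1}, {"price_cents"},
                          {1: "discount_cents", 2: "price_micros"})
print(ok, bad)
```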
The Source of Truth
Protobuf goes beyond simple types. By using extensions, you can augment your schema with rich metadata. A powerful example is protovalidate, which allows you to embed complex business rules directly into your schema using CEL. Try modifying the JSON data below or clicking the example buttons to see the validation rules in action.
Validation Strategy
By putting validation in the schema, you ensure that every part of your system enforcing the contract applies the exact same rules. This eliminates "validation drift" not just between microservices, but across your entire stack. For instance, you can use the same rules to validate a form on your web frontend (using TypeScript) before the request ever hits your backend (running Go, Java, etc.).