GoMicroservices
June 13, 2026

Building High-Performance Internal APIs with gRPC in Go

When services communicate inside a distributed system, not every interaction should be asynchronous. Events are excellent when a service can publish a fact and let other systems react later. But many workflows need an immediate answer: find an available courier, confirm a route, fetch assignment details, or ask another service whether a decision is allowed.

For those synchronous service-to-service calls, REST and gRPC are the common choices. REST is simple, open, and widely supported. gRPC is stricter, usually faster, and built around typed procedures. A mature Go system often uses both: REST for external clients and gRPC for internal calls where the teams control both sides of the contract.

This post walks through a practical way to design, generate, implement, run, and operate a gRPC service in Go without mixing transport code with business logic.

The Problem

Imagine a logistics platform with several internal services:

  • A public API receives requests from a mobile app or web dashboard.
  • A Dispatch service decides which courier should receive a task.
  • A Courier service tracks courier status and availability.
  • A Route service calculates trip and route information.

The public client should not need to know how these internal services work. It wants a stable HTTP API that is easy to call from browsers, mobile apps, and partner systems. Internally, however, the system needs faster, more structured communication.

The internal calls have different requirements:

  • The caller and callee are usually owned by the same organization.
  • Payloads should be compact because calls may happen frequently.
  • Contracts should be strongly typed so mistakes appear at compile time.
  • Timeouts and cancellation must be part of every request.
  • Streaming should be possible for live updates or long-running interactions.
  • Observability must be centralized so every RPC can be measured and traced.

A common architecture uses REST at the outside boundary and gRPC behind it:

Browser or Mobile App
  |
  | REST over HTTP
  v
Backend for Frontend
  |
  | gRPC internal calls
  v
Dispatch Service ---> Courier Service
  |
  v
Route Service

The Backend for Frontend, often called a BFF, shapes the API for a specific client. It accepts external REST requests, calls internal services with gRPC, combines the results, and returns a response the client understands.
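The BFF role can be sketched with the standard library alone. In the sketch below, StatusFetcher, fakeFetcher, the /couriers/status path, and the 2-second budget are all assumptions made for illustration; in a real BFF the fetcher would wrap a generated gRPC client.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// CourierStatus is the JSON shape the BFF returns to external clients.
type CourierStatus struct {
	CourierID    string `json:"courier_id"`
	Availability string `json:"availability"`
}

// StatusFetcher is a hypothetical stand-in for a generated gRPC client.
type StatusFetcher interface {
	FetchStatus(ctx context.Context, courierID string) (CourierStatus, error)
}

// statusHandler serves GET /couriers/status?id=... over REST and delegates
// the lookup to the internal service.
func statusHandler(f StatusFetcher) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		id := r.URL.Query().Get("id")
		if id == "" {
			http.Error(w, "missing id", http.StatusBadRequest)
			return
		}
		// Bound the internal call so a slow dependency cannot hold the request open.
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()

		st, err := f.FetchStatus(ctx, id)
		if err != nil {
			http.Error(w, "courier lookup failed", http.StatusBadGateway)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(st)
	}
}

// fakeFetcher answers from memory; a real BFF would call the gRPC client here.
type fakeFetcher struct{}

func (fakeFetcher) FetchStatus(_ context.Context, id string) (CourierStatus, error) {
	return CourierStatus{CourierID: id, Availability: "available"}, nil
}

// call runs one request through the handler in memory and returns status and body.
func call(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := call(statusHandler(fakeFetcher{}), "/couriers/status?id=courier-42")
	fmt.Print(code, " ", body)
}
```

The external contract stays plain JSON over HTTP, while everything behind StatusFetcher can change transport without the client noticing.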

What gRPC Gives You

gRPC is a communication framework for calling methods on remote services. It was introduced by Google in 2015 and is widely used in microservice systems. Instead of designing many hand-written HTTP endpoints and exchanging text payloads, you define remote procedures and typed messages in a .proto file.

Two pieces make gRPC work well for internal service communication:

Piece            | Role
Protocol Buffers | Defines messages and encodes them as compact binary data.
HTTP/2           | Provides persistent connections, multiplexing, and streaming.

Protocol Buffers, usually shortened to Protobuf, is both a schema language and a binary encoding format. A Protobuf message defines fields with names, types, and numeric tags. The numeric tags are what matter on the wire, which keeps payloads small and helps contracts evolve safely.
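To make the role of numeric tags concrete, here is a minimal sketch that hand-encodes one string field the way the Protobuf wire format lays it out. The helper name encodeStringField is an assumption for illustration; real services rely on the generated code, not on manual encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wireTypeLen is the Protobuf wire type for length-delimited values such as strings.
const wireTypeLen = 2

// encodeStringField encodes one string field as Protobuf does on the wire:
// a varint key (field number << 3 | wire type), a varint length, then the bytes.
// The field name never appears in the output.
func encodeStringField(fieldNumber uint64, value string) []byte {
	buf := binary.AppendUvarint(nil, fieldNumber<<3|wireTypeLen)
	buf = binary.AppendUvarint(buf, uint64(len(value)))
	return append(buf, value...)
}

func main() {
	// A field with tag 1 carrying the value "c-42".
	fmt.Printf("% x\n", encodeStringField(1, "c-42")) // prints: 0a 04 63 2d 34 32
}
```

Six bytes carry the whole field. Renaming the field in the schema would not change a single byte, while changing its number would.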

HTTP/2 gives gRPC features that are difficult to model cleanly with ordinary REST calls, including bidirectional streaming and efficient connection reuse. This matters when services call each other frequently under load.

REST Outside, gRPC Inside

REST is still a strong choice for public APIs. External clients value simplicity, loose coupling, and human-readable payloads. REST is easy to test with common tools, works naturally with browsers, and does not require client code generation.

gRPC is usually a better fit when the caller and callee are internal services. In that environment, speed, strong contracts, and generated client code are more valuable than human-readable payloads.

Use this decision rule as a starting point:

Situation | Better default | Reason
Public API for browsers, partners, or mobile clients | REST | Loose coupling and broad compatibility matter most.
Internal service-to-service communication | gRPC | Strong typing, compact payloads, and speed matter more.
Real-time server updates | gRPC streaming | HTTP/2 streams can avoid repeated polling.
Browser calling a gRPC-based system | gRPC-Web through a proxy | Browsers cannot speak native gRPC directly.

Native gRPC is not a direct browser protocol. Browser APIs hide low-level HTTP/2 details that gRPC needs, so browser integrations use gRPC-Web. gRPC-Web maps gRPC-style calls onto normal browser-compatible HTTP requests, often through a proxy such as Envoy. It is useful, but it does not provide every native gRPC feature. Unary calls are the safest fit and server streaming works with some restrictions, while client streaming and bidirectional streaming are not supported.

Designing the Protobuf Contract

A .proto file is the shared language between services. Treat it as a contract, not as an implementation detail. Once another service depends on it, careless changes can break production.

Here is an adapted example for an internal dispatch API:

syntax = "proto3";

package dispatchpb;

option go_package = "github.com/acme/logistics/internal/grpc/dispatchpb;dispatchpb";

message CourierLookupRequest {
  string courier_id = 1;
}

message CourierStatusReply {
  string courier_id = 1;
  string transport_mode = 2;
  string availability = 3;
}

message AssignmentUpdate {
  string assignment_id = 1;
  string courier_id = 2;
  string destination_zone = 3;
}

message AssignmentResult {
  string assignment_id = 1;
  bool accepted = 2;
}

service DispatchGateway {
  rpc GetCourierStatus(CourierLookupRequest) returns (CourierStatusReply);
  rpc StreamAssignments(stream AssignmentUpdate) returns (stream AssignmentResult);
}

There are several rules worth following from the start:

  1. Use proto3 for new services unless a legacy system requires something else.
  2. Set package to avoid naming conflicts between teams or modules.
  3. Set go_package so generated Go imports are stable.
  4. Use clear request and response message names.
  5. Never reuse a field tag after a field has been published.
  6. Keep messages simple and avoid deeply nested structures unless the domain needs them.

The method names are action-oriented because gRPC describes procedures, not REST resources. A method such as GetCourierStatus says exactly what behavior the caller wants.

Evolving Messages Safely

Adding a new field is usually safer than deleting or reusing a field. If a field is removed, reserve its tag so future developers cannot accidentally assign that number to a different meaning.

message CourierStatusReply {
  reserved 4;
  reserved "legacy_region";

  string courier_id = 1;
  string transport_mode = 2;
  string availability = 3;
  string current_zone = 5;
}

The important idea is simple: field numbers are part of the contract. Names help humans read the schema, but numeric tags protect compatibility on the wire.
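The compatibility argument can be sketched with a toy decoder. The code below is an illustration only: it handles just length-delimited fields, the helper names field and decodeKnown are assumptions, and real Protobuf libraries do considerably more (they usually preserve unknown fields instead of dropping them).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// field appends one length-delimited (wire type 2) field to buf.
func field(buf []byte, number uint64, value string) []byte {
	buf = binary.AppendUvarint(buf, number<<3|2)
	buf = binary.AppendUvarint(buf, uint64(len(value)))
	return append(buf, value...)
}

// decodeKnown reads length-delimited fields and keeps only the field numbers
// listed in known. Any other field number is skipped and counted.
func decodeKnown(data []byte, known map[uint64]string) (map[string]string, int, error) {
	out := map[string]string{}
	skipped := 0
	for len(data) > 0 {
		key, n := binary.Uvarint(data)
		if n <= 0 {
			return nil, 0, errors.New("bad field key")
		}
		data = data[n:]
		if key&7 != 2 {
			return nil, 0, errors.New("sketch handles only length-delimited fields")
		}
		size, n := binary.Uvarint(data)
		if n <= 0 || uint64(len(data)-n) < size {
			return nil, 0, errors.New("truncated field")
		}
		value := data[n : n+int(size)]
		data = data[n+int(size):]
		if name, ok := known[key>>3]; ok {
			out[name] = string(value)
		} else {
			skipped++ // unknown tag: ignore the payload and keep decoding
		}
	}
	return out, skipped, nil
}

func main() {
	// A newer writer sends current_zone = 5 alongside the original fields.
	msg := field(nil, 1, "courier-42")
	msg = field(msg, 3, "available")
	msg = field(msg, 5, "zone-north")

	// An older reader only knows tags 1 through 3.
	old := map[uint64]string{1: "courier_id", 2: "transport_mode", 3: "availability"}
	fields, skipped, err := decodeKnown(msg, old)
	fmt.Println(fields, skipped, err)
}
```

The old reader still gets the fields it understands and steps over tag 5. If tag 4 were ever reassigned to a new meaning, a reader built against the old schema would decode it under the old name, which is exactly what reserving the tag prevents.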

Generating Go Code

After defining the .proto file, generate Go types and gRPC service interfaces. The workflow uses the Protobuf compiler, protoc, plus Go plug-ins for message and service generation.

Install the Go plug-ins once:

go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest

Make sure your Go binary directory (GOBIN if set, otherwise the bin directory under GOPATH) is on your PATH, because protoc looks up these plug-ins by name.

Then generate code from your schema:

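# With paths=source_relative, files are written directly into the output
# directory, so point it at the directory that matches go_package.
mkdir -p internal/grpc/dispatchpb

protoc \
  --proto_path=api/proto \
  --go_out=internal/grpc/dispatchpb \
  --go_opt=paths=source_relative \
  --go-grpc_out=internal/grpc/dispatchpb \
  --go-grpc_opt=paths=source_relative \
  api/proto/dispatch.proto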

This produces two kinds of generated files:

  • Message types, such as CourierLookupRequest and CourierStatusReply, in dispatch.pb.go.
  • Service code, including the server interface and client stub, in dispatch_grpc.pb.go.

Do not edit generated files by hand. Change the .proto file, regenerate, and let the generated code reflect the contract.

Keeping Business Logic Out of the gRPC Adapter

The generated service interface is a transport contract. Your implementation should adapt transport messages to domain behavior. It should not become the place where business rules live.

A clean adapter receives domain dependencies through a constructor:

type CourierDirectory interface {
    FindCourier(ctx context.Context, courierID string) (CourierSnapshot, error)
    ReviewAssignment(ctx context.Context, update AssignmentDraft) (bool, error)
}

type CourierSnapshot struct {
    ID            string
    TransportMode string
    Availability  string
}

type AssignmentDraft struct {
    AssignmentID    string
    CourierID       string
    DestinationZone string
}

type DispatchAdapter struct {
    dispatchpb.UnimplementedDispatchGatewayServer
    directory CourierDirectory
}

func NewDispatchAdapter(directory CourierDirectory) *DispatchAdapter {
    return &DispatchAdapter{directory: directory}
}

The adapter depends on an interface, not a concrete database or service implementation. That makes it easy to test. The domain layer can be tested without running a network server, and the gRPC layer can be tested with fakes.

Now implement a unary RPC:

func (a *DispatchAdapter) GetCourierStatus(
    ctx context.Context,
    req *dispatchpb.CourierLookupRequest,
) (*dispatchpb.CourierStatusReply, error) {
    snapshot, err := a.directory.FindCourier(ctx, req.GetCourierId())
    if err != nil {
        return nil, err
    }

    return &dispatchpb.CourierStatusReply{
        CourierId:     snapshot.ID,
        TransportMode: snapshot.TransportMode,
        Availability:  snapshot.Availability,
    }, nil
}

The method does three things only:

  1. Reads the Protobuf request.
  2. Calls the domain dependency.
  3. Converts the domain result into a Protobuf response.

That boundary is valuable. When your transport changes, your business rules do not have to move with it.

Handling Streaming RPCs

Unary RPCs use one request and one response. gRPC also supports streaming patterns:

  • Server streaming: one request, many responses.
  • Client streaming: many requests, one response.
  • Bidirectional streaming: both sides send messages over the same connection.

Bidirectional streaming is useful for live assignment updates, telemetry feeds, negotiation loops, or any workflow where both sides need to keep talking.

func (a *DispatchAdapter) StreamAssignments(
    // The type parameters are the message types themselves; Recv and Send work with pointers to them.
    stream grpc.BidiStreamingServer[dispatchpb.AssignmentUpdate, dispatchpb.AssignmentResult],
) error {
    for {
        update, err := stream.Recv()
        if errors.Is(err, io.EOF) {
            return nil
        }
        if err != nil {
            return err
        }

        accepted, err := a.directory.ReviewAssignment(stream.Context(), AssignmentDraft{
            AssignmentID:    update.GetAssignmentId(),
            CourierID:       update.GetCourierId(),
            DestinationZone: update.GetDestinationZone(),
        })
        if err != nil {
            return err
        }

        reply := &dispatchpb.AssignmentResult{
            AssignmentId: update.GetAssignmentId(),
            Accepted:     accepted,
        }
        if err := stream.Send(reply); err != nil {
            return err
        }
    }
}

Streaming code must pay attention to lifecycle. Always handle end-of-stream, transport errors, and context cancellation. A stream that never ends and ignores shutdown signals can block deployments and hold resources longer than intended.

Registering and Running the Server

A production gRPC server is more than a call to Serve. It needs a controlled lifecycle:

  1. Bind to a network port.
  2. Register service implementations.
  3. Run the server without blocking lifecycle control.
  4. Listen for shutdown signals.
  5. Stop accepting new work.
  6. Allow in-flight calls to finish when possible.

Here is a compact pattern:

func RunGRPCServer(cfg ServerConfig, adapter *DispatchAdapter) error {
    listener, err := net.Listen("tcp", fmt.Sprintf(":%d", cfg.GRPCPort))
    if err != nil {
        return fmt.Errorf("listen on grpc port: %w", err)
    }

    server := grpc.NewServer(
        grpc.ConnectionTimeout(time.Duration(cfg.ConnectionTimeoutSec)*time.Second),
        grpc.UnaryInterceptor(LoggingUnaryInterceptor),
        grpc.StreamInterceptor(LoggingStreamInterceptor),
    )

    dispatchpb.RegisterDispatchGatewayServer(server, adapter)

    errCh := make(chan error, 1)
    go func() {
        if err := server.Serve(listener); err != nil {
            errCh <- err
        }
    }()

    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM)

    select {
    case <-sigCh:
        server.GracefulStop()
        return nil
    case err := <-errCh:
        return fmt.Errorf("grpc server failed: %w", err)
    }
}

Binding errors should fail fast. If the port is unavailable, the service should exit and let the process supervisor or orchestrator decide what to do next.

GracefulStop closes the listener, rejects new RPCs, and waits for active handlers to finish. Reserve Stop, which closes connections immediately, for crash-like scenarios where waiting is unsafe.

Graceful shutdown should also be bounded at the platform level. A server that waits forever for a broken stream is not safe. Align server shutdown limits with client timeouts, load-balancer draining, and deployment rules.
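Inside the process, the same bound can be sketched as a small helper. The stopper interface, stopWithTimeout, and fakeServer below are assumptions for illustration; the interface lists only the GracefulStop and Stop methods that *grpc.Server provides, so the helper can be exercised without a real server.

```go
package main

import (
	"fmt"
	"time"
)

// stopper lists the two shutdown methods this sketch needs.
// *grpc.Server provides both.
type stopper interface {
	GracefulStop()
	Stop()
}

// stopWithTimeout tries a graceful stop and falls back to a hard stop when
// the grace period runs out. It reports whether the graceful path finished.
func stopWithTimeout(s stopper, grace time.Duration) bool {
	done := make(chan struct{})
	go func() {
		s.GracefulStop()
		close(done)
	}()

	timer := time.NewTimer(grace)
	defer timer.Stop()

	select {
	case <-done:
		return true
	case <-timer.C:
		s.Stop() // force-close so the pending GracefulStop can return
		<-done
		return false
	}
}

// fakeServer simulates a server whose in-flight work needs drain to finish.
type fakeServer struct {
	drain  time.Duration
	forced chan struct{}
}

func newFakeServer(drain time.Duration) *fakeServer {
	return &fakeServer{drain: drain, forced: make(chan struct{})}
}

func (f *fakeServer) GracefulStop() {
	select {
	case <-time.After(f.drain):
	case <-f.forced:
	}
}

func (f *fakeServer) Stop() { close(f.forced) }

func main() {
	fmt.Println(stopWithTimeout(newFakeServer(10*time.Millisecond), time.Second)) // prints: true
	fmt.Println(stopWithTimeout(newFakeServer(time.Hour), 20*time.Millisecond))   // prints: false
}
```

The grace period should be shorter than the termination window the orchestrator allows, otherwise the process is killed before the fallback runs.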

Interceptors for Logging, Metrics, Auth, and Tracing

Interceptors are gRPC middleware. They wrap RPC calls so shared behavior can be applied consistently across methods.

Use interceptors for concerns such as:

  • Structured logging.
  • Request duration metrics.
  • Authentication and authorization.
  • Tracing and correlation identifiers.
  • Deadline validation.
  • Error counting.

A unary interceptor wraps standard request/response calls:

func LoggingUnaryInterceptor(
    ctx context.Context,
    req any,
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (any, error) {
    started := time.Now()

    resp, err := handler(ctx, req)

    log.Printf(
        "grpc_method=%s duration=%s error=%v",
        info.FullMethod,
        time.Since(started),
        err,
    )

    return resp, err
}

A stream interceptor wraps streaming RPCs:

func LoggingStreamInterceptor(
    srv any,
    stream grpc.ServerStream,
    info *grpc.StreamServerInfo,
    handler grpc.StreamHandler,
) error {
    started := time.Now()

    err := handler(srv, stream)

    log.Printf(
        "grpc_stream=%s duration=%s error=%v",
        info.FullMethod,
        time.Since(started),
        err,
    )

    return err
}

The goal is not to log everything possible. The goal is to collect useful, consistent signals without polluting every handler. In hot paths, excessive logging can become expensive. Prefer summarized metrics, structured logs, and traces that help operators answer where a failure happened and why.

Building a Safer gRPC Client

A client that calls another service must expect failure. Network interruptions, rolling deployments, overloaded instances, and downstream errors are normal in distributed systems.

Start with credentials, a target, and options:

func NewDispatchClient(
    target string,
    creds credentials.TransportCredentials,
    policy string,
) (dispatchpb.DispatchGatewayClient, func() error, error) {
    conn, err := grpc.NewClient(
        target,
        grpc.WithTransportCredentials(creds),
        grpc.WithDefaultServiceConfig(policy),
    )
    if err != nil {
        return nil, nil, fmt.Errorf("create grpc client: %w", err)
    }

    cleanup := func() error {
        return conn.Close()
    }

    return dispatchpb.NewDispatchGatewayClient(conn), cleanup, nil
}

Keep insecure transport credentials limited to local development. In production, use credentials that match your organization's service-to-service security policy. For internal systems, this often means TLS or mutual TLS, depending on the platform.

Every call should have a timeout or deadline:

func FetchCourierStatus(
    ctx context.Context,
    client dispatchpb.DispatchGatewayClient,
    courierID string,
) (*dispatchpb.CourierStatusReply, error) {
    callCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
    defer cancel()

    reply, err := client.GetCourierStatus(callCtx, &dispatchpb.CourierLookupRequest{
        CourierId: courierID,
    })
    if err != nil {
        return nil, fmt.Errorf("get courier status: %w", err)
    }

    return reply, nil
}

Retries can help with temporary failures, but they are dangerous when used blindly. A retry policy should define the number of attempts, backoff behavior, and which gRPC status codes are retryable. Repeated retries during a long outage can make the outage worse, so combine retries with deadlines, monitoring, and a circuit breaker.
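One way to express such a policy is the JSON service config passed as the policy string to grpc.WithDefaultServiceConfig in the client constructor above. The sketch below assumes illustrative numbers and a hypothetical checkRetryPolicy helper that validates the JSON at startup; the 2 to 5 attempt range is this sketch's own guardrail, and only a read-style method is listed because retrying non-idempotent calls needs more care.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// retryServiceConfig is a gRPC service config in JSON form. The service name
// combines the proto package and service from the contract; values are illustrative.
const retryServiceConfig = `{
  "methodConfig": [{
    "name": [{"service": "dispatchpb.DispatchGateway", "method": "GetCourierStatus"}],
    "retryPolicy": {
      "maxAttempts": 3,
      "initialBackoff": "0.1s",
      "maxBackoff": "1s",
      "backoffMultiplier": 2.0,
      "retryableStatusCodes": ["UNAVAILABLE"]
    }
  }]
}`

type retryPolicy struct {
	MaxAttempts          int      `json:"maxAttempts"`
	InitialBackoff       string   `json:"initialBackoff"`
	MaxBackoff           string   `json:"maxBackoff"`
	BackoffMultiplier    float64  `json:"backoffMultiplier"`
	RetryableStatusCodes []string `json:"retryableStatusCodes"`
}

type serviceConfig struct {
	MethodConfig []struct {
		RetryPolicy *retryPolicy `json:"retryPolicy"`
	} `json:"methodConfig"`
}

// checkRetryPolicy parses the config and rejects obviously unsafe policies
// before the string is handed to the gRPC client.
func checkRetryPolicy(raw string) (*retryPolicy, error) {
	var cfg serviceConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return nil, fmt.Errorf("parse service config: %w", err)
	}
	if len(cfg.MethodConfig) == 0 || cfg.MethodConfig[0].RetryPolicy == nil {
		return nil, fmt.Errorf("no retry policy configured")
	}
	p := cfg.MethodConfig[0].RetryPolicy
	if p.MaxAttempts < 2 || p.MaxAttempts > 5 {
		return nil, fmt.Errorf("maxAttempts %d outside 2..5", p.MaxAttempts)
	}
	if len(p.RetryableStatusCodes) == 0 {
		return nil, fmt.Errorf("no retryable status codes")
	}
	return p, nil
}

func main() {
	p, err := checkRetryPolicy(retryServiceConfig)
	if err != nil {
		fmt.Println("invalid policy:", err)
		return
	}
	fmt.Println(p.MaxAttempts, p.RetryableStatusCodes) // prints: 3 [UNAVAILABLE]
}
```

Keeping the list of retryable codes short is deliberate: UNAVAILABLE usually signals a transient transport problem, while codes such as INVALID_ARGUMENT will fail the same way on every attempt.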

A circuit breaker fails fast when a downstream service is unhealthy:

breaker := gobreaker.NewCircuitBreaker(gobreaker.Settings{
    Name:        "dispatch-courier-rpc",
    MaxRequests: 2,
    ReadyToTrip: func(counts gobreaker.Counts) bool {
        if counts.Requests < 10 {
            return false
        }
        ratio := float64(counts.TotalFailures) / float64(counts.Requests)
        return ratio >= 0.5
    },
    IsSuccessful: func(err error) bool {
        if err == nil {
            return true
        }

        st, ok := status.FromError(err)
        if !ok {
            return false
        }

        switch st.Code() {
        case codes.InvalidArgument, codes.Unauthenticated, codes.PermissionDenied:
            return true
        default:
            return false
        }
    },
})

Some errors should not trip the circuit breaker. For example, an invalid request is a caller problem, not proof that the downstream service is broken. The next valid request may succeed.

When retries and circuit breakers are still not enough, use graceful degradation. A caller might return cached data, partial results, or a controlled fallback instead of blocking the entire request path.
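A cached fallback can be sketched as follows. The statusCache type, the fetchFunc signature, and the two sample fetchers are assumptions for illustration; a real caller would also bound how old a cached value may be before it is no longer worth returning.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// fetchFunc is a hypothetical stand-in for the remote call.
type fetchFunc func(ctx context.Context, courierID string) (string, error)

// statusCache remembers the last good availability per courier.
type statusCache struct {
	mu   sync.Mutex
	last map[string]string
}

func newStatusCache() *statusCache {
	return &statusCache{last: map[string]string{}}
}

// availability returns fresh data when the call succeeds. When it fails, it
// returns the last known value flagged as stale, or the error if nothing is cached.
func (c *statusCache) availability(ctx context.Context, id string, fetch fetchFunc) (string, bool, error) {
	fresh, err := fetch(ctx, id)

	c.mu.Lock()
	defer c.mu.Unlock()
	if err == nil {
		c.last[id] = fresh
		return fresh, false, nil
	}
	if cached, ok := c.last[id]; ok {
		return cached, true, nil
	}
	return "", false, fmt.Errorf("no fallback for %s: %w", id, err)
}

// healthyFetch and failingFetch simulate an up and a down dependency.
func healthyFetch(context.Context, string) (string, error) { return "available", nil }
func failingFetch(context.Context, string) (string, error) {
	return "", errors.New("downstream unavailable")
}

func main() {
	cache := newStatusCache()
	ctx := context.Background()

	value, stale, _ := cache.availability(ctx, "courier-42", healthyFetch)
	fmt.Println(value, stale) // prints: available false

	value, stale, _ = cache.availability(ctx, "courier-42", failingFetch)
	fmt.Println(value, stale) // prints: available true

	_, _, err := cache.availability(ctx, "courier-7", failingFetch)
	fmt.Println(err != nil) // prints: true
}
```

Returning the stale flag lets the caller decide whether degraded data is acceptable for the current request instead of hiding the failure.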

Client-Side Load Balancing

gRPC can use client-side load balancing. Instead of connecting to one physical server, the client connects to a logical target. A resolver turns that target into multiple addresses, and a load-balancing policy chooses which address handles each call.

Logical target: courier-service
  |
  v
Resolver returns addresses: instance A, instance B, instance C
  |
  v
Load-balancing policy chooses an address per RPC
  |
  v
Client sends the call

Common policies include round_robin, pick_first, weighted_round_robin, ring_hash, outlier_detection, and least_request. A simple starting point is DNS-based round robin, where multiple addresses are returned and calls are distributed across them.

The exact policy should match the system goal. Low latency, resilience, traffic weighting, and outlier handling do not always lead to the same configuration.
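To show what a per-call policy does, here is a minimal round-robin picker. It is a conceptual sketch with assumed names, not the gRPC balancer API; the built-in round_robin policy also tracks connection health, which this version ignores.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out addresses in rotation and is safe for concurrent use.
type roundRobin struct {
	addrs []string
	next  atomic.Uint64
}

func newRoundRobin(addrs ...string) *roundRobin {
	return &roundRobin{addrs: addrs}
}

// pick returns the address for the next call, or "" when none are known.
func (r *roundRobin) pick() string {
	if len(r.addrs) == 0 {
		return ""
	}
	n := r.next.Add(1) - 1
	return r.addrs[n%uint64(len(r.addrs))]
}

func main() {
	// The resolver would supply these addresses; they are placeholders here.
	rr := newRoundRobin("instance-a", "instance-b", "instance-c")
	for i := 0; i < 4; i++ {
		fmt.Println(rr.pick())
	}
	// prints: instance-a, instance-b, instance-c, instance-a (one per line)
}
```

With grpc-go, the built-in policy is typically selected through the default service config, for example `{"loadBalancingConfig": [{"round_robin": {}}]}`, together with a dns:/// target so the resolver returns every address behind the name.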

Observability Is Part of the Contract

A gRPC service that cannot be observed is hard to operate. At minimum, production services should expose three kinds of signals:

  • Logs: what happened during a call.
  • Metrics: how often it happened and how long it took.
  • Traces: how one request moved across service boundaries.

Interceptors are the natural place to collect these signals because every RPC passes through them. A production system can count calls by method name, measure duration, record status codes, and attach trace context without repeating that code in every handler.
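As a rough sketch of that idea, the recorder below aggregates per-method counts and durations in memory. The recorder type and its methods are assumptions for illustration; a production system would export these values through a metrics library instead of keeping them in a map.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// methodStats aggregates finished calls for one RPC method.
type methodStats struct {
	Calls  int
	Errors int
	Total  time.Duration
}

// recorder collects per-method statistics and is safe for concurrent use.
type recorder struct {
	mu    sync.Mutex
	stats map[string]methodStats
}

func newRecorder() *recorder {
	return &recorder{stats: map[string]methodStats{}}
}

// observe records one finished call. An interceptor would pass the full
// method name, the measured duration, and the handler's error.
func (r *recorder) observe(method string, d time.Duration, err error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	s := r.stats[method]
	s.Calls++
	s.Total += d
	if err != nil {
		s.Errors++
	}
	r.stats[method] = s
}

// snapshot returns a copy of the current numbers for one method.
func (r *recorder) snapshot(method string) methodStats {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.stats[method]
}

// errUnavailable is a sample failure used by the demo below.
var errUnavailable = errors.New("unavailable")

func main() {
	rec := newRecorder()
	const method = "/dispatchpb.DispatchGateway/GetCourierStatus"

	rec.observe(method, 12*time.Millisecond, nil)
	rec.observe(method, 30*time.Millisecond, errUnavailable)

	s := rec.snapshot(method)
	fmt.Printf("calls=%d errors=%d avg=%s\n", s.Calls, s.Errors, s.Total/time.Duration(s.Calls))
	// prints: calls=2 errors=1 avg=21ms
}
```

Keying on the full method name keeps label cardinality bounded by the size of the contract, which matters once the numbers feed a real metrics backend.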

In mature systems, these signals are exported to platforms such as Prometheus, OpenTelemetry collectors, or a vendor observability stack. The important point is not the specific tool. The important point is that operators can see request volume, latency, error rate, and cross-service flow before an incident becomes guesswork.

Testing the Approach

Testing gRPC code is easier when the transport layer is thin.

Start with a fake implementation of the domain interface:

type fakeCourierDirectory struct {
    snapshot CourierSnapshot
    err      error
}

func (f fakeCourierDirectory) FindCourier(ctx context.Context, courierID string) (CourierSnapshot, error) {
    return f.snapshot, f.err
}

func (f fakeCourierDirectory) ReviewAssignment(ctx context.Context, update AssignmentDraft) (bool, error) {
    return true, f.err
}

Then test the adapter conversion behavior without starting the whole platform:

func TestGetCourierStatusMapsDomainToProto(t *testing.T) {
    adapter := NewDispatchAdapter(fakeCourierDirectory{
        snapshot: CourierSnapshot{
            ID:            "courier-42",
            TransportMode: "bike",
            Availability:  "available",
        },
    })

    got, err := adapter.GetCourierStatus(context.Background(), &dispatchpb.CourierLookupRequest{
        CourierId: "courier-42",
    })
    if err != nil {
        t.Fatalf("unexpected error: %v", err)
    }
    if got.GetAvailability() != "available" {
        t.Fatalf("availability mismatch: %s", got.GetAvailability())
    }
}

Also test schema evolution as part of the development workflow:

  1. Add a new field to a message.
  2. Regenerate code.
  3. Confirm the adapter still compiles.
  4. Remove a field only after reserving its tag.
  5. Regenerate again and review the generated files.

The goal is to make contract safety visible before deployment.

Common Mistakes

Using gRPC everywhere

gRPC is excellent for internal calls, but it is not automatically the right public API. External clients may prefer REST because it is easier to debug, document, and evolve independently.

Treating .proto files as temporary implementation files

A .proto file is a contract. Once published, field tags and method names need careful handling. Reusing a removed tag can create subtle compatibility problems.

Putting business rules in generated-code adapters

The adapter should translate between Protobuf and domain types. Business decisions belong in domain services that can be tested without gRPC.

Retrying without deadlines

Retries without timeouts can amplify failures. A slow dependency can become a system-wide problem when many clients repeatedly retry.

Ignoring stream lifecycle

Long-lived streams need cancellation, error handling, and shutdown behavior. Do not let streams keep deployments waiting forever.

Running without observability

If no one can see method latency, error rates, and trace flow, the service will be painful to debug during incidents.

Using insecure transport outside development

Local shortcuts should not become production defaults. Configure transport credentials according to the service-to-service security policy.

Production Checklist

Use this checklist before promoting a gRPC service beyond local development:

  • The .proto file has clear package and Go package settings.
  • Field tags are stable, and removed fields are reserved.
  • Generated files are recreated from the schema, not edited manually.
  • The gRPC adapter only handles transport conversion and delegates business rules.
  • Every RPC accepts and passes context.Context correctly.
  • Unary and stream interceptors handle logging, metrics, tracing, or authentication as needed.
  • The server binds explicitly and fails fast on listener errors.
  • Serve runs in a goroutine so lifecycle control remains available.
  • The process handles SIGINT and SIGTERM.
  • Graceful shutdown is used for normal termination.
  • Client calls use deadlines or timeouts.
  • Retry behavior is bounded and monitored.
  • Circuit breakers or fallbacks protect callers during longer failures.
  • Transport credentials are secure in production.
  • Load balancing is configured for the system's traffic pattern.
  • Logs, metrics, and traces are available in the production observability stack.

Conclusion

gRPC gives Go services a practical way to communicate with speed, structure, and strong contracts. It works especially well for internal service-to-service calls where teams control both sides and value compact payloads, generated clients, streaming, deadlines, and compile-time safety.

A good design starts with the .proto contract, generates Go code, keeps adapters thin, and places operational behavior at the server and client boundaries. Interceptors handle cross-cutting concerns. Graceful shutdown protects in-flight work. Client deadlines, retries, circuit breakers, load balancing, credentials, and observability keep the system usable when real-world failures appear.

Use REST where loose external compatibility matters. Use gRPC where internal systems need fast, typed, maintainable communication. The best architecture is not the one that uses one protocol everywhere. It is the one that chooses each boundary deliberately.
