Preventing Lost and Duplicate Jobs When a .NET Worker Moves from gRPC Queues to RabbitMQ

A .NET order-processing worker begins with a simple communication model. An ASP.NET Core gRPC endpoint receives a job, places it in a local queue, and immediately confirms receipt. Background threads remove queued jobs and update the order database.

The design works while traffic is moderate and one deployment owns the queue. Problems appear when the worker must run on several machines, survive process restarts, or keep accepting work while an instance is unavailable. A local queue can disappear with the process, and a sender cannot safely assume that a successful network call means the job was stored permanently.

Moving the queue to RabbitMQ solves part of the problem, but it does not create exactly-once processing. Publisher confirmation can be lost after the broker accepted a message. A worker can update the database and crash before acknowledging the delivery. Both situations cause retries, which means the same job may be published or processed more than once.

The safe target is therefore not "process every message exactly once." The practical target is:

Store messages durably before confirming publication.
Keep messages unavailable to other consumers while one worker processes them.
Remove a message only after successful processing.
Requeue failed or unconfirmed deliveries.
Make repeated processing produce the same final business state.

Context and Scope

Assume an order platform contains three components:

Order API accepts a confirmed order.
Dispatch publisher converts the order into a transport message and sends it to RabbitMQ.
Dispatch worker consumes the message and creates the delivery job.

Order API
   |
   v
Dispatch Publisher
   |
   v
RabbitMQ Exchange
   |
   v
Durable Queue
   |
   v
Dispatch Worker
   |
   v
Delivery Data

RabbitMQ handles the communication queue. Protobuf defines the binary message contract. A .NET worker service processes deliveries outside the request that created the order.

The critical failure cases are:

The publisher sends a message but does not receive confirmation.
The broker or worker restarts while jobs are waiting.
The worker receives a message and throws an exception.
The worker commits its business update but crashes before sending an acknowledgment.
An older worker receives a message created by a newer publisher.

The implementation must handle all five without silently losing work or applying the same business effect twice.

Why an Internal gRPC Queue Stops Scaling

gRPC is useful for efficient Remote Procedure Call communication. It uses HTTP/2 and normally uses Protobuf messages. A sender can call a gRPC method that places a job in the receiver's queue and returns immediately.

Publisher
   |
   | gRPC call
   v
Worker endpoint
   |
   | enqueue
   v
Internal persistent queue
   |
   v
Parallel processors

This design can provide good response times for a small or medium system. The gRPC method performs only a short operation: accept the message, store it, and return confirmation. Business processing happens later.

The difficulty is ownership of the queue. Running several worker instances requires them to share one persistent queue or partition the queue across several stores. The team must implement locking, timeouts, retries, delivery confirmation, recovery, and duplicate detection.

A message broker moves this communication responsibility into a dedicated component. RabbitMQ can route messages, preserve queued work, distribute jobs among competing consumers, and requeue work that was not acknowledged.

Step 1: Define a Version-Tolerant Protobuf Contract

RabbitMQ messages are byte arrays. The application must serialize each .NET object before publishing it and deserialize the bytes after delivery.

Platform-specific binary serialization can be fast, but it creates two important risks:

A service written with another language or runtime may not understand the format.
Adding or removing properties may break older consumers.

Protobuf solves these problems by defining language-independent primitive types and assigning a permanent integer number to every field.

syntax = "proto3";

import "google/protobuf/timestamp.proto";

package dispatch;

message DispatchRequested {
  string message_id = 1;
  string order_id = 2;
  string destination_code = 3;
  int32 parcel_count = 4;
  google.protobuf.Timestamp requested_at = 5;
}

The field numbers are part of the wire contract. Keep these rules:

Do not change the number assigned to an existing field.
Do not reuse the number of a removed field.
Prefer adding new fields instead of changing the meaning of old fields.
Let older consumers ignore fields they do not understand.
Expect missing fields to receive their Protobuf default values.

The message_id exists for idempotency. The order_id is the business identifier. They solve different problems: one identifies the delivery attempt, while the other identifies the order being processed.

Step 2: Serialize the Message Before Publishing

The Protobuf compiler generates a .NET class from the .proto definition. The generated type can write itself to a stream.

var message = new DispatchRequested
{
    MessageId = Guid.NewGuid().ToString(),
    OrderId = order.Id,
    DestinationCode = order.DestinationCode,
    ParcelCount = order.ParcelCount,
    RequestedAt = Timestamp.FromDateTime(DateTime.UtcNow)
};

byte[] body;

using (var stream = new MemoryStream())
{
    message.WriteTo(stream);
    stream.Flush();
    body = stream.ToArray();
}

On the consumer side, parse the byte array with the generated parser.

DispatchRequested message;

using (var stream = new MemoryStream(deliveryBody))
{
    message = DispatchRequested.Parser.ParseFrom(stream);
}

Keep domain objects separate from transport messages. The Protobuf type is an integration contract. The worker should validate and translate it before calling business logic.

Step 3: Route Jobs Through the Correct RabbitMQ Exchange

RabbitMQ publishers send messages to exchanges. The exchange decides which queue receives each message.

For a dispatch job that should be handled by one of several competing workers, a direct route through the default exchange is sufficient. Every worker listens to the same queue, and each message is assigned to one consumer.

For an event that several independent services must receive, use a fanout exchange. Each subscriber has its own queue and receives a copy.

Communication need	RabbitMQ routing choice
One worker instance should process each job	One queue through a direct route
Every interested service should receive the event	Fanout exchange with one queue per subscriber
Subscribers select messages by routing pattern	Topic exchange

Do not create several queues for competing workers unless each worker must receive a separate copy. Multiple instances should normally consume from the same work queue.

Step 4: Confirm That RabbitMQ Accepted the Publication

The publisher should not treat a call that writes bytes to a connection as proof that the message is safe. It needs confirmation that the broker accepted the publication.

The publication flow should be:

Create message
   |
   v
Serialize to bytes
   |
   v
Publish to RabbitMQ
   |
   v
Wait for broker confirmation
   |
   +--> Confirmed: complete the sender operation
   |
   +--> Timed out or failed: retry safely

If confirmation does not arrive, the publisher cannot know whether RabbitMQ stored the message. Retrying is necessary to prevent message loss, but the retry may publish a duplicate.

That uncertainty is why the consumer must be idempotent. Publisher confirmation improves reliability, but it does not eliminate duplicate delivery.

The queue and message should also be configured for persistence so a broker restart does not intentionally discard accepted jobs. Persistence and confirmation solve different concerns:

Persistence protects queued messages across restarts.
Confirmation tells the publisher that RabbitMQ accepted responsibility for the message.

Step 5: Disable Automatic Consumer Acknowledgment

With automatic acknowledgment, RabbitMQ can consider a message complete as soon as it is delivered to the consumer. If the process crashes before finishing the database update, the job may be lost.

Use manual acknowledgment instead.

RabbitMQ delivers job
   |
   v
Message becomes unavailable to other consumers
   |
   v
Worker processes business operation
   |
   +--> Success: Ack, remove message
   |
   +--> Failure: Nack and requeue
   |
   +--> Connection closes without Ack: message becomes available again

The consumer should start with automatic acknowledgment disabled. After successful business processing, acknowledge the specific delivery.

consumer.Received += (_, delivery) =>
{
    try
    {
        var body = delivery.Body.ToArray();
        var message = DispatchRequested.Parser.ParseFrom(body);

        processor.Process(message);

        channel.BasicAck(
            deliveryTag: delivery.DeliveryTag,
            multiple: false);
    }
    catch
    {
        channel.BasicNack(
            deliveryTag: delivery.DeliveryTag,
            multiple: false,
            requeue: true);
    }
};

channel.BasicConsume(
    queue: queueName,
    autoAck: false,
    consumer: consumer);

multiple: false limits the acknowledgment or rejection to the identified delivery. requeue: true makes a failed message available for another processing attempt.

A worker must also close or dispose its channel and connection during shutdown. An unacknowledged message remains blocked while the connection is active. Closing the connection allows RabbitMQ to make it available again.

Step 6: Make the Business Operation Idempotent

Manual acknowledgment prevents silent deletion before processing, but it guarantees that duplicates can occur.

Consider this sequence:

The worker creates the delivery record.
The database commits successfully.
The process stops before BasicAck.
RabbitMQ redelivers the same message.
Another worker attempts to create the delivery again.

A plain insert is not idempotent. Running it twice can create duplicate deliveries.

Store processed message identifiers and apply the business change only when the identifier is new.

public sealed class DispatchMessageHandler
{
    private readonly IProcessedMessageStore _processedMessages;
    private readonly IDeliveryRepository _deliveries;

    public DispatchMessageHandler(
        IProcessedMessageStore processedMessages,
        IDeliveryRepository deliveries)
    {
        _processedMessages = processedMessages;
        _deliveries = deliveries;
    }

    public async Task HandleAsync(
        DispatchRequested message,
        CancellationToken cancellationToken)
    {
        if (await _processedMessages.ContainsAsync(
            message.MessageId,
            cancellationToken))
        {
            return;
        }

        await _deliveries.CreateAsync(
            message.OrderId,
            message.DestinationCode,
            message.ParcelCount,
            cancellationToken);

        await _processedMessages.AddAsync(
            message.MessageId,
            cancellationToken);
    }
}

The delivery update and processed-message record must be committed together in one local transaction. Otherwise, the worker can apply the business change and fail before recording the identifier, which recreates the duplicate problem.

An update that sets a record to a known final state is naturally easier to repeat than an operation that increments a value or inserts a new record. Design message handlers around final state where possible.

Step 7: Run the Consumer as a .NET Worker Service

A .NET worker service provides dependency injection, configuration, logging, cancellation, and controlled shutdown.

public sealed class DispatchWorker : BackgroundService
{
    private readonly IMessageConsumer _consumer;

    public DispatchWorker(IMessageConsumer consumer)
    {
        _consumer = consumer;
    }

    protected override Task ExecuteAsync(
        CancellationToken stoppingToken)
    {
        return _consumer.ListenAsync(stoppingToken);
    }
}

var builder = Host.CreateApplicationBuilder(args);

builder.Services.AddSingleton<IMessageConsumer, RabbitMqDispatchConsumer>();
builder.Services.AddScoped<DispatchMessageHandler>();
builder.Services.AddHostedService<DispatchWorker>();

using var host = builder.Build();

await host.RunAsync();

Create a dependency injection scope for each delivered message when the handler uses scoped services. Do not reuse one scoped database context across unrelated deliveries.

The cancellation token should stop new processing, dispose the RabbitMQ resources, and allow unacknowledged jobs to return to the queue.

Step 8: Keep Distributed Business Work Out of One Database Transaction

A local transaction can atomically update the delivery data and the processed-message identifier. It cannot safely keep databases in several microservices locked while asynchronous messages travel between them.

For a workflow that spans Order, Payment, and Dispatch, use a saga:

Order accepted
   |
   v
Reserve payment
   |
   v
Create dispatch
   |
   +--> Success: complete order
   |
   +--> Failure: release payment and cancel order

Each service completes a local operation and publishes the next message. When a permanent failure occurs, compensating operations undo previously completed work.

A saga can be coordinated in two ways:

Orchestration: one coordinator sends commands and receives success or failure results.
Choreography: each service reacts to events and publishes the next event without one central coordinator.

Compensation is business behavior, not a database rollback. Before designing the saga, identify which completed actions can be reversed and what information must be stored to perform that reversal.

Testing the Failure Paths

A message-based worker is not adequately tested by proving that one valid message succeeds.

Test these scenarios:

Publish one message and verify one delivery is created.
Deliver the same message_id twice and verify only one business effect.
Throw before the database commit and verify the message is requeued.
Commit the database update but skip the acknowledgment, then redeliver the message.
Stop the worker while a message is unacknowledged and verify another worker can receive it.
Add a new optional Protobuf field and verify an older consumer still processes the known fields.
Send a malformed message and verify it does not produce a partial business update.
Simulate a permanent saga failure and verify the compensating operation.

The most important tests deliberately create uncertainty between publication, processing, persistence, and acknowledgment.

Common Mistakes

Using an in-memory queue for important work

Process memory is not permanent storage. A restart can remove accepted jobs.

Enabling automatic acknowledgment

The broker may remove a message before the business operation succeeds.

Assuming publisher confirmation prevents duplicates

A publisher can retry when the confirmation is lost even though RabbitMQ accepted the first publication.

Reusing Protobuf field numbers

Field numbers are stable wire identifiers. Reassigning one can make old and new services interpret the same bytes differently.

Acknowledging before the local transaction commits

A successful acknowledgment followed by a failed database commit loses the job.

Recording idempotency in a separate transaction

The processed-message identifier and business update must succeed or fail together.

Requeueing every permanent failure forever

A malformed or permanently invalid message will not improve through unlimited retries. Define a policy for messages that cannot succeed.

Treating a saga as a cross-service rollback

A saga uses compensating business actions. It does not keep one database transaction open across all services.

Implementation Checklist

[ ] Define transport contracts in .proto files
[ ] Keep every Protobuf field number stable
[ ] Add a unique message identifier
[ ] Serialize messages before publishing
[ ] Choose direct, fanout, or topic routing based on delivery semantics
[ ] Use persistent queues and messages for important jobs
[ ] Wait for publisher confirmation
[ ] Expect publication retries to create duplicates
[ ] Disable automatic consumer acknowledgment
[ ] Acknowledge only after successful business persistence
[ ] Reject and requeue temporary processing failures
[ ] Store processed message identifiers
[ ] Commit idempotency data and business data together
[ ] Dispose RabbitMQ channels and connections during shutdown
[ ] Create dependency injection scopes per message
[ ] Use saga compensation for multi-service workflows
[ ] Test crashes before and after the database commit

Conclusion

Replacing an internal gRPC queue with RabbitMQ improves scalability and removes much of the custom communication work, but the broker cannot promise that application code runs exactly once.

Reliable .NET workers combine several protections. Protobuf keeps binary contracts compact and version tolerant. RabbitMQ persistence and publisher confirmation reduce message loss. Manual acknowledgment ensures that unfinished work returns to the queue. Idempotent handlers make retries safe, and local transactions keep duplicate detection consistent with business updates.

Once those guarantees are explicit, worker instances can fail, restart, and compete for jobs without turning normal delivery uncertainty into missing or duplicated business operations.