
Is there a standard design pattern I can leverage to consume messages from a queue in order, but have high availability?

I can, of course, divide the load into separate queues by account number last digit (order is only important per account) which gives me scalability, but if the host handling account numbers ending in '2' fails, for example, I need something to pick up that load.

I would think there's a standard pattern for this sort of processing. Unfortunately, I can't make the messages idempotent, since the queue is populated by a third-party integration.

Any thoughts are greatly appreciated.

James Michael Hare
  • Could you clarify "consume messages from a queue in order"? I see two possible interpretations: (1) consume messages in the order they are produced by the source system, preserving global ordering; (2) consume messages in order within the context of a single account. Also, do you have sequence information in the messages, or do you just assume that the (single?) source produces messages in some order and they should arrive in the same order? And another question: what should happen if a consumer throws an error? Should the queue stop or continue? Is timing important, or are delays acceptable? – Mike Klimentiev Dec 27 '16 at 23:40

1 Answer


Even though this is a few days old, I had to answer a question that featured the word "idempotent". I don't think there is a standard design pattern here necessarily, but I have an approach.

I'd go with the separate work queues as you suggested to handle messages in a scalable fashion. A brain-dead simple sorting reader would read from the third party queue and send the messages to the appropriate working queue.
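Something like this rough sketch is what I have in mind for the sorting reader (Python just for illustration; `source_queue`, `work_queues`, and their `receive`/`send` methods are placeholders for whatever queue client you actually use):

```python
# Minimal sketch of the sorting reader, assuming each message exposes an
# account_number attribute; the queue objects are hypothetical stand-ins.
NUM_WORK_QUEUES = 10  # one work queue per last digit of the account number

def route_messages(source_queue, work_queues):
    """Read from the third-party queue and fan messages out by account number."""
    while True:
        message = source_queue.receive()               # blocks until a message arrives
        digit = int(str(message.account_number)[-1])   # last digit picks the work queue
        work_queues[digit].send(message)               # per-account ordering is preserved
```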

Working Queues

The high-availability part would come in with the Working Queue Readers. Each of these would have a buddy queue reader. Each queue reader would send a heartbeat message to its buddy at a regular interval. If the buddy didn't receive a heartbeat (or missed a certain number of them), it would:

  1. Start processing its now dead buddy's queue as well as its own
  2. Notify a system admin about the dead queue reader

In order to prevent two readers from hitting the same queue, when a dead queue reader is revived, it will first re-establish the heartbeat with its buddy. Once the buddy acknowledges that it is no longer processing the messages on the revived reader's original queue, the revived reader will begin its work again.
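To make the buddy arrangement concrete, here is a rough Python sketch (the `buddy_link`, queue objects, and `notify_admin` hook are hypothetical, the wiring that delivers heartbeats to `on_buddy_heartbeat` is omitted, and the revived reader's wait for the acknowledgment isn't shown):

```python
import threading
import time

HEARTBEAT_INTERVAL = 5   # seconds between heartbeats (tune for your environment)
MISSED_LIMIT = 3         # heartbeats missed before the buddy is presumed dead

class BuddyReader:
    """Rough sketch of a work-queue reader paired with a buddy it watches."""

    def __init__(self, reader_id, own_queue, buddy_queue, buddy_link, notify_admin):
        self.reader_id = reader_id
        self.own_queue = own_queue          # the queue this reader normally owns
        self.buddy_queue = buddy_queue      # the queue it adopts if the buddy dies
        self.buddy_link = buddy_link        # hypothetical channel to/from the buddy
        self.notify_admin = notify_admin    # hypothetical alerting hook
        self.covering_for_buddy = False
        self.last_heartbeat = time.time()

    def start(self):
        for target in (self._send_heartbeats, self._watch_buddy, self._process):
            threading.Thread(target=target, daemon=True).start()

    def _send_heartbeats(self):
        while True:
            self.buddy_link.send({"from": self.reader_id, "ts": time.time()})
            time.sleep(HEARTBEAT_INTERVAL)

    def _watch_buddy(self):
        while True:
            time.sleep(HEARTBEAT_INTERVAL)
            silent_for = time.time() - self.last_heartbeat
            if silent_for > HEARTBEAT_INTERVAL * MISSED_LIMIT and not self.covering_for_buddy:
                self.covering_for_buddy = True                                # 1. adopt the queue
                self.notify_admin(f"{self.reader_id}: buddy presumed dead")   # 2. tell the admin

    def on_buddy_heartbeat(self, heartbeat):
        self.last_heartbeat = time.time()
        if self.covering_for_buddy:
            # The buddy is back: stop processing its queue, then acknowledge so
            # the revived reader knows it is safe to resume its own work.
            self.covering_for_buddy = False
            self.buddy_link.send({"from": self.reader_id, "released": True})

    def _process(self):
        while True:
            self.own_queue.process_next()
            if self.covering_for_buddy:
                self.buddy_queue.process_next()
```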

Queue Failover

You can get more redundancy by increasing the number of buddies in the group, or having queue readers establish a new buddy when their original buddy dies, but that adds more complication when readers die or return.

One approach to this would be to have a token for each queue. A reader could only read a queue when it possesses that queue's token. Each reader would start out owning a single token and broadcast a heartbeat to all the other readers. The heartbeat would include all the tokens for the queues the reader is currently processing. This gives every reader a picture of the system as a whole without requiring a centralized authority. When a reader notices that a token hasn't been broadcast within a certain timeframe, it will claim the token if it:

  1. Has the fewest tokens among the surviving readers
  2. Or, in case of a tie, has the lowest ID# among the surviving readers with the fewest tokens.

Once it claims the token, it will start processing the queue, and send a notification to the system admin.
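As a sketch of that claim rule (assuming each reader keeps a hypothetical `live_readers` map of reader ID to the list of tokens that reader last broadcast, built from the heartbeats it has seen, including its own):

```python
def should_claim(my_id, live_readers):
    """Return True if this reader should claim an orphaned token.

    live_readers: {reader_id: [tokens currently broadcast by that reader]},
    restricted to readers whose heartbeats are still current.
    """
    fewest = min(len(tokens) for tokens in live_readers.values())
    if len(live_readers[my_id]) != fewest:
        return False                                          # rule 1: fewest tokens wins
    tied = [rid for rid, tokens in live_readers.items() if len(tokens) == fewest]
    return my_id == min(tied)                                 # rule 2: lowest ID breaks the tie
```

Because every surviving reader evaluates the same rule against the same broadcast picture, only one of them should end up claiming the token.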

When a reader comes back online, it will listen to the heartbeats and rebuild its picture of the system. Then it will determine which reader has:

  1. The most tokens
  2. Or in case of a tie, the reader with the highest ID#

The revived reader will claim the last token in that reader's list. Once the other reader acknowledges the claim and confirms that it is no longer processing the queue represented by the token, the revived reader will begin processing that queue again.
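A sketch of that rebalancing choice, using the same hypothetical `live_readers` bookkeeping (the claim message and acknowledgment exchange are left to whatever channel the readers share):

```python
def pick_token_to_reclaim(live_readers):
    """Return (donor_reader_id, token) for the token a revived reader takes back.

    live_readers: {reader_id: [tokens in the order that reader claimed them]}.
    """
    most = max(len(tokens) for tokens in live_readers.values())
    tied = [rid for rid, tokens in live_readers.items() if len(tokens) == most]
    donor = max(tied)                        # tie-break: highest reader ID
    return donor, live_readers[donor][-1]    # the last token in the donor's list
```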

One possible advantage of this approach is that a one-to-one reader-to-queue relationship isn't required. It allows you to create any number of queues that make sense and find a corresponding number of readers that can handle the load.

dbugger