Read quorum in CouchDB for _find and MapReduce queries

Question

The CouchDB documentation indicates that, by default, both reads and writes of single documents are quorum reads and writes (e.g. r=2 and w=2 in a 3-replica system).

However, the documentation for _find says r "defaults to 1, in which case the document found in the index is returned. If set to a higher value, each document is read from at least that many replicas before it is returned..." It is not entirely clear to me what exactly that means. If I run _find with r=2 and a document is found in the index of a single node, I think it's fairly clear that CouchDB will also fetch that document from a second node and return the latest revision to me. However, I suspect it still only checks the index on one node, so consistency isn't guaranteed even in a healthy cluster.

For example, suppose I have a healthy 3-node cluster with no network partitions. The DB in this cluster has a Mango index that includes the field foo, and I query, via _find, for all documents with foo=bar. Let's say that initially document X has foo=baz, so X should not be returned. Now I update X, setting foo=bar, and I do this with w=2. I then immediately re-run my _find with r=2. If the index is only consulted on one node, then I'm not guaranteed to have X returned by my query, even with r=2. So does r=2 mean only that documents found in one node's index will also be looked up on a second node, or does it mean that the query runs against the index on two nodes and the results are merged?
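
To make the scenario concrete, the two requests might look like the sketch below. The base URL, database name, and revision string are placeholders invented for illustration, and passing w as a query parameter on the PUT is an assumption; r in the _find body is the parameter quoted from the documentation. The sketch only builds the requests and sends nothing.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values for illustration only.
BASE_URL = "http://localhost:5984"
DB = "mydb"


def build_update_request(doc_id, rev, fields, w=2):
    """Build a PUT for a new document revision, asking for write quorum `w`.

    Assumption: `w` is accepted as a query parameter on the document PUT.
    """
    body = dict(fields, _rev=rev)
    url = f"{BASE_URL}/{DB}/{quote(doc_id, safe='')}?w={w}"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_find_request(selector, r=2):
    """Build a POST to _find with the `r` field set in the JSON body."""
    body = {"selector": selector, "r": r}
    return Request(
        f"{BASE_URL}/{DB}/_find",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "1-abc" is a placeholder revision, not a real one.
update = build_update_request("X", "1-abc", {"foo": "bar"})
find = build_find_request({"foo": "bar"})

# To send them against a running cluster:
#   from urllib.request import urlopen
#   urlopen(update)
#   print(urlopen(find).read())
```

The question is whether X is guaranteed to appear in the response to the second request when it is sent immediately after the first one succeeds.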

Also, it seems likely that the same single-node index lookup and r=1 default would apply to JavaScript MapReduce views, but I see no equivalent documentation for that case. Do MapReduce view queries default to r=1 or r=2?

score 0 · Answer 1 · answered Jul 30 '20 at 21:58

I posted a link to the above on the CouchDB Slack and got a response:

map/reduce views (accessed by _view) are r=1 and the r parameter mentioned for _find I think only refers to when we fetch the document, not when the query itself runs, but I'm not 100% sure on that.

It's not quite a definitive answer, so I'm not marking this correct, but it is definitely more information than I had before.
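
If that reading is right, the scenario from the question can miss X. The toy model below is not CouchDB code; it only encodes the unconfirmed assumption that _find consults the index on one replica and that r controls how many replicas each matched document is then read from. The Node class, the find function, and reading from the first r nodes are all simplifications invented for illustration.

```python
# Toy model only, NOT CouchDB code. It assumes (unconfirmed) that a query
# consults a single replica's index and that r applies to the document fetch.

class Node:
    def __init__(self):
        self.docs = {}    # doc_id -> (revision_number, fields)
        self.index = {}   # doc_id -> indexed value of "foo"

    def write(self, doc_id, rev, fields):
        self.docs[doc_id] = (rev, fields)

    def refresh_index(self):
        # Each node indexes only the documents it holds locally.
        self.index = {d: f.get("foo") for d, (_, f) in self.docs.items()}


def find(nodes, index_node, value, r):
    """Look up matches in one node's index, then read each match from r nodes."""
    matches = [d for d, v in nodes[index_node].index.items() if v == value]
    results = {}
    for doc_id in matches:
        copies = [n.docs[doc_id] for n in nodes[:r] if doc_id in n.docs]
        # The copy with the highest revision number wins.
        results[doc_id] = max(copies, key=lambda c: c[0])[1]
    return results


nodes = [Node(), Node(), Node()]
for n in nodes:
    n.write("X", 1, {"foo": "baz"})
    n.refresh_index()

# Update X with w=2: two replicas hold the new revision, the third lags.
for n in nodes[:2]:
    n.write("X", 2, {"foo": "bar"})
    n.refresh_index()

print(find(nodes, index_node=2, value="bar", r=2))  # {}
print(find(nodes, index_node=0, value="bar", r=2))  # {'X': {'foo': 'bar'}}
```

In this model, r=2 never helps when the query lands on the lagging replica, because X is filtered out by that replica's index before any document read happens.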

answered Jul 30 '20 at 21:58

Oliver Dain


I think a quorum for view queries is impractical, if not outright impossible in some cases, due to the eventual-consistency nature of CouchDB. To ensure a view quorum, a quorum of nodes would effectively have to "freeze the world", make sure they've received the same updates, then return a result. That would pretty well go against the intention of eventual consistency. So I expect there's an implicit quorum of 1 for every index lookup, and to match that, the document retrieval also uses r=1 by default. – Flimzy Jul 31 '20 at 15:51
... r=2 would thus presumably query a single index, but then verify the document from 2 nodes before returning it. (of course: only for queries that actually return documents) – Flimzy Jul 31 '20 at 15:53
@Flimzy, that makes sense. It does cause some difficulty for the programmer at times but it does make sense. – Oliver Dain Jul 31 '20 at 22:30
Is there a specific problem you're trying to solve? Or just asking out of curiosity? – Flimzy Aug 01 '20 at 12:42
@Flimzy there is a specific problem but explaining it would take a very long time. I think I've got what I need here. Thanks! – Oliver Dain Aug 01 '20 at 23:12
