couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Fabric worker timeouts and availability of replicas
Date Mon, 12 Oct 2015 16:15:11 GMT
I've had discussions about this in the past and there are a few
sticking points on it that aren't immediately obvious.

First, while the header approach is the most obvious, it misses APIs
like POST to _all_docs where we return multiple documents. Each
document returned could have a different read quorum which a header
most likely wouldn't be able to accurately reflect. The obvious next
approach is to add an underscore prefixed field to each document read
(which is actually a fairly simple patch) but that ends up breaking
replication with all old CouchDB nodes in odd ways (it'd only fail
documents that had an incomplete quorum read which is transient). It
suddenly occurs to me, though, that maybe we could condition the
inclusion of the field on the CouchDB user agent if we can coordinate
with PouchDB and any other replicator implementations.
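
To make the per-document idea concrete, here is a rough sketch in Python
(not fabric's Erlang, and not an existing patch). The field name
`_quorum_met`, the legacy User-Agent prefixes, and the function names are
all hypothetical; the only point is that each row carries its own quorum
result and that the field is withheld from clients assumed not to
tolerate it.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical field name; nothing in CouchDB defines it.
QUORUM_FIELD = "_quorum_met"

# Hypothetical User-Agent prefixes for replicators assumed to reject
# unknown underscore-prefixed fields.
LEGACY_AGENT_PREFIXES = ("CouchDB-Replicator/1.", "CouchDB/1.")


def is_legacy_replicator(user_agent: str) -> bool:
    """Return True if the client is assumed not to understand the field."""
    return user_agent.startswith(LEGACY_AGENT_PREFIXES)


def annotate_rows(
    rows: List[Tuple[Dict[str, Any], int]], r: int, user_agent: str
) -> List[Dict[str, Any]]:
    """Mark each document whose read gathered fewer than r replies.

    rows is a list of (doc, replies) pairs, where replies is the number
    of replica responses collected for that document.
    """
    include = not is_legacy_replicator(user_agent)
    out = []
    for doc, replies in rows:
        doc = dict(doc)  # copy so the caller's document is not mutated
        if include and replies < r:
            doc[QUORUM_FIELD] = False
        out.append(doc)
    return out
```

With r=2 and rows [({"_id": "a"}, 2), ({"_id": "b"}, 1)], only "b" would
be marked, which is exactly the case a single response header can't
describe.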

Secondly, the different status codes aren't entirely correct. 201/202
are obviously wrong as they're about entity creation, not read. 203
Non-Authoritative is wrong as the definition says that it reflects
entity header information. 204 No Content is obviously wrong. 205
Reset Content is wrong and stipulates that no body should be present.
206 Partial Content is also wrong as that's for range requests. And
that's all of the 2xx response codes...

My favorite is probably 203 for this as it's only a slight bending of
the definition though it does get us into the same mixed response
situation with _all_docs keys and so on.
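
As a sketch of what that would mean for a multi-document read, assuming
(my assumption, nothing decided in this thread) that any single document
below quorum downgrades the whole response, and using a hypothetical
helper name:

```python
from typing import Iterable


def read_status(replies_per_doc: Iterable[int], r: int) -> int:
    """Return 200 if every document gathered at least r replies, else 203.

    This is one possible policy only: a single status code has to
    summarise every document in the response.
    """
    return 200 if all(n >= r for n in replies_per_doc) else 203
```

Under this policy read_status([2, 1], 2) gives 203 even though the first
document met its quorum, so the client still can't tell which document
fell short.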


On Mon, Oct 12, 2015 at 6:06 AM, Michael Rhodes
<mrhodes@linux.vnet.ibm.com> wrote:
> Agreed we should respond with the doc if we got at least one copy.
>
> I'd also be in favour of a response header which indicates whether we met the
> requested read quorum. This would mirror the approach to writes, where there
> is currently the separate 201/202 response code based on quorum success.
> This allows for a bit more flexibility client-side w.r.t. availability
> considerations.
>
> I'm not sure what the best info to supply in the proposed header would be:
> would a simple true/false do, or would more information, such as the number
> of nodes that responded and the quorum, be useful?
>
> Mike.
>
>
> On 07/10/2015 21:35, Robert Newson wrote:
>>
>> Yes, I think it should. We should return the best answer we can.
>>
>>> On 7 Oct 2015, at 13:48, Robert Kowalski <rok@kowalski.gd> wrote:
>>>
>>> Hi,
>>>
>>> I am currently taking a look at fabric and rexi.
>>>
>>> Given I open a doc, a CouchDB cluster returns the document.
>>>
>>> It also returns a doc, given not all replicas (r) are available and the
>>> *cluster is aware of it*: if the co-ordinator knows that there are fewer
>>> than r replicas available, it returns the document with a 200.
>>>
>>>
>>> When a worker is not available *right now*, and the call to one of them
>>> just times out (so the cluster is not aware that one node is
>>> unavailable),
>>> the cluster will return a general timeout error instead of a result [1],
>>> even if just one of the workers fails.
>>>
>>> Should the cluster return a result instead in those cases?
>>>
>>>
>>> [1]
>>>
>>> https://github.com/apache/couchdb-fabric/blob/405922c5dff36e0f5822e9a3422243f217d8d0e4/src/fabric_doc_open.erl#L61
>
>
