couchdb-dev mailing list archives

From Robert Samuel Newson <rnew...@apache.org>
Subject Re: Fabric worker timeouts and availability of replicas
Date Mon, 12 Oct 2015 23:10:44 GMT
The 203 (Non-Authoritative Information) status code indicates that
   the request was successful but the enclosed payload has been modified
   from that of the origin server's 200 (OK) response by a transforming
   proxy (Section 5.7.2 of [RFC7230]).

So I don’t think we can send a 203.

We could maybe use a response header if we list all the quorum counts, but then we’ll hit
header length issues for large bulk_docs posts, though it might be sufficient to indicate
that at least one of the responses did not meet quorum?
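
A minimal sketch of the single-flag idea, assuming the coordinator knows how many replies each document in the batch received. The header name `X-Couch-Replies-Sufficient` and the function name are hypothetical, not part of any CouchDB API:

```python
def summary_header(reply_counts, required):
    """Collapse per-document reply counts into one fixed-size header.

    reply_counts: number of replica replies received for each document.
    required: number of replies requested (r or w).
    The header name below is hypothetical, for illustration only.
    """
    # One boolean keeps the header the same size however large the batch is.
    all_met = all(count >= required for count in reply_counts)
    return {"X-Couch-Replies-Sufficient": "true" if all_met else "false"}
```

The trade-off is that the client learns only that something in the batch fell short, not which document did.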

We should also stop using the word quorum; it implies properties we don’t have. Quorum should
be reserved for systems exhibiting strong consistency properties.

B.

> On 12 Oct 2015, at 17:15, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> 
> I've had discussions about this in the past and there are a few
> sticking points on it that aren't immediately obvious.
> 
> First, while the header approach is the most obvious, it misses APIs
> like POST to _all_docs where we return multiple documents. Each
> document returned could have a different read quorum which a header
> most likely wouldn't be able to accurately reflect. The obvious next
> approach is to add an underscore prefixed field to each document read
> (which is actually a fairly simple patch) but that ends up breaking
> replication with all old CouchDB nodes in odd ways (it'd only fail
> documents that had an incomplete quorum read which is transient). It
> suddenly occurs to me that maybe we could condition the inclusion of
> the field on the CouchDB user agent though if we can coordinate with
> PouchDB and any other replicator implementations.
> 
> Secondly, the different status codes aren't entirely correct. 201/202
> are obviously wrong as they're about entity creation, not read. 203
> Non-Authoritative is wrong as the definition says that it reflects
> entity header information. 204 No-Content is obviously wrong. 205
> Reset Content is wrong and stipulates that no body should be present.
> 206 Partial Content is also wrong as that's for range requests. And
> that's all of the 200 response codes...
> 
> My favorite is probably 203 for this as it's only a slight bending of
> the definition though it does get us into the same mixed response
> situation with _all_docs keys and so on.
> 
> 
> On Mon, Oct 12, 2015 at 6:06 AM, Michael Rhodes
> <mrhodes@linux.vnet.ibm.com> wrote:
>> Agreed we should respond with the doc if we got at least one copy.
>> 
>> I'd also be in favour of a response header which indicates whether we met the
>> requested read quorum. This would mirror the approach to writes, where there
>> is currently the separate 201/202 response code based on quorum success.
>> This allows for a bit more flexibility client-side w.r.t. availability
>> considerations.
>> 
>> I'm not sure what the best info to supply in the proposed header is: would
>> a simple true/false do, or would more information on the number of nodes
>> that responded and the quorum be useful?
>> 
>> Mike.
>> 
>> 
>> On 07/10/2015 21:35, Robert Newson wrote:
>>> 
>>> Yes, I think it should. We should return the best answer we can.
>>> 
>>>> On 7 Oct 2015, at 13:48, Robert Kowalski <rok@kowalski.gd> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I am currently taking a look at fabric and rexi.
>>>> 
>>>> Given I open a doc, a CouchDB cluster returns the document.
>>>> 
>>>> It also returns a doc, given not all replicas (r) are available and the
>>>> *cluster is aware of it*: if the co-ordinator knows that there are fewer
>>>> than r replicas available, it returns the document with a 200.
>>>> 
>>>> 
>>>> When a worker is not available *right now*, and the call to one of them
>>>> just times out (so the cluster is not aware that one node is
>>>> unavailable),
>>>> the cluster will return a general timeout error instead of a result [1],
>>>> even if just one of the workers fails.
>>>> 
>>>> Should the cluster return a result instead in those cases?
>>>> 
>>>> 
>>>> [1]
>>>> 
>>>> https://github.com/apache/couchdb-fabric/blob/405922c5dff36e0f5822e9a3422243f217d8d0e4/src/fabric_doc_open.erl#L61
>> 
>> 
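
The behaviour proposed in this thread (return the best answer available and report separately whether r was met) can be sketched as follows. This is a simplified illustration, not fabric's actual logic: the function names are hypothetical, a timed-out worker is modelled as `None`, and the "best" reply is picked by revision position and then hash, ignoring deleted revisions.

```python
TIMEOUT = None  # stand-in for a worker that did not answer in time


def rev_key(doc):
    """Order replies by revision position, then by revision hash."""
    pos, _, digest = doc["_rev"].partition("-")
    return (int(pos), digest)


def open_doc(replies, r):
    """Return (best_doc, r_met) from whatever workers answered.

    Fails only when no worker replied at all, so a single timed-out
    worker does not turn the whole request into a timeout error.
    """
    answered = [doc for doc in replies if doc is not TIMEOUT]
    if not answered:
        raise TimeoutError("no worker replied")
    best = max(answered, key=rev_key)
    # Count how many replies agree with the revision being returned.
    agreeing = sum(1 for doc in answered if doc["_rev"] == best["_rev"])
    return best, agreeing >= r
```

For example, `open_doc([{"_id": "a", "_rev": "2-abc"}, TIMEOUT, {"_id": "a", "_rev": "1-xyz"}], 2)` returns the `2-abc` document together with `False`, which a header or similar mechanism could then surface to the client.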

