couchdb-dev mailing list archives

From San Sato <sans...@inator.biz>
Subject Re: Partition query endpoints in CouchDB 4.0
Date Wed, 13 May 2020 18:44:52 GMT
I understand that range-based access to FDB-based views is dispatched
efficiently to the node(s) holding the key range, causing zero load on nodes
that do not hold it.  (Tangential question: are the data expected to be
co-located with the index, or vice versa? Or would access patterns tend to
spray across multiple servers, since the documents themselves are reached
indirectly through the id references in the index?  Very possibly I
misunderstand the index-to-data storage architecture in FDB-based CouchDB;
corrections/clarifications gratefully welcomed.)

Segmented distribution of reactive data-processing is a valuable use-case,
whether using partitioned _changes or filtered _changes or some other way,
so long as solution architects can count on efficiency and resulting
scalability.  Reactive data-processing agents would then be able to request
a specific feed covering a specific set of one or more
partitions/shards/slices/similar, with a horde of such agents covering the
full set of slices. It is not clear that FDB-based CouchDB could give the
same assurance for that result as it does for range-based index access.
I imagined using an integer slicing key and a sort of modulus/ring-hash
scheme, or similar mechanism resulting in redundancy/failover+scaling.
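
To make the slicing idea concrete, here is a minimal sketch of an integer slicing key with a modulus/ring assignment of agents to slices. It is only an illustration of the scheme described above, not anything CouchDB provides: the names slice_for, agents_for_slice and slices_for_agent, the slice count, and the replica count are all assumptions made for this example.

```python
import zlib
from typing import List

NUM_SLICES = 16  # assumed number of slices; pick to suit the deployment
REPLICAS = 2     # assumed number of agents covering each slice (failover)


def slice_for(doc_id: str, num_slices: int = NUM_SLICES) -> int:
    """Map a document id to an integer slicing key using a stable hash."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_slices


def agents_for_slice(slice_key: int, agents: List[str],
                     replicas: int = REPLICAS) -> List[str]:
    """Pick consecutive agents on a ring, starting at slice_key mod len(agents)."""
    if not agents:
        return []
    count = min(replicas, len(agents))
    start = slice_key % len(agents)
    return [agents[(start + i) % len(agents)] for i in range(count)]


def slices_for_agent(agent: str, agents: List[str],
                     num_slices: int = NUM_SLICES,
                     replicas: int = REPLICAS) -> List[int]:
    """List every slice this agent is responsible for (primary or backup)."""
    return [s for s in range(num_slices)
            if agent in agents_for_slice(s, agents, replicas)]
```

With three agents and two replicas, each slice is covered by two distinct agents, so losing one agent leaves every slice covered. The sketch says nothing about whether the server can serve a per-slice feed cheaply, which is the open question here.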

I would like erl nodes serving a filtered _changes stream to examine (and
discard) no rows when the filter is index-based and no changes are being
made to the rows matching the filter.  If that outcome only holds for a
special partition-ish class of index, it would still seem valuable for
segmenting the load, both at the functional level of "what data does a
reactive agent see?" and at the level of "what computational cost does the
couchdb server incur?"  It sounds as though the generalized version of that
result may be just as feasible as the specialised case, given some possible
constraints on index setup / change-feed patterns.

The gravy on this path would be for _changes feeds to emit
"no-longer-matches" on data that formerly matched a filter (indexed-only?)
but no longer does.  I know this wish wouldn't surprise anyone, and I
understand there's probably a habit of thinking of that result as
out-of-scope; therefore, I wanted to bring it up as a practically valuable
design consideration, in case its feasibility is better with FDB.
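
For clarity about the semantics I mean, here is a rough client-side sketch of "no-longer-matches". It assumes change rows shaped like the _changes output with include_docs=true (an "id", an optional "deleted" flag, and a "doc"); annotate_changes and the event labels are hypothetical names for this example, and the filter is an ordinary predicate rather than an index. Note that this version examines every row, which is exactly the cost I would hope the server could avoid.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, Optional, Set, Tuple

Change = Dict[str, Any]


def annotate_changes(
    changes: Iterable[Change],
    matches: Callable[[Dict[str, Any]], bool],
    matched_ids: Optional[Set[str]] = None,
) -> Iterator[Tuple[str, str]]:
    """Yield ("matches", id) or ("no-longer-matches", id) for each relevant row.

    matched_ids remembers which documents matched the filter previously;
    rows for documents that never matched are silently discarded.
    """
    matched = set() if matched_ids is None else matched_ids
    for change in changes:
        doc_id = change["id"]
        doc = change.get("doc")
        deleted = change.get("deleted", False)
        now_matches = (not deleted) and doc is not None and matches(doc)
        if now_matches:
            matched.add(doc_id)
            yield ("matches", doc_id)
        elif doc_id in matched:
            # The document matched before and has now dropped out of the filter.
            matched.discard(doc_id)
            yield ("no-longer-matches", doc_id)
```

The state kept here (the set of previously matching ids) is what an index on the filter would already hold, which is why I wonder whether this becomes more feasible with FDB.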

 <3 for couchdb  ❤️

Thank you to all.




On Mon, Apr 20, 2020 at 2:05 PM Robert Samuel Newson <rnewson@apache.org>
wrote:

> Hi All,
>
> I'd like to get views on whether we should preserve the _partition
> endpoints in CouchDB 4.0 or remove them. In CouchDB 4.0 all _view and _find
> queries will automatically benefit from the same performance boost that the
> "partitioned database" feature brings, by virtue of FoundationDB.
>
> If we're preserving it, are we also deprecating it (so it's gone in 5.0)?
>
> If we're ditching it, what will the endpoint return instead (404 Not
> Found, 410 Gone?)
>
> Thoughts welcome.
>
> B.
