incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: multiple range queries via POST?
Date Tue, 27 Oct 2009 15:22:35 GMT
Alex,

Views are streamed from the database with no buffering in RAM, to the
point that every operation must preserve those semantics.

There's an old ticket on striped queries at [1] that's quite similar to
what you want. It had a patch that probably won't apply to trunk at
all, but the idea is there. In general, though, I'm not convinced it's
necessary. Assuming your HTTP client supports persistent connections, I
don't see this having a huge effect on speed in normal use. Perhaps if
you lean on the feature heavily, but in the general scheme of things
there's not much you could implement that would give you a boost.
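
For illustration, here is a minimal sketch of the "one request per range"
approach for the [owner, createDate] view quoted below. The view path is
hypothetical, and the [owner, {}] end key is an assumption on my part (it
relies on objects collating after strings in view keys) rather than the
['foo', 'a'] bound used in the original question. startkey and endkey are
sent as JSON-encoded query parameters.

```javascript
// Hypothetical view path; substitute your own database and design document.
const VIEW_PATH = "/mydb/_design/messages/_view/by_owner_date";

// Build the URL for one owner's range on a view keyed by [owner, createDate].
// The startkey/endkey values are JSON, so they are stringified and escaped.
function ownerRangeUrl(owner) {
  const startkey = JSON.stringify([owner]);
  // Assumption: {} collates after any string, so [owner, {}] closes the range.
  const endkey = JSON.stringify([owner, {}]);
  return (
    VIEW_PATH +
    "?startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey)
  );
}

// One URL per owner; a keep-alive HTTP client can send these back to back
// on the same connection.
function ownerRangeUrls(owners) {
  return owners.map(ownerRangeUrl);
}
```

Calling ownerRangeUrls(["foo", "bar"]) gives two URLs, one per owner, which
the client would fetch in turn and concatenate itself.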

Oh... Actually, one thing you could convince me on is single-reader
snapshots of the view, which you can't get with separate requests.
Either way, the implementation wouldn't be *too* difficult. It should be
able to sit almost 100% in couch_httpd_view.erl and could pull a lot
from output_map_view and output_reduce_view.

Also, I wouldn't attempt to resolve overlaps. I think it'd be more
confusing to have merged result sets than to have a single result
for each sub-request. I.e., the return would just be an array of view
outputs. Or something.
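
As a rough sketch of that "array of view outputs" shape, the following runs
each range independently over already-sorted rows and returns one output per
range with no merging. Everything here is illustrative: multiRangeQuery and
compareKeys are made-up names, the rows live in memory, and the comparison is
a simplification that only handles arrays of strings, not CouchDB's real
collation.

```javascript
// Simplified key comparison: arrays of strings, compared element by element,
// with a shorter array sorting before any longer array it is a prefix of.
// This is NOT CouchDB's full collation; it is enough for the illustration.
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Run each range against the sorted rows independently and return one
// view-style output per range, in request order. Overlapping ranges simply
// repeat rows; nothing is merged or de-duplicated.
function multiRangeQuery(sortedRows, ranges) {
  return ranges.map(({ startkey, endkey }) => ({
    rows: sortedRows.filter(
      (row) =>
        compareKeys(row.key, startkey) >= 0 &&
        compareKeys(row.key, endkey) <= 0
    ),
  }));
}
```

With ranges for "foo" and "bar", the result is a two-element array whose
entries line up with the requested ranges, so the caller never has to work
out which rows came from which range.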

HTH,
Paul Davis

[1] http://issues.apache.org/jira/browse/COUCHDB-244


On Tue, Oct 27, 2009 at 11:01 AM, Alex P <apedenko@kolosy.com> wrote:
> good to hear. re: http call - well it should be consistent with the keys
> call, shouldn't it? so a POST with startKey[s] and an endKey[s] arguments?
>
> out of curiosity - when a view subset is returned, is it 'streamed' out? or
> is the entire dataset prefetched and then returned?
>
> i could see overlapping ranges being simple to solve mathematically, but
> posing either seek or memory issues (reading the two ranges concurrently vs.
> pre-fetching both and doing a merge)
>
> On Tue, Oct 27, 2009 at 9:55 AM, Adam Kocoloski <kocolosk@apache.org> wrote:
>
>> On Oct 27, 2009, at 10:50 AM, Alex P wrote:
>>
>>> i know this is currently unsupported (and may be more of a question for
>>> the dev list), but is there a technical reason why multi-range queries
>>> can't be submitted to couch (slight ah-hah moment at the end)?
>>>
>>> the specific problem i'm trying to address is this:
>>>
>>> suppose i have a message document, and a corresponding map function:
>>>
>>> function (doc) {
>>>  if (doc.docType != 'message') return;
>>>
>>>  emit(doc.owner, null);
>>> }
>>>
>>> if i wanted to pull back all messages for users foo and bar, i'd simply do
>>> a POST path/to/couch keys = ['foo', 'bar']. now let's make this data come
>>> back sorted by create date:
>>>
>>> function (doc) {
>>>  if (doc.docType != 'message') return;
>>>
>>>  emit([doc.owner, doc.createDate], null);
>>> }
>>>
>>> also cool, but now, to retrieve all messages pertaining to a single user,
>>> i need to do GET path/to/couch startKey=['foo']&endKey=['foo', 'a']. this
>>> works, but it now means that if i want all messages pertaining to both foo
>>> and bar, i need to run two separate queries.
>>>
>>> as i'm writing this, i think i'm starting to see that the problem would be
>>> with having to merge overlapping ranges, but i still would like someone
>>> else to weigh in on this
>>>
>>>
>>> thanks,
>>> alex.
>>>
>>
>> Hi Alex, internally, multiple keys are actually just a special case of
>> multiple ranges.  So that part is easy.  We would want to be clear about how
>> we handle overlapping ranges, but it's not that hard of a problem really.
>>
>> I wonder what the HTTP call for this should look like?
>>
>> Adam
>>
>>
>
