incubator-couchdb-user mailing list archives

From hhsuper <hhsu...@gmail.com>
Subject Re: 'Grouping' documents so that a set of documents is passed to the view function
Date Fri, 26 Jun 2009 05:32:44 GMT
By the way, Brian, I see many people say that with CouchDB you can download
the data from a view and then do the sorting/filtering/pagination in the
client. Yes, you can do that, but I think we absolutely need these features
at the CouchDB level, just as an RDBMS supports them. For example, when I have
a view with a million records I absolutely need paging/sorting at the CouchDB
level. The sorting also needs to be supported on any column of the returned
keys and returned values, but right now that is difficult to implement with
CouchDB, isn't it?
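
For reference, view rows come back sorted by key, and the view query
parameters limit, skip, startkey, startkey_docid and descending allow paging
on the server side. Sorting on a different column generally needs a separate
view that emits that column as its key. The sketch below only builds the query
URL and splits a fetched page; the database, design document and view names in
the usage example are hypothetical, and the helper names viewPageUrl and
splitPage are made up for illustration.

```javascript
// Build the URL for one page of a CouchDB view so the server does the paging.
function viewPageUrl(base, opts) {
  var params = [];
  // Ask for one extra row; its key becomes the start of the next page.
  params.push("limit=" + (opts.pageSize + 1));
  if (opts.startkey !== undefined) {
    // View keys are JSON values: JSON-encode first, then URL-encode.
    params.push("startkey=" + encodeURIComponent(JSON.stringify(opts.startkey)));
  }
  if (opts.startkeyDocid !== undefined) {
    // Disambiguates rows that share the same key.
    params.push("startkey_docid=" + encodeURIComponent(opts.startkeyDocid));
  }
  if (opts.descending) {
    params.push("descending=true");
  }
  return base + "?" + params.join("&");
}

// Split the fetched rows into the rows to display and a cursor for the next page.
function splitPage(rows, pageSize) {
  var shown = rows.slice(0, pageSize);
  var extra = rows[pageSize]; // undefined when this is the last page
  var next = extra ? { startkey: extra.key, startkeyDocid: extra.id } : null;
  return { shown: shown, next: next };
}

// Usage with hypothetical names: first page of 10 rows, highest keys first.
var firstPageUrl = viewPageUrl(
  "http://localhost:5984/scores/_design/board/_view/by_score",
  { pageSize: 10, descending: true }
);
```

Paging by startkey rather than by a large skip value is the design choice
here, since skip still has to walk past the skipped rows.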

On Fri, Jun 26, 2009 at 10:39 AM, hhsuper <hhsuper@gmail.com> wrote:

> Thanks again, Brian. I totally understand your description of the realistic
> map/reduce scenario, and it is exactly what worries me about my view: my
> reduce function is non-linear. I actually need the logic in the reduce part,
> and that re-reduce code is obviously wrong. But when a realistic reduce
> occurs, like you say, "it's quite possible given N documents that couchdb
> will reduce the first N-1, then reduce the last 1", and then my logic isn't run.
>
> It seems difficult to execute this logic in reduce, unless I return a large
> object (which I don't like) so that I can execute the same logic in the
> re-reduce part. I could return an additional value holding the
> {'dialogid':bestScore,....} hash from the reduce function, and that would
> make sure I can execute the same logic in the re-reduce part, but as users
> study more and more dialogs, the reduce value gets larger and larger.
>
> Should I conclude that logic like this isn't suitable to implement
> in a CouchDB view?
> Also, as you say, I could download all the data to the client to calculate,
> but this is very costly and has scalability problems.
>
> > I don't really understand why you need a subquery in rdbms. I would just
> > select all results where uid=x, and process them as required (for example:
> > build a hash of dialogid=>bestScore and update it from each received row)
>
> Oh, maybe I didn't describe it clearly. With a subquery I can use just one
> SQL statement to get all of the user's results (I implemented a scoreboard)
> without any other program code, and within the query I can implement
> pagination (physical paging) and sorting.
>
> On Thu, Jun 25, 2009 at 5:08 PM, Brian Candler <B.Candler@pobox.com> wrote:
>
>> On Thu, Jun 25, 2009 at 09:34:31AM +0100, Brian Candler wrote:
>> > Perhaps it will help you to understand this if you consider the limiting
>> > case where exactly one document is fed into the 'reduce' function at a time,
>> > and then the outputs of the reduce functions are combined with a large
>> > re-reduce phase.
>>
>> Incidentally, this is a partly realistic scenario. It's quite possible given
>> N documents that couchdb will reduce the first N-1, then reduce the last 1,
>> then re-reduce those two values. This might be because of how the documents
>> are split between Btree nodes, or there may be a limit on the number of
>> documents passed to the reduce function in one go. This is entirely an
>> implementation issue which you have no control over, so you must write your
>> reduce/rereduce to give the same answer for *any* partitioning of documents.
>>
>> More info at http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
>>
>> "To make incremental Map/Reduce possible, the Reduce function has the
>> requirement that not only must it be referentially transparent, but it must
>> also be commutative and associative for the array value input, to be able
>> reduce on its own output and get the same answer, like this:
>>
>> f(Key, Values) == f(Key, [ f(Key, Values) ] )"
>>
>> Now, at first glance your re-reduce function appears to satisfy that
>> condition, so perhaps there should be another one: namely, that for any
>> partitioning of Values into subsets Values1, Values2, ... then
>>
>>  f(Key, Values) == f(Key, [ f(Key,Values1), f(Key,Values2), ... ] )
>>
>> But I am not a mathematician so I'm not sure if this condition is actually
>> stronger.
>>
>> Regards,
>>
>> Brian.
>>
>
>
>
> --
> Yours sincerely
>
> Jack Su
>
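
To make the partitioning condition discussed above concrete, here is a rough
sketch of a reduce function that returns the {'dialogid':bestScore,....} hash
and merges those hashes on rereduce. It assumes the map function emits values
shaped like { dialogid: ..., score: ... }; those field names and the function
name reduceBest are hypothetical, not taken from the actual view. The check at
the end covers only the single split described in the thread (the first N-1
rows, then the last 1), so it illustrates the condition rather than proving it.

```javascript
// CouchDB-style reduce signature: (keys, values, rereduce).
function reduceBest(keys, values, rereduce) {
  var best = {};
  for (var i = 0; i < values.length; i++) {
    var partial;
    if (rereduce) {
      // Each value is already a { dialogid: bestScore } hash from an earlier reduce.
      partial = values[i];
    } else {
      // Wrap one emitted row (assumed shape) in the same hash form.
      partial = {};
      partial[values[i].dialogid] = values[i].score;
    }
    // Keep the highest score seen so far for every dialog.
    for (var id in partial) {
      if (!(id in best) || partial[id] > best[id]) {
        best[id] = partial[id];
      }
    }
  }
  return best;
}

// Made-up sample rows for one user.
var rows = [
  { dialogid: "d1", score: 70 },
  { dialogid: "d2", score: 55 },
  { dialogid: "d1", score: 90 }
];

// Reduce everything in one go.
var allAtOnce = reduceBest(null, rows, false);

// Reduce the first N-1 rows, then the last 1, then re-reduce the two results.
var firstPart = reduceBest(null, rows.slice(0, 2), false);
var lastPart = reduceBest(null, rows.slice(2), false);
var combined = reduceBest(null, [firstPart, lastPart], true);
```

Because taking a per-dialog maximum does not depend on how the rows are
grouped, both paths give the same hash. The cost is the one already raised in
the thread: the reduce value grows with the number of dialogs a user has
studied, and any non-linear step on top of the hash would have to run where
the complete hash is available.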



-- 
Yours sincerely

Jack Su
