incubator-couchdb-user mailing list archives

From Zachary Zolton <zachary.zol...@gmail.com>
Subject Re: Select row with a count of 1 after reduce
Date Tue, 15 Jun 2010 14:46:29 GMT
Sorry, Darran, I missed that you were talking about doing the
filtering via a _list function; that would work. The only caveat with
_list function filtering is that the view server still has to read all
those k-v pairs from the view (and often do a bit of re-reducing as
well), so you pay a time cost for all that omitted data.
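
A minimal sketch of what such a _list function might look like, assuming the view is queried with the count reduce and group_level=1, so that each row has key [jobId] and the count as its value. The name listUnassignedJobs is hypothetical; getRow and send are the helpers the CouchDB query server provides to list functions:

```javascript
// Sketch of a CouchDB _list function. `getRow` and `send` are supplied by
// the CouchDB query server; they are not defined in this snippet.
// The name `listUnassignedJobs` is hypothetical.
function listUnassignedJobs(head, req) {
  var row;
  var first = true;
  send('[');
  while ((row = getRow())) {
    // Assumed row shape (reduced view, group_level=1):
    // key = [jobId], value = number of rows emitted for that job.
    if (row.value === 1) {
      // A count of 1 means only the Job's own row exists, so no JobTasks.
      send((first ? '' : ',') + JSON.stringify(row.key[0]));
      first = false;
    }
  }
  send(']');
}
```

Every reduced row still passes through getRow(), including the ones that are skipped, which is where the time cost comes from.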

Zach

On Tue, Jun 15, 2010 at 2:03 AM, Darran White <darran.m.white@gmail.com> wrote:
> Thanks for your comments, Zachary. So are you saying you can't use a
> list function after a reduce? I guess I could construct the view
> without a reduce, then use the list function to only return the first
> row of each set of distinct ids, as this would be the latest entry.
> Though I guess this could become inefficient if the view gets large.
>
> Initially I did write it so a jobtask was embedded, except a job can
> have many job tasks, each of which holds a state (active, wip,
> cancelled, etc.). A job task is assigned to one person, who just
> updates that particular job task, so there should be no conflicts. If
> it were contained within the job, then conflicts could occur, as
> multiple users would have to update the job.
> I did consider having a bidirectional relationship between the job and
> jobtask; the only problem is the atomic nature of this. If the update
> that adds a new jobtask id to a job fails after the jobtask is created,
> they will be out of sync, which is why I was trying to avoid doing
> this. I guess I could have an external process to clear this up, but it
> doesn't seem ideal.
>
> Darran
>
> Sent from my iPhone
>
> On 15 Jun 2010, at 02:13, Zachary Zolton <zachary.zolton@gmail.com> wrote:
>
>> Can a JobTask belong to more than one Job? If not, I'd just nest them
>> within the Job document, which would make identifying Jobs that don't
>> have any JobTasks easier. Alternatively, you could consider storing an
>> array of JobTask IDs in the Job document.
>>
>> FYI, you cannot sort/filter reduce values.
>>
>> On Mon, Jun 14, 2010 at 5:53 PM, Darran White <darran.m.white@gmail.com> wrote:
>>> Hi,
>>> I have the following view:-
>>> function(doc) {
>>>     if (doc.doctype == 'job') {
>>>         emit([doc._id, doc.createdDate, 'UNASSIGNED', 'CREATED'], doc._id);
>>>     } else if (doc.doctype == 'jobTask') {
>>>         emit([doc.jobId, doc.taskLatestDate, 'ASSIGNED', doc.taskName], doc.jobId);
>>>     }
>>> }
>>>
>>> So I have a Job and a JobTask; the JobTask holds a reference to the
>>> Job in a jobId field.
>>> What I would like to do is select all the jobs which don't have any
>>> associated JobTasks.
>>> I was thinking of using a simple reduce function :-
>>>
>>> function(keys, values, rereduce) {
>>>  return values.length;
>>> }
>>>
>>> So with a data set:
>>> ["0e482a01111e0c0932112f73c8001f68", "13-06-2010 20:42:22:613", "ASSIGNED", "ACTIVE"]
>>> ["0e482a01111e0c0932112f73c8001f68", "13-06-2010 20:40:13:423", "UNASSIGNED", "CREATED"]
>>> ["0e482a01111e0c0932112f73c80011ca", "13-06-2010 15:36:13:371", "UNASSIGNED", "CREATED"]
>>> ["0e482a01111e0c0932112f73c800116a", "13-06-2010 19:20:48:541", "ASSIGNED", "ACTIVE"]
>>> ["0e482a01111e0c0932112f73c800116a", "13-06-2010 15:35:59:912", "UNASSIGNED", "CREATED"]
>>> ["0e482a01111e0c0932112f73c80005a5", "13-06-2010 19:20:48:507", "ASSIGNED", "WIP"]
>>> ["0e482a01111e0c0932112f73c80005a5", "13-06-2010 15:33:02:947", "UNASSIGNED", "CREATED"]
>>>
>>> It would be reduced to:
>>> ["0e482a01111e0c0932112f73c8001f68"]    2
>>> ["0e482a01111e0c0932112f73c80011ca"]    1
>>> ["0e482a01111e0c0932112f73c800116a"]    2
>>> ["0e482a01111e0c0932112f73c80005a5"]    2
>>>
>>> I would then like to select all those rows which have a count of 1
>>> and load the associated documents. However, I'm not sure how to
>>> select only those which have a value of 1, or whether this is possible.
>>> I was thinking of maybe using a list function to only return those
>>> ids where the value is 1. Would this be the right way to approach this?
>>>
>>> Another idea would be to get the latest date so
>>>
>>> ["0e482a01111e0c0932112f73c8001f68", "13-06-2010 20:42:22:613", "ASSIGNED", "ACTIVE"]
>>> ["0e482a01111e0c0932112f73c8001f68", "13-06-2010 20:40:13:423", "UNASSIGNED", "CREATED"]
>>> ["0e482a01111e0c0932112f73c80005a5", "13-06-2010 19:20:48:507", "ASSIGNED", "WIP"]
>>> ["0e482a01111e0c0932112f73c80005a5", "13-06-2010 15:33:02:947", "UNASSIGNED", "CREATED"]
>>> ["0e482a01111e0c0932112f73c80011ca", "13-06-2010 15:36:13:371", "UNASSIGNED", "CREATED"]
>>>
>>> Would return
>>> ["0e482a01111e0c0932112f73c8001f68", "13-06-2010 20:42:22:613", "ASSIGNED", "ACTIVE"]
>>> ["0e482a01111e0c0932112f73c80005a5", "13-06-2010 19:20:48:507", "ASSIGNED", "WIP"]
>>> ["0e482a01111e0c0932112f73c80011ca", "13-06-2010 15:36:13:371", "UNASSIGNED", "CREATED"]
>>>
>>> I could then filter on "CREATED".
>>> I'm aware of the limit parameter, but I need to get the latest entry
>>> for all documents. Does anyone have an idea on how to do this?
>>>
>>> thanks
>>>
>>> Darran
>>>
>
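
One note on the count reduce quoted above: values.length is only the right answer on the first pass. When CouchDB re-reduces, values holds partial counts from earlier reduce calls rather than emitted rows, so those need to be added up instead. A minimal rereduce-safe sketch (the name countRows is hypothetical; in a design document the function would normally be anonymous, and the built-in _count reduce should give the same result):

```javascript
// Rereduce-safe row count. `countRows` is a hypothetical name used here
// only so the function can be called outside CouchDB.
function countRows(keys, values, rereduce) {
  if (rereduce) {
    // values are partial counts from earlier reduce calls: add them up.
    var total = 0;
    for (var i = 0; i < values.length; i++) {
      total += values[i];
    }
    return total;
  }
  // First pass: values holds one entry per emitted row.
  return values.length;
}
```

With this in place, a job that has no JobTasks still reduces to 1 even when its rows are combined across several reduce passes.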
