From: Luis Miguel Silva
Date: Wed, 6 Apr 2011 01:06:10 -0600
Subject: Re: Update conflicts?
To: user@couchdb.apache.org

But is there no way to do it server side? :o)
That would be SOOOO much better as i want to maintain a "single view of
the database" (so that everybody querying the same view gets the same
results).

Plus, your approach doesn't allow me to specify my own attribute names
(does it??):
i.e.
emit(doc._id, {
    node: doc._id,
    STATE: doc.secondary_state,
    OS: doc.oslist,
    ALIAS: doc.alias,
    FEATURE: doc.vlans,
    "GMETRIC[numvms]": doc.numvms,
    NETADDR: doc.netaddress,
    VARATTR: { "HVTYPE": doc.hvtype },
    VARIABLE: doc.variables,
    OSLIST: doc.oslist,
    VMOSLIST: doc.vmoslist
});

Like i mentioned in a previous email, that is a HUGE deal to us because
the stored attribute names themselves have no meaning to the consumers.
So that is why it is EXTREMELY important for us to shape the information
in a meaningful way on the server side!

p.s. thank you so much for your help.

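For readers following along: the client-side join suggested in this thread does not rule out custom attribute names. Here is a minimal sketch, assuming the "node_status" view is queried with include_docs=true so that each row carries its child doc. The function name mergeNodeRows and the output names cpuload, GMETRIC[disk] and GMETRIC[NETIO] are illustrative choices, not something the thread settled on.

```javascript
// Fold rows from a view keyed by [node, type, timestamp] into one object per
// node, renaming stored fields to the names the consumer expects.
// Assumption: each row has a `doc` property (view queried with include_docs=true).
function mergeNodeRows(rows) {
  var nodes = {};   // node name -> merged object
  var order = [];   // node names in first-seen order
  rows.forEach(function (row) {
    var doc = row.doc;
    if (!doc) { return; }               // row without a doc: nothing to merge
    var name = row.key[0];
    if (!nodes[name]) {
      nodes[name] = { node: name };
      order.push(name);
    }
    var out = nodes[name];
    // Rows are expected in ascending key order, so for each type a later
    // timestamp overwrites an earlier one and the newest sample wins.
    if (doc.type === 'cpu') {
      out.cpuload = doc.cpu;
      out.ccores = doc.ccores;
      out.acores = doc.acores;
      out.cmemory = doc.cmemory;
      out.amemory = doc.amemory;
    } else if (doc.type === 'disk') {
      out['GMETRIC[disk]'] = doc.disk;
    } else if (doc.type === 'netio') {
      out['GMETRIC[NETIO]'] = { 'in': doc['in'], out: doc.out };
    } else if (doc.type === 'generic') {
      out['GMETRIC[' + doc.name + ']'] = doc.value;
    }
  });
  return order.map(function (name) { return nodes[name]; });
}
```

If the merge has to happen on the server so that every consumer sees the same shape, CouchDB's list functions are one place a fold like this could run over the view rows. That wiring is not shown here and would need checking against the CouchDB version in use.
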
On Wed, Apr 6, 2011 at 12:56 AM, Anup Bishnoi wrote:
> you've already got the answer here
>
> On Wed, Apr 6, 2011 at 12:19 PM, Luis Miguel Silva wrote:
>>
>> Yeah but the above view generates different documents:
>> {"total_rows":4,"offset":0,"rows":[
>>
>> {"id":"92fe8c96f90e21d68a414bbd1700f3d7","key":["node01","cpu",1299794532000,0.94],"value":null},
>>
>> {"id":"92fe8c96f90e21d68a414bbd1700ffee","key":["node01","disk",1299794532000,null],"value":null},
>>
>> {"id":"92fe8c96f90e21d68a414bbd1701180e","key":["node01","generic",1299794532000,null],"value":null},
>>
>> {"id":"92fe8c96f90e21d68a414bbd170109ce","key":["node01","netio",1299794532000,null],"value":null}
>> ]}
>>
> i'm assuming you're making this view query with ajax and you get these
> results.
> now all you need to do is walk through these response items with client
> side js and build the one doc you need! all the pieces required to build
> the doc are already there with you in your client side js
> i'll be happy to keep answering, lets get this solved
>
>>
>> Any way i can return ONE single doc per full result?
>> i.e. something like:
>> {"total_rows":1,"offset":0,"rows":[
>>
>> {"id":"node01","key":"node01","value":{"node":"node01","STATE":"Unknown:sshd","ALIAS":"node01","FEATURE":"[vlan611]","GMETRIC[numvms]":13,"NETADDR":"10.40.130.146","VARATTR":{"HVTYPE":"esx"},"VARIABLE":[{"provision_status":2},{"another_variable":"something"}]}},
>> ]}
>>
>> (or, in other words, joining all the fields from the different
>> documents in one single doc)??
>>
>> On Wed, Apr 6, 2011 at 12:34 AM, Anup Bishnoi wrote:
>> > you could join the different pieces of information about the node
>> > (which you get by one query on the view suggested above) on the page
>> > itself with javascript, instead of asking couch for everything
>> > embedded in an html response
>> >
>> > On Wed, Apr 6, 2011 at 11:53 AM, Luis Miguel Silva wrote:
>> >>
>> >> Sorry if my last email was too big :o).
>> >>
>> >> Well, one reason i wanted to avoid doing that is because it didn't
>> >> seem as easy to maintain as my original approach but i'll discuss
>> >> your suggestion with my team to see what they have to say.
>> >> Also, i just couldn't get join to work :o\...
>> >>
>> >> How would you create a view that joins data from those different
>> >> types of documents to create a single complete view of a node?
>> >> I've read the documentation on view joins but simply could not get
>> >> it to work :o\...
>> >>
>> >> Thank you,
>> >> Luis
>> >>
>> >> On Tue, Apr 5, 2011 at 9:12 PM, Ryan Ramage wrote:
>> >> > Luis,
>> >> >
>> >> > That's a lot to take in, but a quick suggestion.
>> >> >
>> >> > Have a parent doc that looks like this:
>> >> > {
>> >> >    id: node1,
>> >> >    type: node,
>> >> >    location: blah
>> >> > }
>> >> >
>> >> > and some 'children' docs that look like this
>> >> >
>> >> > {
>> >> >    id: 3232323323223-32323232322-3232,
>> >> >    timestamp: 1299794532000,
>> >> >    type: cpu,
>> >> >    node: node1,
>> >> >    cpu: 0.94,
>> >> >    ccores: 4,
>> >> >    acores: 4,
>> >> >    cmemory: 4096,
>> >> >    amemory: 1024
>> >> > }
>> >> >
>> >> > and
>> >> > {
>> >> >    id: 3232323323223-32323232322-3232,
>> >> >    timestamp: 1299794532000,
>> >> >    type: disk,
>> >> >    node: node1,
>> >> >    disk: 100000
>> >> > }
>> >> > and
>> >> > {
>> >> >    id: 433432323323223-3232323322332,
>> >> >    timestamp: 1299794532000,
>> >> >    type: netio,
>> >> >    node: node1,
>> >> >    in: 100,
>> >> >    out: 200
>> >> > }
>> >> > and
>> >> > {
>> >> >    id: 323432423432534534-534534-543534534,
>> >> >    timestamp: 1299794532000,
>> >> >    type: generic,
>> >> >    node: node1,
>> >> >    name: "foo",
>> >> >    value: "bar"
>> >> > }
>> >> >
>> >> > create a status view
>> >> > "node_status" : function (doc) {
>> >> >     if (doc.type != 'node') {
>> >> >         emit([doc.node, doc.type, doc.timestamp], null);
>> >> >     }
>> >> > }
>> >> >
>> >> > This allows you to not have to ever update a doc. Just keep
>> >> > inserting.
>> >> > Couchdb is good at that.
>> >> >
>> >> >
>> >> > On Tue, Apr 5, 2011 at 7:20 PM, Luis Miguel Silva wrote:
>> >> >> Thanks for your email Ryan.
>> >> >>
>> >> >> Let me give you some more information on what i'm trying to do...
>> >> >> Essentially, i have to create a "sort of CMDB" system that stores
>> >> >> not only configuration data, but also operational data (so...i
>> >> >> guess you could call it an OMDB instead).
>> >> >>
>> >> >> Either way, my company develops a meta-scheduler that can be used
>> >> >> for HPC or Cloud environments. It will guarantee that your
>> >> >> resources are used the best way possible, maximizing their usage,
>> >> >> based on the policies you set up in it.
>> >> >>
>> >> >> To do that, our software needs to be aware of how the environment
>> >> >> looks and this is why an OMDB piece is very important for us (as
>> >> >> it allows us to store information on the environment).
>> >> >>
>> >> >> Also, our software talks with external resource managers by a
>> >> >> protocol we developed more than a dozen years ago called "WIKI"
>> >> >> (not as in "wikipedia" but WIKI as in the Hawaiian word for
>> >> >> fast). That protocol is heavily based around key/value pairs so
>> >> >> this is one of the reasons i was EXTREMELY excited to find out
>> >> >> that, with CouchDB's "view" functionality, i would be able to map
>> >> >> document attributes to more meaningful attributes that our
>> >> >> software understands (i.e. map the document's "available_cores"
>> >> >> attribute to "ccores" [the "consumable cores" parameter our
>> >> >> software understands]).
>> >> >>
>> >> >> Another important thing to notice is that resources can be of
>> >> >> different types: node (for bare metal nodes), vm (for vms running
>> >> >> on nodes) and storage (we can actually have more data types but
>> >> >> those are enough to exemplify what i'm talking about).
>> >> >>
>> >> >> This is why i created those "big documents" instead of smaller
>> >> >> ones! For instance, each document would represent an entire node
>> >> >> (i.e. procs, memory, etc).
>> >> >>
>> >> >> So my idea was to have an external process initially populate the
>> >> >> database with documents representing ALL the nodes we are
>> >> >> managing (hence why i started my benchmarks with 100K increments)
>> >> >> and OTHER external processes (i.e. other types of resource
>> >> >> managers) would update individual attributes in each document.
>> >> >>
>> >> >> Let's imagine a document with id "node01":
>> >> >> These fields would be updated by an agent that collected some of
>> >> >> the hardware specs:
>> >> >>        ccores: 4 // total cores on machine
>> >> >>        acores: 4 // available cores on machine
>> >> >>        cmemory: 4096 // total memory on machine
>> >> >>        amemory: 1024 // available memory
>> >> >>        cpuload: 94%
>> >> >> This field would be updated by our storage resource manager:
>> >> >>        GMETRIC["disk"]: 1000000
>> >> >> And, for instance, these fields would be updated by a network
>> >> >> resource manager:
>> >> >>        GMETRIC["NETIO"]: { "in":100, "out":200 }
>> >> >>
>> >> >> So, as you can see, different processes would manage the same
>> >> >> document (just different attributes in it).
>> >> >>
>> >> >> And the REALLY cool thing about the Views is the fact that our
>> >> >> customers could VERY easily adapt the database so that it would
>> >> >> store THEIR extra data and shove it in a generic parameter that
>> >> >> our software would understand (i.e. the GMETRIC parameters are
>> >> >> generic metrics...).
>> >> >>
>> >> >> So, based on these requirements, do you have any suggestions on
>> >> >> how we should store our data (keeping its structure easy enough
>> >> >> for external consumers to maintain it without having to bust
>> >> >> their heads figuring out the logic behind the document
>> >> >> attributes)?? :o)
>> >> >>
>> >> >> Thank you!
>> >> >> Luis Miguel Silva
>> >> >>
>> >> >> On Apr 5, 2011, at 6:45 PM, Ryan Ramage wrote:
>> >> >>
>> >> >>> Luis,
>> >> >>>
>> >> >>> Having the rev is very important when you update a doc. It lets
>> >> >>> you know that your piece of information is out of date. This is
>> >> >>> a good thing....
>> >> >>>
>> >> >>> I am wondering if the way you are modeling your data is not
>> >> >>> leading you to do this update with less chance of conflict. See
>> >> >>> if you can break your docs into even smaller docs. For example,
>> >> >>> I noticed from a prior post you had a lot of Arrays in your
>> >> >>> docs. If multiple processes are changing that array, you might
>> >> >>> be better served by making each element in the array a separate
>> >> >>> doc.
>> >> >>>
>> >> >>> Ryan
>> >> >>>
>> >> >>> On Tue, Apr 5, 2011 at 4:41 PM, Luis Miguel Silva wrote:
>> >> >>>> More or less!
>> >> >>>>
>> >> >>>> The most common scenario will be:
>> >> >>>> - two or more processes writing to the same document, but only
>> >> >>>> to a specific attribute (not overwriting the whole document)
>> >> >>>>
>> >> >>>> If, by any chance, two processes overwrite the same field, i'm
>> >> >>>> ok with the last one always winning.
>> >> >>>>
>> >> >>>> Thanks,
>> >> >>>> Luis
>> >> >>>>
>> >> >>>> On Tue, Apr 5, 2011 at 4:26 PM, Robert Newson wrote:
>> >> >>>>> "Ideally, we would be able to update without specifying the
>> >> >>>>> _rev, just posting (or, in this case PUTting) to the
>> >> >>>>> document..."
>> >> >>>>>
>> >> >>>>> So you want to blindly overwrite some unknown data?
>> >> >>>>>
>> >> >>>>> B.
>> >> >>>>>
>> >> >>>>> On 5 April 2011 22:57, Zachary Zolton wrote:
>> >> >>>>>> Luis,
>> >> >>>>>>
>> >> >>>>>> Check out _update handlers:
>> >> >>>>>>
>> >> >>>>>> http://wiki.apache.org/couchdb/Document_Update_Handlers
>> >> >>>>>>
>> >> >>>>>> Cheers,
>> >> >>>>>>
>> >> >>>>>> Zach
>> >> >>>>>>
>> >> >>>>>> On Tue, Apr 5, 2011 at 4:46 PM, Luis Miguel Silva wrote:
>> >> >>>>>>> Dear all,
>> >> >>>>>>>
>> >> >>>>>>> I'm trying to play around with updates and i'm bumping into
>> >> >>>>>>> some problems.
>> >> >>>>>>>
>> >> >>>>>>> Let's imagine we have two clients that poll a document from
>> >> >>>>>>> the server at the same time and get the same _rev.
>> >> >>>>>>> Then one of them updates the doc based on the _rev it got:
>> >> >>>>>>> [root@xkitten ~]# curl -X PUT -d \
>> >> >>>>>>> '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello2":"fred2"}' \
>> >> >>>>>>> http://localhost:5984/benchmark/test?conflicts=true
>> >> >>>>>>> {"ok":true,"id":"test","rev":"4-03640ebafbb4fcaf127844671f8e2de7"}
>> >> >>>>>>> Then another one tries to update the doc based on the same
>> >> >>>>>>> exact _rev:
>> >> >>>>>>> [root@xkitten ~]# curl -X PUT -d \
>> >> >>>>>>> '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello3":"fred3"}' \
>> >> >>>>>>> http://localhost:5984/benchmark/test?conflicts=true
>> >> >>>>>>> {"error":"conflict","reason":"Document update conflict."}
>> >> >>>>>>> [root@xkitten ~]#
>> >> >>>>>>>
>> >> >>>>>>> Is there a way to avoid this?! (like...make the update just
>> >> >>>>>>> create a new _rev or something)??
>> >> >>>>>>>
>> >> >>>>>>> Ideally, we would be able to update without specifying the
>> >> >>>>>>> _rev, just posting (or, in this case PUTting) to the
>> >> >>>>>>> document...
>> >> >>>>>>>
>> >> >>>>>>> Thoughts??
>> >> >>>>>>>
>> >> >>>>>>> Thank you,
>> >> >>>>>>> Luis
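
The update handlers Zach points to are one way to get close to the "update one attribute without sending _rev" behaviour asked for at the start of the thread. Here is a minimal sketch, assuming the request body is a JSON object holding only the fields to set. The name mergeFields and the design document name in the usage note are hypothetical.

```javascript
// Update handler sketch: copy the fields in the JSON request body onto the
// stored document and leave every other field untouched.
// CouchDB calls update handlers as function (doc, req) and expects
// [docToSave, response] back; doc is null when the id in the URL does not
// exist yet. Returning null as the first element saves nothing.
function mergeFields(doc, req) {
  var changes;
  try {
    changes = JSON.parse(req.body);
  } catch (e) {
    return [null, JSON.stringify({ error: 'bad_request', reason: 'body is not JSON' })];
  }
  if (!changes || typeof changes !== 'object' || Array.isArray(changes)) {
    return [null, JSON.stringify({ error: 'bad_request', reason: 'body must be a JSON object' })];
  }
  if (!doc) {
    if (!req.id) {
      return [null, JSON.stringify({ error: 'bad_request', reason: 'no document id in URL' })];
    }
    doc = { _id: req.id };   // first write for this id: start a new document
  }
  for (var name in changes) {
    // Skip reserved fields such as _id and _rev so callers cannot clobber them.
    if (Object.prototype.hasOwnProperty.call(changes, name) && name.charAt(0) !== '_') {
      doc[name] = changes[name];
    }
  }
  return [doc, JSON.stringify({ ok: true, id: doc._id })];
}
```

Stored as function source under the "updates" member of a design document, it would be called with something like `curl -X PUT -d '{"hello3":"fred3"}' http://localhost:5984/benchmark/_design/app/_update/mergeFields/test` (the design document name `app` is made up). The read-modify-write then happens inside CouchDB, so the client never handles _rev and the last writer wins per field. A conflict may still be possible when two requests hit the same document at the same moment, so a client-side retry is worth keeping.
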