From: Luis Miguel Silva
Date: Wed, 6 Apr 2011 00:23:24 -0600
Subject: Re: Update conflicts?
To: user@couchdb.apache.org

Sorry if my last email was too big :o).

Well, one reason I wanted to avoid doing that is that it didn't seem as easy to maintain as my original approach, but I'll discuss your suggestion with my team to see what they have to say.

Also, I just couldn't get joins to work :o\... How would you create a view that joins data from those different types of documents to create a single complete view of a node? I've read the documentation on view joins but simply could not get it to work :o\...

Thank you,
Luis

On Tue, Apr 5, 2011 at 9:12 PM, Ryan Ramage wrote:
> Luis,
>
> That's a lot to take in, but a quick suggestion.
>
> Have a parent doc that looks like this:
> {
>     id: node1,
>     type: node,
>     location: blah
> }
>
> and some 'children' docs that look like this
>
> {
>     id: 3232323323223-32323232322-3232,
>     timestamp: 1299794532000,
>     type: cpu,
>     node: node1,
>     cpu: 0.94,
>     ccores: 4,
>     acores: 4,
>     cmemory: 4096,
>     amemory: 1024
> }
>
> and
> {
>     id: 3232323323223-32323232322-3232,
>     timestamp: 1299794532000,
>     type: disk,
>     node: node1,
>     disk: 100000
> }
> and
> {
>     id: 433432323323223-3232323322332,
>     timestamp: 1299794532000,
>     type: netio,
>     node: node1,
>     in: 100,
>     out: 200
> }
> and
> {
>     id: 323432423432534534-534534-543534534,
>     timestamp: 1299794532000,
>     type: generic,
>     node: node1,
>     name: "foo",
>     value: "bar"
> }
>
> create a status view
> "node_status" : function (doc) {
>     if (doc.type != 'node') {
>         emit([doc.node, doc.type, doc.timestamp], null);
>     }
> }
>
> This allows you to not have to ever update a doc. Just keep inserting.
> CouchDB is good at that.
>
> On Tue, Apr 5, 2011 at 7:20 PM, Luis Miguel Silva wrote:
>> Thanks for your email Ryan.
>>
>> Let me give you some more information on what I'm trying to do...
>> Essentially, I have to create a "sort of CMDB" system that stores not only configuration data, but also operational data (so... I guess you could call it an OMDB instead).
>>
>> Either way, my company develops a meta-scheduler that can be used for HPC or Cloud environments. It will guarantee that your resources are used the best way possible, maximizing their usage, based on the policies you set up in it.
>>
>> To do that, our software needs to be aware of how the environment looks, and this is why an OMDB piece is very important for us (as it allows us to store information on the environment).
>>
>> Also, our software talks with external resource managers using a protocol we developed more than a dozen years ago called "WIKI" (not as in "wikipedia", but WIKI as in the Hawaiian word for fast). That protocol is heavily based around key/value pairs, so this is one of the reasons I was EXTREMELY excited to find out that, with CouchDB's "view" functionality, I would be able to map document attributes to more meaningful attributes that our software understands (i.e. map the document's "available_cores" attribute to "ccores" [the "consumable cores" parameter our software understands]).
>>
>> Another important thing to notice is that resources can be of different types: node (for bare metal nodes), vm (for VMs running on nodes) and storage (we can actually have more data types, but those are enough to exemplify what I'm talking about).
>>
>> This is why I created those "big documents" instead of smaller ones!
>> For instance, each document would represent an entire node (i.e. procs, memory, etc).
>>
>> So my idea was to have an external process initially populate the database with documents representing ALL the nodes we are managing (hence why I started my benchmarks with 100K increments), and OTHER external processes (i.e. other types of resource managers) would update individual attributes in each document.
>>
>> Let's imagine a document with id "node01":
>> These fields would be updated by an agent that collected some of the hardware specs:
>>     ccores: 4 // total cores on machine
>>     acores: 4 // available cores on machine
>>     cmemory: 4096 // total memory on machine
>>     amemory: 1024 // available memory
>>     cpuload: 94%
>> This field would be updated by our storage resource manager:
>>     GMETRIC["disk"]: 1000000
>> And, for instance, these fields would be updated by a network resource manager:
>>     GMETRIC["NETIO"]: { "in":100, "out":200 }
>>
>> So, as you can see, different processes would manage the same document (just different attributes in it).
>>
>> And the REALLY cool thing about the Views is the fact that our customers could VERY easily adapt the database so that it would store THEIR extra data and shove it in a generic parameter that our software would understand (i.e. the GMETRIC parameters are generic metrics...).
>>
>> So, based on these requirements, do you have any suggestions on how we should store our data (keeping its structure easy enough for external consumers to maintain it without having to bust their heads figuring out the logic behind the document attributes)?? :o)
>>
>> Thank you!
>> Luis Miguel Silva
>>
>> On Apr 5, 2011, at 6:45 PM, Ryan Ramage wrote:
>>
>>> Luis,
>>>
>>> Having the rev is very important when you update a doc. It lets you
>>> know that your piece of information is out of date. This is a good
>>> thing....
>>>
>>> I am wondering if the way you are modeling your data is not leading
>>> you to do this update with less chance of conflict. See if you can
>>> break your docs into even smaller docs. For example, I noticed from a
>>> prior post you had a lot of Arrays in your docs. If multiple processes
>>> are changing that array, you might be better served by making each
>>> element in the array a separate doc.
>>>
>>> Ryan
>>>
>>> On Tue, Apr 5, 2011 at 4:41 PM, Luis Miguel Silva wrote:
>>>> More or less!
>>>>
>>>> The most common scenario will be:
>>>> - two or more processes writing to the same document, but only to a
>>>> specific attribute (not overwriting the whole document)
>>>>
>>>> If, by any chance, two processes overwrite the same field, I'm ok with
>>>> the last one always winning.
>>>>
>>>> Thanks,
>>>> Luis
>>>>
>>>> On Tue, Apr 5, 2011 at 4:26 PM, Robert Newson wrote:
>>>>> "Ideally, we would be able to update without specifying the _rev, just
>>>>> posting (or, in this case PUTting) to the document..."
>>>>>
>>>>> So you want to blindly overwrite some unknown data?
>>>>>
>>>>> B.
>>>>>
>>>>> On 5 April 2011 22:57, Zachary Zolton wrote:
>>>>>> Luis,
>>>>>>
>>>>>> Check out _update handlers:
>>>>>>
>>>>>> http://wiki.apache.org/couchdb/Document_Update_Handlers
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Zach
>>>>>>
>>>>>> On Tue, Apr 5, 2011 at 4:46 PM, Luis Miguel Silva wrote:
>>>>>>> Dear all,
>>>>>>>
>>>>>>> I'm trying to play around with updates and I'm bumping into some problems.
>>>>>>>
>>>>>>> Let's imagine we have two clients that poll a document from the server at
>>>>>>> the same time and get the same _rev.
>>>>>>> Then one of them updates the doc based on the _rev it got:
>>>>>>> [root@xkitten ~]# curl -X PUT -d
>>>>>>> '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello2":"fred2"}'
>>>>>>> http://localhost:5984/benchmark/test?conflicts=true
>>>>>>> {"ok":true,"id":"test","rev":"4-03640ebafbb4fcaf127844671f8e2de7"}
>>>>>>> Then another one tries to update the doc based on the same exact _rev:
>>>>>>> [root@xkitten ~]# curl -X PUT -d
>>>>>>> '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello3":"fred3"}'
>>>>>>> http://localhost:5984/benchmark/test?conflicts=true
>>>>>>> {"error":"conflict","reason":"Document update conflict."}
>>>>>>> [root@xkitten ~]#
>>>>>>>
>>>>>>> Is there a way to avoid this?! (like... make the update just create a
>>>>>>> new _rev or something)??
>>>>>>>
>>>>>>> Ideally, we would be able to update without specifying the _rev, just
>>>>>>> posting (or, in this case PUTting) to the document...
>>>>>>>
>>>>>>> Thoughts??
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Luis
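
On the join question at the top of this message: one possible approach, sketched here under assumptions and not taken from anyone in the thread, is to leave the quoted node_status view as it is and fold its rows into one record per node on the client. The sample values and the helpers queryView and joinNode below are hypothetical and only imitate CouchDB in memory. The map function takes emit as a parameter here, whereas CouchDB provides it as a global. Against a real server the same rows would probably come from querying the view with startkey=["node1"], endkey=["node1",{}] and include_docs=true.

```javascript
// Sample documents shaped like the ones quoted in the thread (values are illustrative).
const docs = [
  { _id: "node1", type: "node", location: "blah" },
  { _id: "a1", type: "cpu", node: "node1", timestamp: 1299794532000, cpu: 0.94, ccores: 4, acores: 4 },
  { _id: "a2", type: "cpu", node: "node1", timestamp: 1299794592000, cpu: 0.5, ccores: 4, acores: 2 },
  { _id: "b1", type: "disk", node: "node1", timestamp: 1299794532000, disk: 100000 },
  { _id: "c1", type: "netio", node: "node1", timestamp: 1299794532000, in: 100, out: 200 },
  { _id: "d1", type: "cpu", node: "node2", timestamp: 1299794532000, cpu: 0.1, ccores: 8, acores: 8 }
];

// Same logic as the quoted node_status view; emit is passed in so it runs outside CouchDB.
function mapNodeStatus(doc, emit) {
  if (doc.type !== "node") {
    emit([doc.node, doc.type, doc.timestamp], null);
  }
}

// Hypothetical stand-in for a view query with include_docs=true:
// run the map over every doc, then sort rows by key, element by element.
function queryView(allDocs, map) {
  const rows = [];
  for (const doc of allDocs) {
    map(doc, (key, value) => rows.push({ id: doc._id, key, value, doc }));
  }
  rows.sort((a, b) => {
    for (let i = 0; i < a.key.length; i++) {
      if (a.key[i] < b.key[i]) return -1;
      if (a.key[i] > b.key[i]) return 1;
    }
    return 0;
  });
  return rows;
}

// Fold the rows of one node into a single object, one entry per child type.
// Rows arrive sorted by timestamp within a type, so the newest sample overwrites older ones.
function joinNode(rows, nodeId) {
  const merged = { node: nodeId };
  for (const row of rows) {
    if (row.key[0] !== nodeId) continue;
    const { _id, type, node, timestamp, ...fields } = row.doc;
    merged[type] = fields;
  }
  return merged;
}

const rows = queryView(docs, mapNodeStatus);
const node1 = joinNode(rows, "node1");
console.log(JSON.stringify(node1, null, 2));
```

Because the view skips the parent doc (type "node"), fields such as location are not part of the merged record. Emitting the parent as well, under a key that sorts before its children, would be one way to include it.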