incubator-couchdb-user mailing list archives

From Luis Miguel Silva <luismiguelferreirasi...@gmail.com>
Subject Re: Update conflicts?
Date Wed, 06 Apr 2011 06:49:00 GMT
Yeah but the above view generates different documents:
{"total_rows":4,"offset":0,"rows":[
{"id":"92fe8c96f90e21d68a414bbd1700f3d7","key":["node01","cpu",1299794532000,0.94],"value":null},
{"id":"92fe8c96f90e21d68a414bbd1700ffee","key":["node01","disk",1299794532000,null],"value":null},
{"id":"92fe8c96f90e21d68a414bbd1701180e","key":["node01","generic",1299794532000,null],"value":null},
{"id":"92fe8c96f90e21d68a414bbd170109ce","key":["node01","netio",1299794532000,null],"value":null}
]}

Any way i can return ONE single doc per full result?
i.e. something like:
{"total_rows":1,"offset":0,"rows":[
{"id":"node01","key":"node01","value":{"node":"node01","STATE":"Unknown:sshd","ALIAS":"node01","FEATURE":"[vlan611]","GMETRIC[numvms]":13,"NETADDR":"10.40.130.146","VARATTR":{"HVTYPE":"esx"},"VARIABLE":[{"provision_status":2},{"another_variable":"something"}]}}
]}

(or, in other words, joining all the fields from the different
documents in one single doc)??
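
A minimal sketch of that join done on the client side (as Anup suggests
below) rather than inside CouchDB. It assumes the view is queried with
include_docs=true, so that each row carries its full document in row.doc.
The helper name mergeNodeRows, the sample documents, and the rule that
later rows overwrite earlier fields are illustrative assumptions, not
something taken from this thread.

```javascript
// Hypothetical helper: collapse the rows of a view response into one
// merged object per node. Assumes key[0] is the node name and that each
// row has a `doc` property (view queried with include_docs=true).
function mergeNodeRows(viewResponse) {
  const merged = {};
  for (const row of viewResponse.rows) {
    const node = row.key[0];
    if (!merged[node]) {
      merged[node] = { node: node };
    }
    for (const [field, value] of Object.entries(row.doc || {})) {
      // Skip CouchDB bookkeeping fields and the per-document type marker.
      if (field === "_id" || field === "_rev" || field === "type") continue;
      // Rows arrive in key order, so a later row overwrites an earlier one.
      merged[node][field] = value;
    }
  }
  return Object.values(merged);
}

// Made-up input shaped like the view output above, plus `doc` bodies.
const sample = {
  total_rows: 2,
  offset: 0,
  rows: [
    { id: "a", key: ["node01", "cpu", 1299794532000, 0.94], value: null,
      doc: { _id: "a", _rev: "1-x", type: "cpu", node: "node01",
             timestamp: 1299794532000, cpu: 0.94 } },
    { id: "b", key: ["node01", "disk", 1299794532000, null], value: null,
      doc: { _id: "b", _rev: "1-y", type: "disk", node: "node01",
             timestamp: 1299794532000, disk: 100000 } }
  ]
};

console.log(JSON.stringify(mergeNodeRows(sample)));
```

Fields that share a name across document types (for example "name" and
"value" in several generic docs) would overwrite each other in this
sketch, so they would need a prefix or nesting in real use.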

On Wed, Apr 6, 2011 at 12:34 AM, Anup Bishnoi <pixelsallover@gmail.com> wrote:
> you could join the different pieces of information about the node (which you
> get by one query on the view suggested above) on the page itself with
> javascript, instead of asking couch for everything embedded in an html
> response
>
> On Wed, Apr 6, 2011 at 11:53 AM, Luis Miguel Silva
> <luismiguelferreirasilva@gmail.com> wrote:
>>
>> Sorry if my last email was too big :o).
>>
>> Well, one reason i wanted to avoid doing that is because it didn't
>> seem as easy to maintain as my original approach but i'll discuss your
>> suggestion with my team to see what they have to say.
>> Also, i just couldn't get join to work :o\...
>>
>> How would you create a view that joins data from those different types
>> of documents to create a single complete view of a node?
>> I've read the documentation on view joins but simply could not get it
>> to work :o\...
>>
>> Thank you,
>> Luis
>>
>> On Tue, Apr 5, 2011 at 9:12 PM, Ryan Ramage <ryan.ramage@gmail.com> wrote:
>> > Luis,
>> >
>> > That's a lot to take in, but a quick suggestion.
>> >
>> > Have a parent doc that looks like this:
>> > {
>> >    id: node1,
>> >    type: node,
>> >    location: blah,
>> > }
>> >
>> > and some 'children' docs that look like this
>> >
>> > {
>> >    id: 3232323323223-32323232322-3232,
>> >    timestamp: 1299794532000,
>> >    type: cpu,
>> >    node: node1,
>> >    cpu: 0.94,
>> >    ccores: 4,
>> >    acores: 4,
>> >    cmemory: 4096,
>> >    amemory: 1024
>> > }
>> >
>> > and
>> > {
>> >    id: 3232323323223-32323232322-3232,
>> >    timestamp: 1299794532000,
>> >    type: disk,
>> >    node: node1,
>> >    disk: 100000
>> > }
>> > and
>> > {
>> >    id: 433432323323223-3232323322332,
>> >    timestamp: 1299794532000,
>> >    type: netio,
>> >    node: node1,
>> >    in: 100,
>> >    out: 200
>> > }
>> > and
>> > {
>> >    id: 323432423432534534-534534-543534534,
>> >    timestamp: 1299794532000,
>> >    type: generic,
>> >    node: node1,
>> >    name: "foo",
>> >    value: "bar"
>> > }
>> >
>> > create a status view
>> > "node_status" : function (doc) {
>> >    if (doc.type != 'node') {
>> >        emit([doc.node, doc.type, doc.timestamp], null);
>> >    }
>> > }
>> >
>> > This allows you to not have to ever update a doc. Just keep inserting.
>> > Couchdb is good at that.
>> >
>> >
>> >
>> >
>> > On Tue, Apr 5, 2011 at 7:20 PM, Luis Miguel Silva
>> > <luismiguelferreirasilva@gmail.com> wrote:
>> >> Thanks for your email Ryan.
>> >>
>> >> Let me give you some more information on what i'm trying to do...
>> >> Essentially, i have to create a "sort of CMDB" system that stores, not
>> >> only configuration data, but also operational data (so...i guess you could
>> >> call it an OMDB instead).
>> >>
>> >> Either way, my company develops a meta-scheduler that can be used for
>> >> HPC or Cloud environments. It will guarantee that your resources are used
>> >> the best way possible, maximizing their usage, based on the policies you set
>> >> up in it.
>> >>
>> >> To do that, our software needs to be aware of how the environment looks
>> >> and this is why an OMDB piece is very important for us (as it allows us to
>> >> store information on the environment).
>> >>
>> >> Also, our software talks with external resource managers by a protocol
>> >> we developed more than a dozen years ago called "WIKI" (not as in
>> >> "wikipedia" but, WIKI as in the Hawaiian word for fast). That protocol is
>> >> heavily based around key/value pairs so this is one of the reasons i was
>> >> EXTREMELY excited to find out that, with CouchDB's "view" functionality, i
>> >> would be able to map document attributes to more meaningful attributes that
>> >> our software understands (i.e. map the document's "available_cores"
>> >> attribute to "ccores" [the "consumable cores" parameter our software
>> >> understands]).
>> >>
>> >> Another important thing to notice is that resources can be of
>> >> different types: node (for bare metal nodes), vm (for vms running on nodes)
>> >> and storage (we can actually have more data types but those are enough to
>> >> exemplify what i'm talking about).
>> >>
>> >> This is why i created those "big documents" instead of smaller ones!
>> >> For instance, each document would represent an entire node (i.e. procs,
>> >> memory, etc).
>> >>
>> >> So my idea was to have an external process initially populate the
>> >> database with documents representing ALL the nodes we are managing (hence
>> >> why i started my benchmarks with 100K increments) and OTHER external
>> >> processes (i.e. other types of resource managers) would update individual
>> >> attributes in each document.
>> >>
>> >> Let's imagine a document with id "node01":
>> >> These fields would be updated by an agent that collected some of the
>> >> hardware specs:
>> >>        ccores: 4 // total cores on machine
>> >>        acores: 4 // available cores on machine
>> >>        cmemory: 4096 // total memory on machine
>> >>        amemory: 1024 // available memory
>> >>        cpuload: 94%
>> >> This field would be updated by our storage resource manager:
>> >>        GMETRIC["disk"]: 1000000
>> >> And, for instance, these fields would be updated by a network resource
>> >> manager:
>> >>        GMETRIC["NETIO"]: { "in":100, "out":200 }
>> >>
>> >> So, as you can see, different processes would manage the same document
>> >> (just different attributes in it).
>> >>
>> >> And the REALLY cool thing about the Views is the fact that our
>> >> customers could VERY easily adapt the database so that it would store THEIR
>> >> extra data and shove it in a generic parameter that our software would
>> >> understand (i.e. the GMETRIC parameters are generic metrics...).
>> >>
>> >> So, based on these requirements, do you have any suggestions on how we
>> >> should store our data (keeping its structure easy enough for external
>> >> consumers to maintain it without having to bust their heads figuring out the
>> >> logic behind the document attributes)?? :o)
>> >>
>> >> Thank you!
>> >> Luis Miguel Silva
>> >>
>> >> On Apr 5, 2011, at 6:45 PM, Ryan Ramage <ryan.ramage@gmail.com> wrote:
>> >>
>> >>> Luis,
>> >>>
>> >>> Having the rev is very important when you update a doc. It lets you
>> >>> know that your piece of information is out of date. This is a good
>> >>> thing....
>> >>>
>> >>> I am wondering whether modeling your data differently would let
>> >>> you do this update with less chance of conflict. See if you can
>> >>> break your docs into even smaller docs. For example, I noticed from a
>> >>> prior post you had a lot of Arrays in your docs. If multiple processes
>> >>> are changing that array, you might be better served by making each
>> >>> element in the array a separate doc.
>> >>>
>> >>> Ryan
>> >>>
>> >>> On Tue, Apr 5, 2011 at 4:41 PM, Luis Miguel Silva
>> >>> <luismiguelferreirasilva@gmail.com> wrote:
>> >>>> More or less!
>> >>>>
>> >>>> The most common scenario will be:
>> >>>> - two or more processes writing to the same document, but only to a
>> >>>> specific attribute (not overwriting the whole document)
>> >>>>
>> >>>> If, by any chance, two processes overwrite the same field, i'm ok
>> >>>> with
>> >>>> the last one always winning.
>> >>>>
>> >>>> Thanks,
>> >>>> Luis
>> >>>>
>> >>>> On Tue, Apr 5, 2011 at 4:26 PM, Robert Newson
>> >>>> <robert.newson@gmail.com> wrote:
>> >>>>> "Ideally, we would be able to update without specifying the _rev,
>> >>>>> just
>> >>>>> posting (or, in this case PUTting) to the document..."
>> >>>>>
>> >>>>> So you want to blindly overwrite some unknown data?
>> >>>>>
>> >>>>> B.
>> >>>>>
>> >>>>> On 5 April 2011 22:57, Zachary Zolton <zachary.zolton@gmail.com>
>> >>>>> wrote:
>> >>>>>> Luis,
>> >>>>>>
>> >>>>>> Check out _update handlers:
>> >>>>>>
>> >>>>>> http://wiki.apache.org/couchdb/Document_Update_Handlers
>> >>>>>>
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>>
>> >>>>>> Zach
>> >>>>>>
>> >>>>>> On Tue, Apr 5, 2011 at 4:46 PM, Luis Miguel Silva
>> >>>>>> <luismiguelferreirasilva@gmail.com> wrote:
>> >>>>>>> Dear all,
>> >>>>>>>
>> >>>>>>> I'm trying to play around with updates and i'm bumping into some
>> >>>>>>> problems.
>> >>>>>>>
>> >>>>>>> Let's imagine we have two clients that poll a document from the
>> >>>>>>> server at
>> >>>>>>> the same time and get the same _rev.
>> >>>>>>> Then one of them updates the doc based on the _rev it got:
>> >>>>>>> [root@xkitten ~]# curl -X PUT -d
>> >>>>>>> '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello2":"fred2"}'
>> >>>>>>> http://localhost:5984/benchmark/test?conflicts=true
>> >>>>>>> {"ok":true,"id":"test","rev":"4-03640ebafbb4fcaf127844671f8e2de7"}
>> >>>>>>> Then another one tries to update the doc based on the same exact
>> >>>>>>> _rev:
>> >>>>>>> [root@xkitten ~]# curl -X PUT -d
>> >>>>>>> '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello3":"fred3"}'
>> >>>>>>> http://localhost:5984/benchmark/test?conflicts=true
>> >>>>>>> {"error":"conflict","reason":"Document update conflict."}
>> >>>>>>> [root@xkitten ~]#
>> >>>>>>>
>> >>>>>>> Is there a way to avoid this?! (like...make the update just create
>> >>>>>>> a
>> >>>>>>> new _rev or something)??
>> >>>>>>>
>> >>>>>>> Ideally, we would be able to update without specifying the _rev,
>> >>>>>>> just
>> >>>>>>> posting (or, in this case PUTting) to the document...
>> >>>>>>>
>> >>>>>>> Thoughts??
>> >>>>>>>
>> >>>>>>> Thank you,
>> >>>>>>> Luis
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>
>> >
>
>
