incubator-couchdb-user mailing list archives

From Luis Miguel Silva <luismiguelferreirasi...@gmail.com>
Subject Re: Update conflicts?
Date Wed, 06 Apr 2011 07:06:10 GMT
But is there no way to do it server side? :o)
That would be SOOOO much better as i want to maintain a "single view
of the database" (so that everybody querying the same view gets the
same results).
Plus, your approach doesn't allow me to specify my own attribute names
(does it??):

i.e.
		emit(doc._id,
			{
				node: doc._id,
				STATE: doc.secondary_state,
				OS: doc.oslist,
				ALIAS: doc.alias,
				FEATURE: doc.vlans,
				"GMETRIC[numvms]": doc.numvms,
				NETADDR: doc.netaddress,
				VARATTR: { "HVTYPE":doc.hvtype},
				VARIABLE: doc.variables,
				OSLIST: doc.oslist,
				VMOSLIST: doc.vmoslist,
			}
		);

Like i mentioned in a previous document, that is a HUGE deal to us
because the attributes themselves have no meaning to the consumers. So
that is why it is EXTREMELY important for us to shape the information
in a meaningful way on the server side!

p.s. thank you so much for your help.
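
One server-side option that seems to fit this request is a CouchDB list function, which runs at query time and reshapes view rows before they reach the client. The sketch below is not from the thread: it assumes the node_status view quoted further down, queried with include_docs=true so that every row carries its child document, and the output attribute names (CPULOAD, "GMETRIC[disk]" and so on) are only illustrative. The function names mergeNodeRows and nodesList are hypothetical.

```javascript
// Merge view rows (queried with include_docs=true) into one row per node.
// Output attribute names such as CPULOAD are illustrative assumptions.
function mergeNodeRows(rows) {
  var byNode = {};  // node id -> merged, renamed attributes
  var newest = {};  // "node/type/name" -> newest timestamp merged so far

  rows.forEach(function (row) {
    var doc = row.doc;
    if (!doc || !doc.node) return;  // row has no child document attached

    // Keep only the most recent sample of each kind for each node.
    var slot = doc.node + "/" + doc.type + "/" + (doc.name || "");
    if (slot in newest && newest[slot] >= doc.timestamp) return;
    newest[slot] = doc.timestamp;

    var out = byNode[doc.node] || (byNode[doc.node] = { node: doc.node });
    if (doc.type === "cpu") {
      out.CPULOAD = doc.cpu;
      out.CCORES = doc.ccores;
      out.ACORES = doc.acores;
      out.CMEMORY = doc.cmemory;
      out.AMEMORY = doc.amemory;
    } else if (doc.type === "disk") {
      out["GMETRIC[disk]"] = doc.disk;
    } else if (doc.type === "netio") {
      out["GMETRIC[netio]"] = { "in": doc["in"], "out": doc.out };
    } else if (doc.type === "generic") {
      out["GMETRIC[" + doc.name + "]"] = doc.value;
    }
  });

  // Shape the result like view rows: one entry per node.
  return Object.keys(byNode).map(function (node) {
    return { id: node, key: node, value: byNode[node] };
  });
}

// List function wrapper. start(), getRow() and send() are supplied by
// CouchDB's query server, so this part only runs inside CouchDB. In a real
// design document the helper above would need to be inlined here, because
// each function is stored as its own string.
function nodesList(head, req) {
  start({ headers: { "Content-Type": "application/json" } });
  var rows = [], row;
  while ((row = getRow())) {
    rows.push(row);
  }
  send(JSON.stringify({ rows: mergeNodeRows(rows) }));
}
```

Every consumer querying the list would get the same merged shape, which is the "single view of the database" asked for above. The sketch buffers all rows in memory before sending, so it is only reasonable for modest result sets; restricting the query with startkey/endkey to one node keeps that bounded.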

On Wed, Apr 6, 2011 at 12:56 AM, Anup Bishnoi <pixelsallover@gmail.com> wrote:
> you've already got the answer here
>
> On Wed, Apr 6, 2011 at 12:19 PM, Luis Miguel Silva
> <luismiguelferreirasilva@gmail.com> wrote:
>>
>> Yeah but the above view generates different documents:
>> {"total_rows":4,"offset":0,"rows":[
>>
>> {"id":"92fe8c96f90e21d68a414bbd1700f3d7","key":["node01","cpu",1299794532000,0.94],"value":null},
>>
>> {"id":"92fe8c96f90e21d68a414bbd1700ffee","key":["node01","disk",1299794532000,null],"value":null},
>>
>> {"id":"92fe8c96f90e21d68a414bbd1701180e","key":["node01","generic",1299794532000,null],"value":null},
>>
>> {"id":"92fe8c96f90e21d68a414bbd170109ce","key":["node01","netio",1299794532000,null],"value":null}
>> ]}
>>
> i'm assuming you're making this view query with ajax and you get these
> results.
> now all you need to do is walk through these response items with client side
> js and build the one doc you need! all the pieces required to build the doc
> are already there with you in your client side js
> i'll be happy to keep answering, let's get this solved
>
>>
>> Any way i can return ONE single doc per full result?
>> i.e. something like:
>> {"total_rows":1,"offset":0,"rows":[
>>
>> {"id":"node01","key":"node01","value":{"node":"node01","STATE":"Unknown:sshd","ALIAS":"node01","FEATURE":"[vlan611]","GMETRIC[numvms]":13,"NETADDR":"10.40.130.146","VARATTR":{"HVTYPE":"esx"},"VARIABLE":[{"provision_status":2},{"another_variable":"something"}]}}
>> ]}
>>
>> (or, in other words, joining all the fields from the different
>> documents in one single doc)??
>>
>> On Wed, Apr 6, 2011 at 12:34 AM, Anup Bishnoi <pixelsallover@gmail.com>
>> wrote:
>> > you could join the different pieces of information about the node (which
>> > you get by one query on the view suggested above) on the page itself with
>> > javascript, instead of asking couch for everything embedded in an html
>> > response
>> >
>> > On Wed, Apr 6, 2011 at 11:53 AM, Luis Miguel Silva
>> > <luismiguelferreirasilva@gmail.com> wrote:
>> >>
>> >> Sorry if my last email was too big :o).
>> >>
>> >> Well, one reason i wanted to avoid doing that is because it didn't
>> >> seem as easy to maintain as my original approach but i'll discuss your
>> >> suggestion with my team to see what they have to say.
>> >> Also, i just couldn't get join to work :o\...
>> >>
>> >> How would you create a view that joins data from those different types
>> >> of documents to create a single complete view of a node?
>> >> I've read the documentation on view joins but simply could not get it
>> >> to work :o\...
>> >>
>> >> Thank you,
>> >> Luis
>> >>
>> >> On Tue, Apr 5, 2011 at 9:12 PM, Ryan Ramage <ryan.ramage@gmail.com>
>> >> wrote:
>> >> > Luis,
>> >> >
>> >> > That's a lot to take in, but a quick suggestion.
>> >> >
>> >> > Have a parent doc that looks like this:
>> >> > {
>> >> >    id: node1,
>> >> >    type: node,
>> >> >    location: blah
>> >> > }
>> >> >
>> >> > and some 'children' docs that look like this
>> >> >
>> >> > {
>> >> >    id: 3232323323223-32323232322-3232,
>> >> >    timestamp: 1299794532000,
>> >> >    type: cpu,
>> >> >    node: node1,
>> >> >    cpu: 0.94,
>> >> >    ccores: 4,
>> >> >    acores: 4,
>> >> >    cmemory: 4096,
>> >> >    amemory: 1024
>> >> > }
>> >> >
>> >> > and
>> >> > {
>> >> >    id: 3232323323223-32323232322-3232,
>> >> >    timestamp: 1299794532000,
>> >> >    type: disk,
>> >> >    node: node1,
>> >> >    disk: 100000
>> >> > }
>> >> > and
>> >> > {
>> >> >    id: 433432323323223-3232323322332,
>> >> >    timestamp: 1299794532000,
>> >> >    type: netio,
>> >> >    node: node1,
>> >> >    in: 100,
>> >> >    out: 200
>> >> > }
>> >> > and
>> >> > {
>> >> >    id: 323432423432534534-534534-543534534,
>> >> >    timestamp: 1299794532000,
>> >> >    type: generic,
>> >> >    node: node1,
>> >> >    name: "foo",
>> >> >    value: "bar"
>> >> > }
>> >> >
>> >> > create a status view
>> >> > "node_status" : function (doc) {
>> >> >    if (doc.type != 'node') {
>> >> >        emit([doc.node, doc.type, doc.timestamp], null);
>> >> >    }
>> >> > }
>> >> >
>> >> > This allows you to not have to ever update a doc. Just keep inserting.
>> >> > CouchDB is good at that.
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Tue, Apr 5, 2011 at 7:20 PM, Luis Miguel Silva
>> >> > <luismiguelferreirasilva@gmail.com> wrote:
>> >> >> Thanks for your email Ryan.
>> >> >>
>> >> >> Let me give you some more information on what i'm trying to do...
>> >> >> Essentially, i have to create a "sort of CMDB" system that stores, not
>> >> >> only configuration data, but also operational data (so...i guess you
>> >> >> could call it an OMDB instead).
>> >> >>
>> >> >> Either way, my company develops a meta-scheduler that can be used for
>> >> >> HPC or Cloud environments. It will guarantee that your resources are
>> >> >> used the best way possible, maximizing their usage, based on the
>> >> >> policies you set up in it.
>> >> >>
>> >> >> To do that, our software needs to be aware of how the environment
>> >> >> looks and this is why an OMDB piece is very important for us (as it
>> >> >> allows us to store information on the environment).
>> >> >>
>> >> >> Also, our software talks with external resource managers using a
>> >> >> protocol we developed more than a dozen years ago called "WIKI" (not
>> >> >> as in "wikipedia" but WIKI as in the Hawaiian word for fast). That
>> >> >> protocol is heavily based around key/value pairs so this is one of
>> >> >> the reasons i was EXTREMELY excited to find out that, with CouchDB's
>> >> >> "view" functionality, i would be able to map document attributes to
>> >> >> more meaningful attributes that our software understands (i.e. map
>> >> >> the document's "available_cores" attribute to "ccores" [the
>> >> >> "consumable cores" parameter our software understands]).
>> >> >>
>> >> >> Another important thing to notice is that resources can be of
>> >> >> different types: node (for bare metal nodes), vm (for vms running on
>> >> >> nodes) and storage (we can actually have more data types but those
>> >> >> are enough to exemplify what i'm talking about).
>> >> >>
>> >> >> This is why i created those "big documents" instead of smaller ones!
>> >> >> For instance, each document would represent an entire node (i.e.
>> >> >> procs, memory, etc).
>> >> >>
>> >> >> So my idea was to have an external process initially populate the
>> >> >> database with documents representing ALL the nodes we are managing
>> >> >> (hence why i started my benchmarks with 100K increments) and OTHER
>> >> >> external processes (i.e. other types of resource managers) would
>> >> >> update individual attributes in each document.
>> >> >>
>> >> >> Let's imagine a document with id "node01":
>> >> >> These fields would be updated by an agent that collected some of the
>> >> >> hardware specs:
>> >> >>        ccores: 4 // total cores on machine
>> >> >>        acores: 4 // available cores on machine
>> >> >>        cmemory: 4096 // total memory on machine
>> >> >>        amemory: 1024 // available memory
>> >> >>        cpuload: 94%
>> >> >> This field would be updated by our storage resource manager:
>> >> >>        GMETRIC["disk"]: 1000000
>> >> >> And, for instance, these fields would be updated by a network
>> >> >> resource manager:
>> >> >>        GMETRIC["NETIO"]: { "in":100, "out":200 }
>> >> >>
>> >> >> So, as you can see, different processes would manage the same
>> >> >> document (just different attributes in it).
>> >> >>
>> >> >> And the REALLY cool thing about the Views is the fact that our
>> >> >> customers could VERY easily adapt the database so that it would
>> >> >> store THEIR extra data and shove it in a generic parameter that our
>> >> >> software would understand (i.e. the GMETRIC parameters are generic
>> >> >> metrics...).
>> >> >>
>> >> >> So, based on these requirements, do you have any suggestions on how
>> >> >> we should store our data (keeping its structure easy enough for
>> >> >> external consumers to maintain it without having to bust their heads
>> >> >> figuring out the logic behind the document attributes)?? :o)
>> >> >>
>> >> >> Thank you!
>> >> >> Luis Miguel Silva
>> >> >>
>> >> >> On Apr 5, 2011, at 6:45 PM, Ryan Ramage <ryan.ramage@gmail.com>
>> >> >> wrote:
>> >> >>
>> >> >>> Luis,
>> >> >>>
>> >> >>> Having the rev is very important when you update a doc. It lets you
>> >> >>> know that your piece of information is out of date. This is a good
>> >> >>> thing....
>> >> >>>
>> >> >>> I am wondering whether a different way of modeling your data would
>> >> >>> let you do this update with less chance of conflict. See if you can
>> >> >>> break your docs into even smaller docs. For example, I noticed from
>> >> >>> a prior post you had a lot of Arrays in your docs. If multiple
>> >> >>> processes are changing that array, you might be better served by
>> >> >>> making each element in the array a separate doc.
>> >> >>>
>> >> >>> Ryan
>> >> >>>
>> >> >>> On Tue, Apr 5, 2011 at 4:41 PM, Luis Miguel Silva
>> >> >>> <luismiguelferreirasilva@gmail.com> wrote:
>> >> >>>> More or less!
>> >> >>>>
>> >> >>>> The most common scenario will be:
>> >> >>>> - two or more processes writing to the same document, but only to a
>> >> >>>>   specific attribute (not overwriting the whole document)
>> >> >>>>
>> >> >>>> If, by any chance, two processes overwrite the same field, i'm ok
>> >> >>>> with the last one always winning.
>> >> >>>>
>> >> >>>> Thanks,
>> >> >>>> Luis
>> >> >>>>
>> >> >>>> On Tue, Apr 5, 2011 at 4:26 PM, Robert Newson
>> >> >>>> <robert.newson@gmail.com> wrote:
>> >> >>>>> "Ideally, we would be able to update without specifying the _rev,
>> >> >>>>> just posting (or, in this case PUTting) to the document..."
>> >> >>>>>
>> >> >>>>> So you want to blindly overwrite some unknown data?
>> >> >>>>>
>> >> >>>>> B.
>> >> >>>>>
>> >> >>>>> On 5 April 2011 22:57, Zachary Zolton <zachary.zolton@gmail.com>
>> >> >>>>> wrote:
>> >> >>>>>> Luis,
>> >> >>>>>>
>> >> >>>>>> Checkout _update handlers:
>> >> >>>>>>
>> >> >>>>>> http://wiki.apache.org/couchdb/Document_Update_Handlers
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Cheers,
>> >> >>>>>>
>> >> >>>>>> Zach
>> >> >>>>>>
>> >> >>>>>> On Tue, Apr 5, 2011 at 4:46 PM, Luis Miguel Silva
>> >> >>>>>> <luismiguelferreirasilva@gmail.com> wrote:
>> >> >>>>>>> Dear all,
>> >> >>>>>>>
>> >> >>>>>>> I'm trying to play around with updates and i'm bumping into some
>> >> >>>>>>> problems.
>> >> >>>>>>>
>> >> >>>>>>> Let's imagine we have two clients that poll a document from the
>> >> >>>>>>> server at the same time and get the same _rev.
>> >> >>>>>>> Then one of them updates the doc based on the _rev it got:
>> >> >>>>>>> [root@xkitten ~]# curl -X PUT \
>> >> >>>>>>>     -d '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello2":"fred2"}' \
>> >> >>>>>>>     http://localhost:5984/benchmark/test?conflicts=true
>> >> >>>>>>> {"ok":true,"id":"test","rev":"4-03640ebafbb4fcaf127844671f8e2de7"}
>> >> >>>>>>> Then another one tries to update the doc based on the same exact
>> >> >>>>>>> _rev:
>> >> >>>>>>> [root@xkitten ~]# curl -X PUT \
>> >> >>>>>>>     -d '{"_rev":"3-0d519bcf08130bf784f3c35d79760740","hello3":"fred3"}' \
>> >> >>>>>>>     http://localhost:5984/benchmark/test?conflicts=true
>> >> >>>>>>> {"error":"conflict","reason":"Document update conflict."}
>> >> >>>>>>> [root@xkitten ~]#
>> >> >>>>>>>
>> >> >>>>>>> Is there a way to avoid this?! (like...make the update just create
>> >> >>>>>>> a new _rev or something)??
>> >> >>>>>>>
>> >> >>>>>>> Ideally, we would be able to update without specifying the _rev,
>> >> >>>>>>> just posting (or, in this case PUTting) to the document...
>> >> >>>>>>>
>> >> >>>>>>> Thoughts??
>> >> >>>>>>>
>> >> >>>>>>> Thank you,
>> >> >>>>>>> Luis
>> >> >>>>>>>
>> >> >>>>>>
>> >> >>>>>
>> >> >>>>
>> >> >>
>> >> >
>> >
>> >
>
>

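For the question that opens the thread (changing one attribute without sending a _rev), the update handlers Zach links to could look roughly like the sketch below. It is an assumption-laden illustration rather than anything posted in the thread: the function name setAttributes is hypothetical, and it assumes the request body is a JSON object holding only the attributes to change. CouchDB passes the current document (or null) and the request object to the function and stores whatever document is returned first in the result array.

```javascript
// Hypothetical update function, stored under "updates" in a design document.
// doc is the current document (null if it does not exist yet); req.id is the
// document id from the URL and req.body is the raw request body string.
function setAttributes(doc, req) {
  if (!doc) {
    if (!req.id) {
      // Nothing to update and no id to create: store nothing.
      return [null, JSON.stringify({ error: "missing document id" })];
    }
    doc = { _id: req.id };  // first write for this id creates the document
  }

  // Assumption: the body is a JSON object of attribute -> new value.
  var changes = JSON.parse(req.body);
  for (var name in changes) {
    // Leave CouchDB's own fields (_id, _rev, ...) untouched.
    if (name.charAt(0) !== "_") {
      doc[name] = changes[name];
    }
  }

  // First element is saved, second element is the HTTP response body.
  return [doc, JSON.stringify({ ok: true, id: doc._id })];
}
```

Assuming a design document named "example" (a placeholder), the second curl call above would become a PUT of '{"hello3":"fred3"}' to http://localhost:5984/benchmark/_design/example/_update/setAttributes/test, with no _rev in the body. Because the merge happens on the server against the current revision, the last writer wins per attribute, which matches what Luis says he is fine with. As far as I understand, two such requests racing on the same document can still be answered with a conflict, so retrying on a 409 response remains sensible.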