From: Apache Wiki
To: Apache Wiki
Date: Wed, 27 Mar 2013 01:03:35 -0000
Message-ID: <20130327010335.37892.73869@eos.apache.org>
Subject: [Couchdb Wiki] Update of "Partial_Updates" by MarkHahn
Reply-To: dev@couchdb.apache.org

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "Partial_Updates" page has been changed by MarkHahn:
http://wiki.apache.org/couchdb/Partial_Updates?action=diff&rev1=6&rev2=7

- This is a work in progress. I accidentally released it but I will now switch to editing it offline.
-
  <>

  <>

- Future versions of Couch DB are expected to have a built-in partial update feature. However, partial updates can be accomplished now with current versions of Couch using the existing update handler feature. While one may write their own update handler for this purpose, an example is given here that anyone can use.
+ Future versions of CouchDB are expected to have a built-in partial update feature. However, partial updates can be accomplished now with current versions of CouchDB using the existing update handler feature. While one may write their own update handler for this purpose, an example is given here that anyone can use.

  == What is a partial update? ==

- A partial update is a single HTTP request to Couch that is similar to a normal update (PUT). However the partial update request contains only information for updating (or deleting) one or more fields (or sub-fields) of a doc.
+ A partial update is a single HTTP request to CouchDB that is similar to a normal update (PUT). However the partial update request contains only information for updating (or deleting) one or more fields (or sub-fields) of a doc.

- === Why is a partial update useful? ===
+ == Why is a partial update useful? ==

- A partial update is more efficient than a normal full update. Only the change information needs to be sent over HTTP, not the entire doc. In general, changing of a single field in a doc requires reading the doc, changing it, and then putting the doc back to the DB. A partial update only needs the ID of the doc in order to make the field change.
+  * A partial update is more efficient than a normal full update.
Only the change information needs to be sent over HTTP, not the entire doc. In the general case of a full update, changing a single field in a doc requires reading the doc, changing it, and then putting the doc back to the DB. A partial update only needs the ID of the doc in order to make the field change.

- Also, partial updates allow the code in an app, or multiple apps, to be partitioned into multiple pieces where each piece of code only knows about one part of the DOC. In many cases this allows for a better separation of concerns.
+  * Partial updates allow the code in an app, or multiple apps, to be partitioned into multiple pieces where each piece of code only knows about one part of the doc. In many cases this allows for a better separation of concerns. As an example, a routine may be called with only the doc ID and then the routine may at any time update part of a doc without ever having a full copy of the doc. This is especially important when accessing a single DB from multiple apps (or workers) where a single copy of a doc can't be shared.

- As an example, a routine may called with only the doc ID and then the routine may at any time update part of a doc without ever having a full copy of the doc. This is especially important when accessing a single DB from multiple apps (or workers) where a single copy of a DOC can't be shared.
+  * A partial update, like any other update using the update handler, has less chance of a 409 collision than the sequence of getting a doc, modifying it, and then putting it back. This is because the internal update is faster than getting and putting over a TCP link. Also, when different non-overlapping parts of a doc are being updated at once, a collision can usually be resolved by simply retrying the request.
+
+
+ == Example update handler ==
+
+ While this is just an example, it can be used by anyone for accomplishing any partial update.
I, the author (Mark Hahn), am making this code available to everyone under the standard Apache 2 license.
+
+
+ === CoffeeScript version ===
+
+ {{{
+ partialUpdate: (doc, req) ->
+   if not doc then return [null, JSON.stringify status: 'nodoc']
+   for k, v of JSON.parse req.body
+     if k[0] is '/'
+       nestedDoc = doc
+       nestedKeys = k.split '/'
+       for nestedKey in nestedKeys[1..-2]
+         nestedDoc = (nestedDoc[nestedKey] ?= {})
+       k = nestedKeys[-1..-1][0]
+       if v is '__delete__' then delete nestedDoc[k]
+       else nestedDoc[k] = v
+       continue
+     if v is '__delete__' then delete doc[k]
+     else doc[k] = v
+   [doc, JSON.stringify {doc, status: 'updated'}]
+ }}}
+
+ === JavaScript version ===
+
+ {{{
+ partialUpdate: function(doc, req) {
+   // Declare the locals so they do not leak as implicit globals.
+   var k, v, nestedDoc, nestedKeys, nestedKey, _ref, _ref1, _ref2, _i, _len;
+   // No existing doc for the requested ID: save nothing and report it.
+   if (!doc) {
+     return [
+       null, JSON.stringify({
+         status: 'nodoc'
+       })
+     ];
+   }
+   _ref = JSON.parse(req.body);
+   for (k in _ref) {
+     v = _ref[k];
+     if (k[0] === '/') {
+       // Path key: walk the nested objects, creating any that are missing.
+       nestedDoc = doc;
+       nestedKeys = k.split('/');
+       _ref1 = nestedKeys.slice(1, -1);
+       for (_i = 0, _len = _ref1.length; _i < _len; _i++) {
+         nestedKey = _ref1[_i];
+         nestedDoc = ((_ref2 = nestedDoc[nestedKey]) != null ? _ref2 : nestedDoc[nestedKey] = {});
+       }
+       // The last path part names the field to set or delete.
+       k = nestedKeys.slice(-1)[0];
+       if (v === '__delete__') {
+         delete nestedDoc[k];
+       } else {
+         nestedDoc[k] = v;
+       }
+       continue;
+     }
+     // Simple top-level key.
+     if (v === '__delete__') {
+       delete doc[k];
+     } else {
+       doc[k] = v;
+     }
+   }
+   return [
+     doc, JSON.stringify({
+       doc: doc,
+       status: 'updated'
+     })
+   ];
+ }
+ }}}
+
+
+ == Installing the update handler ==
+
+ To install the update handler, add the JavaScript version of the code above to the ''updates'' property of a design doc.
+
+ See [[Document_Update_Handlers]] for general information about update handlers.




+ == Usage instructions ==

+ To quote the instructions at [[Document_Update_Handlers]] ...
- {{{
-
+   To invoke a handler, use a PUT request against the handler function with a document id: `//_design//_update//`
- Minimal Form
-
+ The update document should be contained in the HTTP request body in JSON format. When using the partial update handler listed above, the update document must use a special format. The JSON doc should consist of one hash object where each property of the object is one ''update command''.
- }}}

- The most important part of the above form is the {{{action="/simpleform/_design/simpleform/_update/simpleform"}}} which specifies the update handler that will receive the POSTed data.
+ The property key of an update command specifies which field is to be updated. It can be a simple, top-level, property key or it can be a ''path'' into an object with nested objects or arrays. A ''path'' key is indicated by a leading slash `/` and multiple parts separated by slashes.


+ Consider this original doc ...
- It's broken down into 5 key sections:
-
-  * the Database {{{db}}}
-  * the id of the design doc {{{_design/simpleform}}} itself
-  * {{{_update}}} informs CouchDB that this is an update handler and specifies the key within the ddoc that has our handler function
-  * the final {{{simpleform}}} specifies the update handler name within that ddoc, that will receive the POSTed data
-
- === Submitting the form from the terminal ===
-
- Likely you'll be fiddling with your form quite a bit while working on the update handler. In this case it makes a lot of sense simply to drive the form directly from the command line. There is more information at [[Commandline_CouchDB]], including Windows tips.
-
- {{{
- curl -vX POST http://localhost:5984/simpleform/_design/simpleform/_update/simpleform \
-      --header Content-Type:application/x-www-form-urlencoded \
-      --data-urlencode name="John Doe" \
-      --data-urlencode email="john@example.org" \
-      --data-urlencode phone="+1 (234) 567-890" \
-      --data-urlencode url="http://example.org/blog" \
-      --data-urlencode message="Y U NO HAZ CHEESBURGER" \
-      --data-urlencode submit="submit"
- }}}
-
- If you are on a unix-like system, you may enjoy the colour output afforded by [[http://httpie.org/|httpie]], a python-based curl replacement:
-
- {{{
- http --pretty --verbose --style fruity --form \
-      post http://localhost:5984/simpleform/_design/simpleform/_update/simpleform \
-      name="John Doe" \
-      email="john@example.org" \
-      phone="+1 (234) 567-890" \
-      url="http://example.org/blog" \
-      message="Y U NO HAZ CHEESBURGER" \
-      submit="submit"
- }}}
-
- === A basic update handler ===
-
- Here's a simple update handler that will receive the POSTed data as second parameter, and the previous document version if any as the first parameter. In our case, using POST, there will be no existing document so this will always be {{{null}}}. Finally this function, to help us debug the handler, conveniently returns the output of the new document, along with the request and previous doc if any. Obviously this could be HTML or a redirect to another page using custom headers, you will need to customise this to fit.
-
- {{{
- function(previous, request) {
-
-     /* during development and testing you can write data to couch.log
-      log({"previous": previous})
-      log({"request": request})
-     */
-
-     var doc = {}
-
-     if (!previous) {
-         // there's no existing document _id as we are not using PUT
-         // let's use the email address as the _id instead
-         if (request.form && request.form.email) {
-             // Extract the JSON-parsed form from the request
-             // and add in the user's email as docid
-             doc = request.form
-             doc._id = request.form.email
-         }
-     }
-     return [doc, toJSON({"request": request, "previous": previous, "doc": doc})]
- }
- }}}
-
- === Tips and Tricks ===
-
- There are a few points to cover here:
-
-  * you can use {{{log(…)}}} to write data to your couch.log file
-  * Note that there's only ever going to be additional data in the previous document if we use a PUT request and provide a URL that includes the document {{{_id}}}. The POST approach doesn't pass a new {{{_id}}} in so in our example this will be blank. However the same update handler can be used to service multiple forms and HTTP verbs.
-  * You must guard all tests {{{if (request.form && …}}} otherwise an exception will occur if a field is missing, and your document will not be written.
-  * The returned {{{request}}} object also conveniently includes a valid CouchDB {{{UUID}}} if you do not generate one of your own.
-  * When the function returns, if {{{doc}}} is empty then no data is written to CouchDB.
-  * The update handler can return almost anything, including custom headers and body. See [[Document_Update_Handlers]] for more information.
-
-
- === Results from the form ===
-
- After filling out the form and POSTing it back, you'll receive the results from {{{toJSON}}} in your browser.
You can use firebug or chrome developer tools to view the resulting text in a pretty JSON format, or copy and paste it into a terminal and use any of the JSON prettifiers out there, such as [[http://lloyd.github.com/yajl/|yajl]], which also has a {{{json_reformat}}} command distributed with it.
-
- Let's take a look in more detail over the three sections returned. The first section {{{request.info}}} is simply the current DB information, identical to {{{GET $COUCH/db_name}}}.

  {{{
  {
+   _id: "xxx",
+   _rev: "1_something",
+   field_one: "zzz",
+   field_two: "Don't bother me.",
+   topLevelObject: {
+     nestedField_one: "I'm doomed",
+     nestedField_two: 42
-   request: {
-     info: {
-       db_name: "simpleform",
-       doc_count: 3,
-       doc_del_count: 0,
-       update_seq: 32,
-       purge_seq: 0,
-       compact_running: false,
-       disk_size: 340069,
-       data_size: 158491,
-       instance_start_time: "1340837365780629",
-       disk_format_version: 6,
-       committed_update_seq: 32
-     },
-     …
- }}}
-
-   * Next comes the {{{_id}}} of the previous document version, for example if we were doing a PUT request, this would be filled, along with a {{{_rev}}} revision as well.
-   * The requested path is provided in several forms, to make it easier to match update handlers with document rewrite rules.
-   * If any {{{query}}} parameters were passsed to the URL, they would also be accessible.
    * The full headers are available as usual.
-
- {{{
- …
-     id: null,
-     uuid: "8363428f19b4bc21217044e2b30133ad",
-     method: "POST",
-     requested_path: ["simpleform", "_design", "simpleform", "_update", "simpleform"],
-     path: ["simpleform", "_design", "simpleform", "_update", "simpleform"],
-     raw_path: "/simpleform/_design/simpleform/_update/simpleform",
-     query: {},
-     headers: {
-       Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
-       Accept - Charset: "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
-       Accept - Encoding: "gzip,deflate,sdch",
-       Accept - Language: "en-US,en;q=0.8",
-       Cache - Control: "max-age=0",
-       Connection: "keep-alive",
-       Content - Length: "150",
-       Content - Type: "application/x-www-form-urlencoded",
-       Host: "localhost:5984",
-       Origin: "http://localhost:5984",
-       Referer: "http://localhost:5984/simpleform/_design/simpleform/minimalform.html",
-       User - Agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1188.0 Safari/537.1"
-     },
-     …
- }}}
-
-   * The original HTTP {{{body}}} is provided, as well as the originating {{{peer}}} IP address.
-   * The url-encoded form parameters have conveniently been extracted and unencoded into a JS object, which was stringified in our final {{{return}}}. This {{{form}}} is typically used or transformed in some way to build up the resulting document object inside the update handler function.
-
- {{{
- …
-     body: "name=John+Doe&phone=%2B1+%28234%29+987-654&email=john%40example.org&url=http%3A%2F%2Fjohn.blogger.com%2F&message=STILL+NO+CHEEZBURGER%3F&submit=submit",
-     peer: "127.0.0.1",
-     form: {
-       name: "John Doe",
-       phone: "+1 (234) 987-654",
-       email: "john@example.org",
-       url: "http://john.blogger.com/",
-       message: "STILL NO CHEEZBURGER?",
-       submit: "submit",
-       _id: "john@example.org"
-     },
-     …
- }}}
-   * {{{cookie}}} and {{{user context}}} are also available here, enabling such things are inserting usernames, or checking roles before further processing. Ensure that you are not duplicating functionality that should be in a [[Document_Update_Validation]] function.
-   * the previous document is empty as we are doing a POST request.
- {{{
- …
-     cookie: {},
-     userCtx: {
-       db: "simpleform",
-       name: null,
-       roles: []
-     },
-     secObj: {}
-   },
-   previous: null,
-   …
- }}}
-
- === Emitting the result during testing ===
-
- Finally we emit the resulting document, after our server-side update handler has run. Note that the revision {{{_rev}}} is not yet available, but as noted in [[Document_Update_Handlers]] it is possible to retrieve this.
-
- {{{
- …
-   doc: {
-     name: "John Doe",
-     phone: "+1 (234) 987-654",
-     email: "john@example.org",
-     url: "http://john.blogger.com/",
-     message: "STILL NO CHEEZBURGER?",
-     submit: "submit",
-     _id: "john@example.org"
    }
  }
  }}}

- == Retrieve the Document ==
+ A command key of `field_one` is a command to replace "zzz" with the value of the update command property. A command key of `/topLevelObject/nestedField_two` will replace 42 with the associated property value.

- After the write was successful we can retrieve the new document:
+ An update command that has the magic property value of `__delete__` will cause the corresponding original field to be deleted.
The command property `/topLevelObject/nestedField_one: "__delete__"` will delete the entire nestedField_one property.
+
+ The update handler will automatically create any objects missing from any part of a path. For example, the update command `/topLevelObject/nestedField_three/dblNest: 73` will add the missing object ''nestedField_three'' to the doc and set its ''dblNest'' field to 73. A path must not end with a slash, because the handler would then store the value under an empty key inside ''dblNest''.
+
+ This update document ...

  {{{
- $ curl --silent -HContent-Type:application/json -vXGET http://localhost:5984/simpleform/john@example.org | json_reformat
-
- * About to connect() to localhost port 5984 (#0)
- *   Trying ::1... Connection refused
- *   Trying 127.0.0.1... connected
- * Connected to localhost (127.0.0.1) port 5984 (#0)
- > GET /simpleform/john@example.org HTTP/1.1
- > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
- > Host: localhost:5984
- > Accept: */*
- > Content-Type:application/json
- >
- < HTTP/1.1 200 OK
- < Server: CouchDB/1.3.0a- (Erlang OTP/R15B01)
- < ETag: "1-5c316da64caebbebcd0f87364df2a0e7"
- < Date: Thu, 28 Jun 2012 11:31:49 GMT
- < Content-Type: text/plain; charset=utf-8
- < Content-Length: 228
- < Cache-Control: must-revalidate
- <
- { [data not shown]
- * Connection #0 to host localhost left intact
- * Closing connection #0
  {
+   field_one: "AAA",
+   "/topLevelObject/nestedField_one": "__delete__",
+   "/topLevelObject/nestedField_two": 99,
+   "/topLevelObject/nestedField_three/dblNest": 73
-   "_id": "john@example.org",
-   "_rev": "1-5c316da64caebbebcd0f87364df2a0e7",
-   "name": "John Doe",
-   "phone": "+1 (234) 987-654",
-   "email": "john@example.org",
-   "url": "http://john.blogger.com/",
-   "message": "STILL NO CHEEZBURGER?",
-   "submit": "submit"
  }
  }}}

- With the basic update handler above you should have no trouble modifying it further to perform redirects, extract/merge or modify the data available to you - new and old document versions, and user/security context as well. Don't forget that
some code is better in a [[Document_Update_Validation]] rather than in the update handler. The update handler will always act as another HTTP POST/PUT, just run conveniently inside the server. They can still suffer from document conflicts, for example.
+ will change the original document to ...

+ {{{
+ {
+   _id: "xxx",
+   _rev: "2_somethingElse",
+   field_one: "AAA",
+   field_two: "Don't bother me.",
+   topLevelObject: {
+     nestedField_two: 99,
+     nestedField_three: {
+       dblNest: 73
+     }
+   }
+ }
+ }}}
+
+ == Reserved syntax ==
+
+ Note that, given the usage instructions above, a doc key that starts with a slash or a doc value of `__delete__` cannot be written with the update handler given above. This could be solved by adding some escape mechanism to the handler. Fixing this is left to the reader.
+
+
+ == HTTP 409 error ==
+
+ Even though an update with an update handler has less chance of colliding, it is still possible for the update request to return an HTTP error 409. This is caused by some other update changing the revision of the doc while the update handler is executing. You will need to implement a retry mechanism and/or conflict resolution in the code making the HTTP request.
+
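
The retry mechanism mentioned in the HTTP 409 section could look roughly like the following. This is only a minimal sketch and not part of the wiki page: the function name `partialUpdateWithRetry`, the `send` callback, and the attempt and delay parameters are all hypothetical. `send` is assumed to be a caller-supplied async function that performs the PUT against the update handler and resolves to an object with a numeric `status`.

```javascript
// Sketch only: retry a partial update while the server answers 409.
// `send` is a hypothetical caller-supplied async function that issues the
// PUT to the update handler and resolves to { status, body }.
async function partialUpdateWithRetry(send, commands, maxAttempts = 5, baseDelayMs = 50) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await send(commands);
    // Anything other than a conflict (success or a different error) is final.
    if (response.status !== 409) {
      return { attempts: attempt, response: response };
    }
    if (attempt < maxAttempts) {
      // Simple linear back-off before sending the same commands again.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * attempt));
    }
  }
  throw new Error('partial update still conflicting after ' + maxAttempts + ' attempts');
}
```

Re-sending the same update document should be safe with the handler shown earlier, because each command either sets a field to a fixed value or deletes it, so applying it twice gives the same doc. A real `send` would wrap whatever HTTP client the app already uses; that wiring is left out here because it depends on the environment.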