couchdb-dev mailing list archives

From Michael Genereux <mgene...@gmail.com>
Subject Re: Idea for View & Bulk Insert combination for ad-hoc database changes
Date Mon, 19 Apr 2010 18:21:42 GMT
Thanks!  Okay, I need to reread the docs on "_changes" and on
"all_or_nothing" in the bulk API.  Also, "Document Update Handlers" were
way too complicated and wanted a record id for what I was looking to
do.  I have a CouchDB database holding a log and I want to chuck old
data easily.  Losing quick manipulation when switching to CouchDB seems
like a barrier to entry.  I can't think of a situation where I haven't
used SQL (database), sed (ASCII document), or other ad hoc tools to
manipulate records/files/documents.

Thanks for the feedback, and for the quick turnaround too.
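
A minimal client-side sketch of the log cleanup mentioned above, assuming
the log documents carry a numeric year field as in the quoted delete
example further down. The helper name buildBulkDelete and the sample
documents are hypothetical; the CouchDB-specific parts are only the _id,
_rev, and _deleted fields and the { docs: [...] } body shape that
_bulk_docs accepts.

```javascript
// Hypothetical helper: turn already-fetched documents into a _bulk_docs
// request body that deletes every document matching a predicate.
function buildBulkDelete(docs, shouldDelete) {
  return {
    docs: docs
      .filter(shouldDelete)
      // A deletion stub only needs the id, the current revision, and _deleted.
      .map((doc) => ({ _id: doc._id, _rev: doc._rev, _deleted: true })),
  };
}

// Sample input (made up), shaped like the "doc" values returned by
// GET /{db}/_all_docs?include_docs=true
const logDocs = [
  { _id: "a", _rev: "1-abc", year: 2009, msg: "old entry" },
  { _id: "b", _rev: "1-def", year: 2010, msg: "recent entry" },
];

const payload = buildBulkDelete(logDocs, (doc) => doc.year === 2009);
console.log(JSON.stringify(payload));
```

The resulting object would be POSTed to the database's _bulk_docs
endpoint. Each document in the batch is accepted or rejected on its own,
so the response has to be checked per document.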

On Mon, Apr 19, 2010 at 9:45 AM, J Chris Anderson <jchris@apache.org> wrote:
>
> On Apr 18, 2010, at 12:01 PM, Michael Genereux wrote:
>
>> Reposting this from the user list.  Figured it belongs in the dev list.
>>
>
> Sorry to leave you hanging for so long. (Vacation weekend!)
>
> There are 2 reasons CouchDB doesn't have this feature. The important
> one is that it could easily give new users the wrong idea. Coming from
> an RDBMS background you might expect something like this to be
> transactional. In Couch, it wouldn't be.
>
> Generally in Couch we've pushed work to the database client. When the
> client is running on the same box as CouchDB, there would be no savings
> in processing overhead from running these transforms on the server
> versus using an application script.
>
> The second reason is the amount of code it would require. If someone
> wants to write it, there's a decent chance it'd be accepted as a patch
> (if it were done right).
>
> Here's how I think it could be implemented (actually as a more generic
> feature that can be used for lots of things).
>
> A design document could have a ddoc.changes function, which subscribes
> to the changes feed of the database it runs in. The changes function
> could then do anything it wanted, like send an email anytime a doc is
> saved that has a doc.send_me_as_email = true field.
>
> Your use case can be accomplished by having a function that watches to
> see if any of the docs have a price that hasn't yet been updated, and
> changes the price. So it could load the document, change the price, and
> set a flag on the document that says doc.price_changed = true. It would
> ignore any documents that already had price_changed.
>
> This is almost exactly how you would accomplish this at the application
> level. The only reason to pull such an operation into the design
> document is that CouchDB would take care of keeping it running, and
> that there would be a standard way to author a changes listener.
>
> I'm not sure of its status (node.js is a moving target), but Mikeal has
> something much like this already on GitHub:
>
> http://github.com/mikeal/node.couch.js/
>
> I'd suggest understanding how this works (and why it's like it is)
> before thinking about extending CouchDB. The non-transactional nature
> of CouchDB means you need to understand the _changes feed before you
> can think about doing anything "complete" like updating all docs that
> match a given pattern.
>
> Chris
>
>
>> I was doing some ad-hoc UPDATEs and DELETEs on a SQL database the
>> other day and it crossed my mind: how could I do the same on CouchDB?
>> I don't want to write an application to do something so simple.  It
>> seems to me that I should be able to write JavaScript views, both
>> named and temporary, whose results get imported into the CouchDB
>> Bulk Document API as a single transaction.
>>
>> Here's the update example for a 10% increase in prices:
>> function( doc ) {
>>   doc.price = doc.price * 1.1;
>>   emit( null, doc );
>> }
>>
>> Here's the delete example for old records:
>> function( doc ) {
>>   if( doc.year == 2009 ) {
>>     doc._deleted = true;
>>     emit( null, doc );
>>   }
>> }
>>
>> Also, much like the INSERT ... SELECT notation, this could be used to
>> copy records.  There would be no need for the non-HTTP-compliant COPY
>> method, which does the same at the single-record level.  Changes can
>> be made to the copy on the fly.  This is very efficient, since the
>> duplication and the update occur in one transaction.
>>
>> Example to duplicate all 'foo' widgets to a set of 'bar' widgets:
>> function( doc ) {
>>   if( doc.widget_type == 'foo' ) {
>>     reset( doc );  // helper function that does:
>>                    // delete( doc._id ); delete( doc._rev );
>>                    // and any other special vars in future
>>     doc.widget_type = 'bar';
>>     emit( null, doc );
>>   }
>> }
>>
>> I like not having to learn another command, so I reused emit even
>> though the Bulk Document API won't use the first field.  If this
>> feature existed, this is how I would have expected to use it, knowing
>> that the key parameter has no value in this use of a view.
>>
>> I would love some feedback on this.
>>
>> Michael
>
>
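
The application-level approach Chris describes above (watch the changes
feed, update the price, and skip anything already flagged with
price_changed) could look roughly like the sketch below. It is not taken
from the thread or from node.couch.js: the function name
applyPriceIncrease and the sample rows are hypothetical, and it assumes
the rows come from the _changes feed requested with include_docs=true.

```javascript
// Hypothetical handler for one batch of changes-feed rows. Returns a
// _bulk_docs-shaped body holding only the documents that need saving.
function applyPriceIncrease(changeRows, factor) {
  const updates = [];
  for (const row of changeRows) {
    const doc = row.doc;
    // Skip deletions, docs without a numeric price, and docs already done.
    if (!doc || row.deleted) continue;
    if (typeof doc.price !== "number" || doc.price_changed) continue;
    // Copy the doc (keeping _id and _rev), bump the price, set the flag.
    updates.push({ ...doc, price: doc.price * factor, price_changed: true });
  }
  return { docs: updates };
}

// Sample rows (made up) in the general shape of _changes results.
const rows = [
  { seq: 1, id: "w1", doc: { _id: "w1", _rev: "1-a", price: 100 } },
  { seq: 2, id: "w2", doc: { _id: "w2", _rev: "2-b", price: 55, price_changed: true } },
  { seq: 3, id: "w3", deleted: true, doc: { _id: "w3", _rev: "2-c", _deleted: true } },
];

const result = applyPriceIncrease(rows, 1.1);
console.log(result.docs.map((d) => d._id));
```

Saving an updated document produces a new entry in the changes feed, so
the handler will see that document again. The price_changed flag is what
keeps the second pass from raising the price twice, which matters because
nothing here is transactional.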
