couchdb-user mailing list archives

From Neville Franks <s...@surfulater.com>
Subject Re: Bulk Updates in CouchDB
Date Tue, 16 Nov 2010 22:02:09 GMT
Hi Jan,
Thanks for taking the time to respond in detail. I imagine most people
coming from SQL land will face various brick walls while trying to
learn the new paradigms of document-oriented DBs.

I think it is time for me to stop reading and dig my heels in with a
"proof of concept" sample app. No doubt this will be challenging,
however I'm sure I'll learn a lot. Hopefully the batch update methods
you discuss will be satisfactory both from a coding and performance
perspective.

I'm heartened to know that someone else feels that having bulk
editing on the server is a great idea and not just a stupid newbie
comment on my part.

My overriding interest in CouchDB is its replication capabilities and
offline/online use case. I have not found any other database that does
this as easily and, hopefully, as effectively as CouchDB. My plan was to
implement my own replication capability using SQLite, which I already
use, however this is a complex task, one which I'll happily leave to
others.

I'm sure more questions will follow. The SQLite community is very
active and helpful, and from what I've seen, so is CouchDB.

Tuesday, November 16, 2010, 9:48:21 PM, you wrote:

JL> Hi Neville,

JL> On 16 Nov 2010, at 10:44, Neville Franks wrote:

>> Thanks for the prompt response. I have to say that I am very, very
>> surprised that what seems to me are such basic operations aren't
>> available natively within CouchDB.

JL> It is less that this is a basic operation that isn't supported and
JL> more that it shows the difference in philosophy between CouchDB and,
JL> say, SQLite.


>> This is probably a deal breaker for my use and I would have thought
>> many others. My concern is iterating over a large number of documents
>> on a remote server just to do simple updates. It means I need to do
>> several HTTP requests (GET/PUT/DELETE) for each document in a set
>> of possibly thousands or tens of thousands. I'm in Australia and the
>> server is in the US and I would imagine this making an application
>> unusable.

JL> A couple of thoughts:

JL>  - How often does that run? Of course, the operation will be slower
JL>    than telling the server to update a bunch of fields*, but if it is
JL>    a rare occurrence, it may not be that big a deal.

JL>     * CouchDB doesn't have a notion of "fields", hence this operation
JL>       proves a little tricky.

JL>  - CouchDB could handle the bulk updating for you, but it'd essentially
JL>    do the same things you'd do, except for the HTTP overhead. If you set
JL>    it up smartly, you create a view to fetch all documents you want to
JL>    edit, update them and then send a single bulk request with all the
JL>    changes back to CouchDB.

JL>    Again, yes, you could save time transferring that data back and forth
JL>    but the main cost of the operation is more likely disk I/O that will
JL>    happen regardless.

JL>  - One mode of operation for CouchDB is distributed, offline. You could
JL>    have a CouchDB instance locally in Australia and make all your changes
JL>    there in a low-latency situation (but then, you'd probably only need
JL>    two requests for 10-100k documents) and later replicate your results
JL>    to the US.

JL>  - Even if CouchDB were to support bulk editing on the server (I think
JL>    it would be a great addition), it wouldn't guarantee any transaction
JL>    semantics. (You didn't name that specifically, but it usually comes up
JL>    quickly in these discussions.) This means that while the update operation
JL>    is in progress, other clients could possibly see some documents in the
JL>    pre and some in the post-state and your app needs to be OK with that.
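
The two-request approach from the second point above (fetch through a view, then send one bulk request) might look roughly like the following. This is only a sketch under assumptions: `db_url`, `view_path`, and the helper names are hypothetical, the view is queried with `include_docs=true` so each row carries its document, and the write goes to the database's `_bulk_docs` endpoint. Because there are no transaction semantics, the response is checked per document rather than as a whole.

```python
import json
import urllib.request


def build_bulk_payload(view_result, transform):
    """Turn a view response (queried with include_docs=true) into a _bulk_docs body."""
    docs = []
    for row in view_result["rows"]:
        doc = row.get("doc")
        if doc is None:
            continue  # this row has no document attached, nothing to edit
        # transform gets a shallow copy and returns the edited document.
        # _id and _rev must be kept so CouchDB can match the stored revision.
        docs.append(transform(dict(doc)))
    return {"docs": docs}


def failed_updates(bulk_response):
    """Return the per-document results that report an error (e.g. "conflict")."""
    return [result for result in bulk_response if "error" in result]


def bulk_update(db_url, view_path, transform):
    """Request 1 reads the documents; request 2 writes them all back at once."""
    # db_url and view_path are hypothetical placeholders, for example
    # "http://localhost:5984/mydb" and "_design/app/_view/to_edit".
    with urllib.request.urlopen(f"{db_url}/{view_path}?include_docs=true") as resp:
        view_result = json.load(resp)

    body = json.dumps(build_bulk_payload(view_result, transform)).encode("utf-8")
    request = urllib.request.Request(
        f"{db_url}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        # Some documents may have been saved while others were rejected.
        return failed_updates(json.load(resp))
```

Any entries returned by `bulk_update` would need to be re-fetched and retried by the application, since other clients may have changed those documents in the meantime.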


>> I am getting the feeling that CouchDB is great for storing lots of
>> information and getting it back in lots of interesting ways but not a
>> good fit for typical CRUD stuff that's done in SQL all the time.
>> Please correct me if I'm wrong.

JL> It is plenty good for CRUD operations. Except for the case where you
JL> want to emulate `UPDATE foo SET bar="baz" WHERE qux="quux";`.
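
On the client side, the effect of that SQL statement can be approximated by a small transform over already-fetched documents. The sketch below is illustrative only: `set_where` is a hypothetical helper, not a CouchDB feature, and it treats documents as plain dictionaries in which the matched field may be missing entirely.

```python
def set_where(docs, match_field, match_value, set_field, set_value):
    """Return updated copies of the documents whose match_field equals match_value."""
    changed = []
    for doc in docs:
        # Documents have no fixed schema, so check that the field exists
        # before comparing it.
        if match_field in doc and doc[match_field] == match_value:
            changed.append({**doc, set_field: set_value})
    return changed


# Rough equivalent of: UPDATE foo SET bar="baz" WHERE qux="quux";
docs = [
    {"_id": "1", "_rev": "1-a", "qux": "quux"},
    {"_id": "2", "_rev": "1-b", "qux": "other"},
    {"_id": "3", "_rev": "1-c"},
]
updated = set_where(docs, "qux", "quux", "bar", "baz")
```

Only the matching documents are returned, so the result can be written back without touching the rest.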

JL> The question then is how frequent "all the time" is. I know I've done
JL> my share of bulk updates in SQL land, but the apps I build rarely use
JL> that feature as one of the things they do all the time.

JL> I can see that background processes and cronjobs may have more use for
JL> that particular feature.

JL> --

JL> Come to think of it, I think I'll explore my old idea of "compaction
JL> with a transformation function" again :)

JL> Cheers
JL> Jan


--
Best regards,
  Neville Franks, Author of Surfulater - Your off-line Digital Reference Library
  Soft As It Gets Pty Ltd,  http://www.surfulater.com - Download your copy now.
  Victoria, Australia       Blog: http://blog.surfulater.com 
 

