incubator-couchdb-user mailing list archives

From Karel Minařík <karel.mina...@gmail.com>
Subject Re: Bulk Updates in CouchDB
Date Tue, 16 Nov 2010 13:59:48 GMT
> This is probably a deal breaker for my use and I would have thought
> many others. My concern is iterating over a large number of documents
> on a remote server just to do simple updates. It means I need to do
> several HTTP requests (GET/PUT/DELETE) for each document in a set of
> possibly thousands or tens of thousands.

I don't think that's the case, if I understand your situation. You
want to delete or update multiple documents based on some criteria.
You retrieve your documents' IDs + revision IDs via a map/reduce view
or a fulltext query, and then issue a bulk request
[http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Modify_Multiple_Documents_With_a_Single_Request].
This bulk request could easily delete/modify tens of thousands of
documents, depending on your hardware. It could (and possibly should)
run in the background, via a scheduling process (e.g. Resque).
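
For illustration, something like this sketch (Python with the
`requests` library; the database name "mydb" and the "by_status" view
that emits each document's _rev as its value are made-up placeholders):

import requests

COUCH = "http://localhost:5984"
DB = "mydb"

# 1. One request: fetch the IDs + revisions of the docs to delete,
#    from a view whose value is the document's _rev.
rows = requests.get(
    f"{COUCH}/{DB}/_design/app/_view/by_status",
    params={"key": '"stale"'},   # view keys are JSON-encoded
).json()["rows"]

# 2. One request: mark them all _deleted and POST to _bulk_docs.
docs = [{"_id": r["id"], "_rev": r["value"], "_deleted": True} for r in rows]
resp = requests.post(f"{COUCH}/{DB}/_bulk_docs", json={"docs": docs})
print(resp.json())   # per-document ok/error results (watch for conflicts)

The same _bulk_docs request works for updates: send the modified
documents (each with its _rev) instead of the _deleted stubs.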

> I'm in Australia and the server is in the US (...)

As Jan pointed out, it could be more appropriate to do those updates
locally and then replicate.
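
For example, a minimal sketch (database names and credentials are
placeholders):

import requests

# Push the locally made changes to the remote database in a single call.
resp = requests.post(
    "http://localhost:5984/_replicate",
    json={"source": "mydb",
          "target": "https://user:pass@couch.example.com/mydb"},
)
print(resp.json())   # replication status / history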

> I am getting the feeling that CouchDB is great for storing lots of
> information and getting it back in lots of interesting ways but not a
> good fit for typical CRUD stuff that's done in SQL all the time.

I would not put it this way, based on my experience (and many  
grievances with Couch and Ruby). The "typical CRUD" operations in the  
sense of "let's make a blog in 15 minutes" are 100% supported by  
Couch. Depending on the interface available for your programming
language, there's little difference from using, say, SQLite.
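
For reference, the whole CRUD round trip over HTTP looks roughly like
this (again a Python/`requests` sketch; "mydb" and the document are
placeholders):

import requests

COUCH = "http://localhost:5984/mydb"

# Create a document at a known ID.
requests.put(f"{COUCH}/post-1", json={"title": "Hello", "body": "..."})

# Read it back; the response carries the current _rev.
doc = requests.get(f"{COUCH}/post-1").json()

# Update: send the document again, including its _rev.
doc["title"] = "Hello, Couch"
requests.put(f"{COUCH}/post-1", json=doc)

# Delete also needs the latest revision.
rev = requests.get(f"{COUCH}/post-1").json()["_rev"]
requests.delete(f"{COUCH}/post-1", params={"rev": rev})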

Of course, ad hoc queries, faceted searches, etc. are all "impossible"  
to do in Couch. However, with a fulltext engine such as CouchDB-Lucene  
[https://github.com/rnewson/couchdb-lucene], they are trivial -- and  
at least in my experience, more enjoyable and more
performant than via SQL.

Karel
