incubator-couchdb-user mailing list archives

From Bob Clary <...@bclary.com>
Subject CouchDB becoming unusable as Database/Views increase in size.
Date Tue, 21 Dec 2010 09:55:05 GMT
Hi all,

I've been using CouchDB to track the results of testing Firefox and have 
found that as the database and view sizes have increased CouchDB is 
becoming less and less viable as a solution going forward. I don't wish 
to switch to a different database at this time but may not have a choice.

Let me say that I have looked at Jira and found others with similar 
issues, although those issues have mostly been resolved as invalid or 
already fixed. I admit that I have a hard time navigating Jira, so it is 
entirely possible I've missed issues that have already been filed. I am 
not sending this email in the threatening fashion I've seen many times 
in Bugzilla, where a user says "Fix this or I'm leaving!", but as a plea 
for some help in finding, filing or fixing the appropriate Jira issues 
which need attention.

My database currently has a compacted size of about 37G and contains a 
bit over 9 million documents. You can see examples of the view documents 
in the error log I attached to 
<https://issues.apache.org/jira/browse/COUCHDB-970>.

I am currently using CouchDB 1.0.1 on a CentOS 5 64-bit VM with 2 CPUs 
and 4G of RAM, running Erlang R14B and configured to use the 64-bit 
js-devel libraries. I temporarily tried the CouchDB 1.0.x branch to pick 
up the fix for <https://issues.apache.org/jira/browse/COUCHDB-926>, 
which was causing me problems, but had to revert to 1.0.1 due to crashes 
upon view compaction completion.

Currently, my main issues are:

Slow View generation: Recreating views from scratch is very slow. It can 
take me over 24 hours to recreate some of the larger views. Combined 
with the need to immediately compact them (see Large Initial View sizes), 
recreating views can take my application offline for users for more than 
a day. Switching to 1.0.x and back, and having to regenerate views after 
out-of-disk-space conditions, has left my application unavailable for 
most of a week.

Large Initial View sizes: Several of my views are initially created with 
sizes which are 10-20 times the size of the compacted view. For example, 
I have one view which can take 95G when initially created but uses less 
than 5G once compacted. This has caused several out-of-disk-space 
conditions when I've had to regenerate views for the database. I know 
commodity disks are relatively cheap these days, but due to my current 
hosting environment I am using relatively expensive networked storage. 
Asking for sufficient storage for my expected database size was 
difficult enough, but asking for 10 or more times that amount just to 
deal with temporary explosive view sizes is probably a non-starter.
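
As a rough illustration of the arithmetic above, here is a minimal Python 
sketch. The function names are hypothetical, and it assumes that view 
compaction writes the compacted index to a new file while the 
uncompacted one still exists, so both occupy disk at the same time. The 
figures are the ones quoted in this message, with 5G taken as an upper 
bound for the compacted view.

```python
def bloat_ratio(initial_gb, compacted_gb):
    """How many times larger the freshly built view is than its compacted form."""
    return initial_gb / compacted_gb


def peak_disk_needed(db_gb, initial_view_gb, compacted_view_gb):
    """Rough peak disk usage for one view while it is being compacted.

    Assumption: the uncompacted and compacted index files coexist until
    compaction finishes, on top of the database file itself.
    """
    return db_gb + initial_view_gb + compacted_view_gb


if __name__ == "__main__":
    # Figures quoted in this message: 37G database, one view at 95G
    # before compaction and (at most) 5G after.
    print("bloat ratio:", bloat_ratio(95, 5))
    print("peak disk (G):", peak_disk_needed(37, 95, 5))
```

Under those assumptions, a single such view pushes the peak requirement 
to well over three times the compacted database size, even though the 
steady-state footprint is only about 42G.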

CouchDB 1.0.x: My attempt to use the 1.0.x branch failed because CouchDB 
crashed immediately upon view compaction completion, which caused the 
views to begin indexing from scratch.

I would appreciate it if you would let me know whether some of these are 
known issues which have already been filed in Jira, or whether it would 
be helpful to file new issues, and what additional information I can 
provide to help get these issues resolved.

I can also help make newer releases of SpiderMonkey 1.7 available and 
help get SpiderMonkey 1.8 and later released, if that would help with 
the JavaScript performance issues CouchDB may be facing.

bc

