incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: CouchDB becoming unusable as Database/Views increase in size.
Date Tue, 21 Dec 2010 13:55:13 GMT
On Tue, Dec 21, 2010 at 4:55 AM, Bob Clary <bob@bclary.com> wrote:
> Hi all,
>
> I've been using CouchDB to track the results of testing Firefox and have
> found that as the database and view sizes have increased CouchDB is becoming
> less and less viable as a solution going forward. I don't wish to switch to
> a different database at this time but may not have a choice.
>
> Let me say that I have looked at Jira and found others with similar issues
> although issues have mostly been resolved as invalid or already fixed. I do
> admit that I have a hard time navigating Jira, so it is entirely possible
> I've missed already filed issues. I am not sending this email in a
> threatening fashion that I've seen many times in bugzilla where a user says
> "Fix this or I'm leaving!", but in a plea for some help in finding, filing
> or fixing the appropriate Jira issues which need attention.
>
> My database currently has a compacted size of about 37G and contains a bit
> over 9 million documents. You can see examples of the view documents in the
> error log I attached to <https://issues.apache.org/jira/browse/COUCHDB-970>.
>

The immediate thing you could do would be to use BigCouch. Even if
you're using multiple BigCouch nodes on a single machine it should
still help you with initial file sizes and view indexing times.

> I am currently using CouchDB 1.0.1 on Centos5 64bit vm with 2CPU and 4G RAM
> running Erlang R14B and configured to use the 64bit js-devel libraries. I
> temporarily tried to use CouchDB 1.0.x to pick up the fix for
> <https://issues.apache.org/jira/browse/COUCHDB-926> which was causing me
> problems but had to revert to 1.0.1 due to crashes upon view compaction
> completion.
>
> Currently, my main issues are:
>
> Slow View generation: Recreating views from scratch is very slow. It can
> take me over 24 hours to recreate some of the larger views. Combined with
> the need to immediately compact them (see Large Initial View sizes)
> recreating views can take my application offline for users for more than a
> day. Trying to switch to 1.0.x and back and having to regenerate views after
> out of space conditions has led to my application being unavailable for most
> of a week.
>

View generation is definitely slower than I'd like. Again, in the
immediate short term, a switch to BigCouch will help you here because
you can rebuild parts of a view independently which will help with
time and disk space.

> Large Initial View sizes: Several of my views are initially created with
> sizes which are 10-20 times the size of the compacted view. For example, I
> have one view which when initially created can take 95G but when compacted
> uses less than 5G. This has caused several out of disk space conditions when
> I've had to regenerate views for the database. I know commodity disks are
> relatively cheap these days, but due to my current hosting environment I am
> using relatively expensive networked storage. Asking for sufficient storage
> for my expected database size was difficult enough, but asking for 10 or
> more times that amount just to deal with temporary explosive view sizes is
> probably a non-starter.
>

How do you have your views laid out? Remember that a design document
is indexed all at once in a single file, so it's possible you could get
speedups and smaller files by splitting them across multiple design
docs.
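
As a rough illustration of that split, here is a minimal Python sketch
that turns one design doc holding several views into one design doc
per view. The design doc name, the view names, and the doc.branch and
doc.date fields are all hypothetical, and the "<id>_<view>" naming is
just one possible convention.

```python
import copy


def split_design_doc(ddoc):
    """Return one design doc per view found in ddoc["views"].

    Fields other than _id, _rev and views (for example "language")
    are copied into every new design doc.
    """
    base_id = ddoc["_id"]
    shared = {
        key: copy.deepcopy(value)
        for key, value in ddoc.items()
        if key not in ("_id", "_rev", "views")
    }
    result = []
    for name, view in sorted(ddoc.get("views", {}).items()):
        new_doc = copy.deepcopy(shared)
        # Hypothetical naming convention: "<original id>_<view name>".
        new_doc["_id"] = f"{base_id}_{name}"
        new_doc["views"] = {name: copy.deepcopy(view)}
        result.append(new_doc)
    return result


# Hypothetical design doc with two views sharing one index file.
example = {
    "_id": "_design/results",
    "language": "javascript",
    "views": {
        "by_branch": {"map": "function(doc) { emit(doc.branch, 1); }"},
        "by_date": {"map": "function(doc) { emit(doc.date, 1); }"},
    },
}
parts = split_design_doc(example)
```

Each resulting doc then gets its own index file, so a change to one
view no longer forces the others to be rebuilt. Clients would need to
query the new design doc paths.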

Also, in 1.0.1 you should have the ability to create a view before
using it. I.e., you create the _design doc with a random id, build
its views, then rename it to its final destination.
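
A sketch of what that sequence might look like over HTTP, written as a
Python function that only lists the requests rather than sending them.
It assumes the "rename" is done with CouchDB's COPY method and a
Destination header, and that the built index is reused because, as I
understand it, the index is keyed on the view definitions rather than
on the design doc id. The database, design doc, and view names are
hypothetical.

```python
from urllib.parse import quote


def staged_view_build_plan(db, final_name, staging_name, view_name,
                           final_rev=None):
    """List the HTTP requests for building a view under a staging id.

    Returns (method, path, headers) tuples; nothing is sent. Pass
    final_rev when the final design doc already exists, so that the
    COPY overwrites that revision.
    """
    db_path = "/" + quote(db, safe="")
    staging = f"{db_path}/_design/{quote(staging_name, safe='')}"
    view = f"{staging}/_view/{quote(view_name, safe='')}"
    destination = f"_design/{final_name}"
    if final_rev:
        destination += f"?rev={final_rev}"
    return [
        # 1. Upload the new design doc under the staging id.
        ("PUT", staging, {"Content-Type": "application/json"}),
        # 2. Query one view to start indexing; this request blocks
        #    until the index is up to date.
        ("GET", f"{view}?limit=0", {}),
        # 3. Copy the staging doc over the final design doc id.
        ("COPY", staging, {"Destination": destination}),
    ]


plan = staged_view_build_plan(
    "results", "tests", "tests_staging", "by_branch", final_rev="3-abc"
)
```

Users keep hitting the old design doc until step 3, so the application
stays available while the new index builds. The staging doc can be
deleted afterwards.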

Also, depending on your reductions, if you can, it's best to use the
built-in reductions.
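
The built-in reductions are _sum, _count and _stats; they run inside
Erlang instead of going through the JavaScript view server. A small
Python sketch for spotting views that still use a custom JavaScript
reduce follows. The design doc, its view names, and the doc fields are
hypothetical.

```python
BUILTIN_REDUCES = {"_sum", "_count", "_stats"}


def custom_reduce_views(ddoc):
    """Return the names of views whose reduce is not a built-in."""
    custom = []
    for name, view in sorted(ddoc.get("views", {}).items()):
        reduce_src = view.get("reduce")
        # Map-only views have no reduce and are not flagged.
        if reduce_src is not None and reduce_src.strip() not in BUILTIN_REDUCES:
            custom.append(name)
    return custom


# Hypothetical design doc mixing custom and built-in reduces.
ddoc = {
    "_id": "_design/results",
    "views": {
        "count_by_branch": {
            "map": "function(doc) { emit(doc.branch, 1); }",
            "reduce": "function(keys, values) { return sum(values); }",
        },
        "total_by_date": {
            "map": "function(doc) { emit(doc.date, doc.failures); }",
            "reduce": "_sum",
        },
        "raw_by_date": {
            "map": "function(doc) { emit(doc.date, null); }",
        },
    },
}
flagged = custom_reduce_views(ddoc)
```

Here only count_by_branch is flagged, and its reduce could be written
as "_sum" with the same result.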

> CouchDB 1.0.x: My experience with attempting to use the 1.0.x branch was a
> failure due to the crashing immediately upon view compaction completion
> which caused the views to begin indexing from scratch.
>

This is a serious unreported bug. Please add any crash logs to Jira so
we can figure out what's going on here.

> I would appreciate it if you would let me know if some of these are known
> issues which have already been filed in Jira or if it would be helpful to
> file new issues and what additional information I can provide to help get
> these issues resolved.
>
> I can also help in making newer releases of SpiderMonkey 1.7 available and
> to help get SpiderMonkey 1.8 and later released if that will help the
> JavaScript performance issues CouchDB may be facing.
>

I think you'll definitely notice a change with that upgrade. The more
complicated your views are, the more of an impact it should have.

> bc
>
>

HTH,
Paul Davis
