incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: CouchDB becoming unusable as Database/Views increase in size.
Date Tue, 21 Dec 2010 15:57:30 GMT
On Tue, Dec 21, 2010 at 10:39 AM, Adam Kocoloski <kocolosk@apache.org> wrote:
> On Dec 21, 2010, at 4:55 AM, Bob Clary wrote:
>
>> Large Initial View sizes: Several of my views are initially created with sizes which
>> are 10-20 times the size of the compacted view. For example, I have one view which
>> when initially created can take 95G but when compacted uses less than 5G. This has
>> caused several out of disk space conditions when I've had to regenerate views for the
>> database. I know commodity disks are relatively cheap these days, but due to my
>> current hosting environment I am using relatively expensive networked storage. Asking
>> for sufficient storage for my expected database size was difficult enough, but asking
>> for 10 or more times that amount just to deal with temporary explosive view sizes is
>> probably a non-starter.
>
> This one is being worked on in https://issues.apache.org/jira/browse/COUCHDB-700 .
> Guaranteeing a minimum batch size results in a smaller index file and also speeds up
> indexing in many circumstances.
>
>> CouchDB 1.0.x: My attempt to use the 1.0.x branch was a failure due to a crash
>> immediately upon view compaction completion, which caused the views to begin
>> indexing from scratch.
>
> I agree with Paul that the timeout dropping a ref counter at the end of view
> compaction is a significant bug.  I'm guessing it depends on the particular deployment
> and size of the file being deleted.  There have been multiple attempts [1,2] to
> rewrite the reference counting system; one of those should probably be merged for
> 1.2.0.  We might be able to have some stopgap fix for 1.0.x and 1.1.x.
>
> I also have to agree with Mike and Paul that BigCouch would help you a lot here.  Even
> if you use it in a single-node setup, the ability to split a large monolithic database
> into an arbitrary number of shards can help tremendously when trying to build and
> compact indexes.
>
> Regards,
>

I should've mentioned this in my earlier email as well, but I'll
underscore the point that using BigCouch to shard your db on a single
node would still help by splitting the unit of work for a single
database into smaller pieces that are built and compacted independently.
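To put rough numbers on why that helps, here is a back-of-the-envelope sketch. The
95G/5G figures come from Bob's report earlier in the thread; the model of building and
compacting one shard at a time, with sizes dividing evenly across shards, is an
idealized assumption, not measured CouchDB or BigCouch behavior:

```python
def peak_disk_gb(initial_gb, compacted_gb, shards):
    """Estimate peak temporary disk use when a view is built and
    compacted one shard at a time.  Assumes (idealization) that the
    uncompacted and compacted sizes divide evenly across shards."""
    per_shard_initial = initial_gb / shards
    per_shard_compacted = compacted_gb / shards
    # Worst moment: every finished shard holds only its compacted index,
    # while the shard currently compacting briefly holds both its
    # uncompacted index and the compacted copy being written.
    return ((shards - 1) * per_shard_compacted
            + per_shard_initial + per_shard_compacted)

# Monolithic database: the whole 95G index plus the 5G compacted copy.
print(peak_disk_gb(95, 5, 1))   # 100.0
# Same data split into 8 shards, processed one at a time.
print(peak_disk_gb(95, 5, 8))   # 16.875
```

Under these assumptions the temporary-space requirement drops from roughly 20x the
compacted size toward a small multiple of it as the shard count grows, which is the
point about splitting the unit of work.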


> Adam
>
> [1]: https://github.com/tilgovi/couchdb/tree/ets_ref_count
> [2]: https://github.com/cloudant/bigcouch/blob/master/apps/couch/src/couch_file.erl#L483
