incubator-couchdb-user mailing list archives

From "Ben Browning" <ben...@gmail.com>
Subject Re: another CouchDB evaluation Q
Date Tue, 25 Nov 2008 12:03:51 GMT
I also have a system that will be doing a large number of small
writes. I'll try to answer your questions as best I can based on my
experience so far.

On Mon, Nov 24, 2008 at 11:06 PM, Liam Staskawicz <lstask@gmail.com> wrote:
>  - is there anything intrinsic to the CouchDB design that makes it less than
> ideal for frequent small writes?  I've read a lot about how CouchDB is good
> for read-mostly applications, but not much about what kinds of tradeoffs or
> compromises might be involved in write-heavy use.

From my initial testing, write speed is not CouchDB's strong point.
With constant small writes your data files will grow very quickly and
need frequent compaction. If you are always under heavy write load,
compaction will take much longer and may require you to pause or
throttle writes so that it can finish.
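
For reference, compaction is triggered per database with a POST to its
_compact resource. The snippet below is only a minimal sketch that builds
the request without sending it; the server address and the database name
"events" are placeholders, not values from this thread.

```python
from urllib.request import Request


def compact_request(server, db):
    """Build (but do not send) a POST /{db}/_compact request."""
    return Request(
        f"{server.rstrip('/')}/{db}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder server and database name (assumptions for illustration).
req = compact_request("http://localhost:5984", "events")
# To run it against a live server: urllib.request.urlopen(req)
```

Compaction runs in the background on the server, so under sustained
write load a scheduler would still need to decide when to throttle
writers while it completes.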

To combat these problems I plan to have multiple write slaves.
Basically, writes will be load-balanced across the write slaves and
then replicated from the slaves to my read master. If a delay between
data being written and when it can be read is acceptable (and from
your description it sounds like that's fine) then this is the best
approach I see. You can add more write slaves as needed and they act
as a sort of buffer. Set up a mechanism to periodically replicate the
writes from the slaves to the master and then perform your analysis on
the data in the master. Replication should be much faster than
multiple small writes and not require as frequent compaction on the
master. The slaves won't need compaction since you can just create a
new database on a particular slave, add that new database to the load
balancer pool, remove that slave's existing database from the pool,
let that database finish replicating its changes, and delete it.
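
The rotation described above could be sketched roughly as follows. This
is an illustration under stated assumptions, not the setup actually used:
the host names, database names, and the "pool_add"/"pool_remove" step
labels are hypothetical, and the load balancer itself is left to whatever
tool you use. The only CouchDB calls relied on are PUT /{db} (create),
POST /_replicate, and DELETE /{db}. The code builds the requests in order
but does not send them.

```python
import json
from urllib.request import Request


def replicate_request(server, source, target):
    """Build a POST /_replicate request asking `server` to copy source -> target."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return Request(
        server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rotation_steps(slave, old_db, new_db, master_url):
    """Return the ordered steps for retiring old_db on one write slave.

    "http" steps carry a Request to send; "pool_add"/"pool_remove" steps
    carry a database URL for the (hypothetical) load balancer to act on.
    """
    old_url = f"{slave}/{old_db}"
    new_url = f"{slave}/{new_db}"
    return [
        ("http", Request(new_url, method="PUT")),       # create the new database
        ("pool_add", new_url),                          # start routing writes to it
        ("pool_remove", old_url),                       # stop routing writes to the old one
        ("http", replicate_request(slave, old_db, master_url)),  # drain remaining changes
        ("http", Request(old_url, method="DELETE")),    # drop the old database
    ]


# Hypothetical hosts and database names.
steps = rotation_steps(
    "http://slave1:5984", "writes_a", "writes_b", "http://master:5984/events"
)
```

A caller would walk the list in order, sending each "http" request with
urllib.request.urlopen and waiting for the replication step to succeed
before issuing the final DELETE.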

My application isn't in production yet but I hope to have it going by
the end of the year or very early next year. So, take this advice with
a grain of salt until I actually see the real-world performance. It
sounds good on paper though.


>  - is the notion of a variety of sources (threads) of write data an issue?
>  I've gathered that it's not from my investigations so far, but would like
> to make sure I've understood everything properly.


There are no limitations here that I know of. I've got a heavily
concurrent web front-end inserting data and haven't seen any issues.

- Ben
