incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Slow db w/ lots of replications
Date Thu, 19 Jan 2012 11:02:18 GMT
Hi Peter,

When you say "not running out of disk" are you referring to running
out of free space or are you also saying you are not using all your
disk I/O? How close are you to maxing out your CPUs? Finally, what
version of Erlang are you using? There are significant improvements in
the R14 series and beyond, and in CouchDB releases since the 1.0.x
series.

B.

On 19 January 2012 07:35, CGS <cgsmcmlxxv@gmail.com> wrote:
> Hi Pete,
>
> I think the bottleneck is your hard disk, because each client tries to access
> the same memory location. To speed things up, I would choose one (or a
> combination) of the following options:
> 1. Implementing a replication queue (round-robin or some other ordering of
> access, instead of continuous replications), but that won't increase
> performance much unless revisions change quite frequently.
> 2. Using ramfs (if possible) for the databases accessed by clients, dumping
> to hard disk whenever you can (the implementation may not be so
> straightforward, and it increases the risk of data loss).
> 3. Making a dedicated database for each group of at most n clients (the
> server databases are connected to each other by a bidirectional replicator,
> which decreases the number of replicators per database but considerably
> increases the amount of hard disk space used).
> 4. Instead of using direct replication, devising a buffering application
> from which the data are transmitted to the clients and the main server.
> 5. Sharding the database (not sure if it really helps, but having a
> non-centralized database usually helps to relieve some of the stress on a
> single database).
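
A minimal sketch of option 1, assuming one-shot replications triggered through CouchDB's `_replicate` endpoint instead of continuous ones. The helper names (`build_jobs`, `post_replicate`, `run_rounds`) and all URLs are hypothetical, and the ordering policy is plain round-robin; this is an illustration of the idea, not a tested tool:

```python
import json
import urllib.request
from itertools import cycle, islice


def build_jobs(master_url, client_urls, db_names):
    """Return one-shot push and pull jobs for every client/database pair."""
    jobs = []
    for client in client_urls:
        for db in db_names:
            remote = f"{master_url}/{db}"
            local = f"{client}/{db}"
            jobs.append({"source": local, "target": remote})  # push to master
            jobs.append({"source": remote, "target": local})  # pull from master
    return jobs


def post_replicate(server_url, job):
    """POST one job to the server's _replicate endpoint; return the parsed reply."""
    req = urllib.request.Request(
        f"{server_url}/_replicate",
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_rounds(jobs, trigger, rounds=1):
    """Run the jobs strictly one at a time, cycling through them in order."""
    results = []
    for job in islice(cycle(jobs), rounds * len(jobs)):
        results.append(trigger(job))
    return results
```

For example, `run_rounds(jobs, lambda job: post_replicate("http://master:5984", job))` would run one full pass with only a single replication active at any moment. Whether that actually helps depends on how often revisions change, as noted in option 1.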
>
> I know these options are more like "hacks" and "workarounds", but unless
> someone has a better idea, one (or a combination of some) of these may be a
> good start to improve the performance. Also, you may want to update to
> version 1.2.0 (when it becomes available). I've seen pretty nice new features
> there (according to the wiki).
>
> Hope this message will give you some ideas.
>
> CGS
>
>
>
>
> On Thu, Jan 19, 2012 at 7:09 AM, Pete Vander Giessen <petevg@gmail.com> wrote:
>
>> Hi All,
>>
>> We're running couch vers. 1.0.x, with a few patches applied to bring
>> in some newer features (such as oauth), and we're getting slow
>> performance under the circumstances outlined below, even after
>> following the advice on "Performance" on the wiki. Here's what we have
>> set up, and here's what we're seeing:
>>
>> We have one "master" server, with a lot of databases, and a bunch of
>> clients attached to it, each running their own copy of couch. The
>> clients set up push and pull replications with the master for 1 or 2 of
>> the databases. There's a fair amount of traffic in the replications,
>> with some documents that get updated every few seconds or minutes
>> (we're working on reducing the number of documents that receive such
>> frequent updates). The clients run compaction once every half hour or
>> so.
>>
>> With 6-8 clients attached to the master, each client replicating from
>> 2 databases, everything is hunky dory. Data replicates quickly across
>> our little cloud, and futon on the master server is responsive. Once
>> we get up to 10 clients, things begin to bog down. When we approach 20
>> clients, futon is near unusable, and clients get updates quite slowly.
>> Simply writing a file directly to a database on the master server can
>> take seconds to minutes.
>>
>> I've increased the number of internal erlang processes, increased the
>> worker threads, and increased the number of http ports and pipeline the
>> master has available for replications (following the formula given on
>> the wiki). The machine is not running out of memory or disk, though it
>> uses a lot of CPU (never quite maxes out).
>>
>> The weird thing is that some of the bottleneck seems to happen at a
>> per database level. If I hook 10 clients up to one database, and 10
>> clients up to another database, I don't see the same slowdown as I see
>> when I hook 20 clients up to one database.
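
Given that observation, one way to picture spreading clients over several databases (option 3 in the reply above) is the sketch below. It only computes the assignment; the function name `group_clients`, the `shared_N` database names, and the cap of 10 per database are hypothetical, and the group databases would still need to be replicated with each other on the server:

```python
def group_clients(client_ids, max_per_db, base_name="shared"):
    """Assign clients to databases so that no database serves more than max_per_db."""
    if max_per_db < 1:
        raise ValueError("max_per_db must be at least 1")
    groups = {}
    # Sort first so the assignment is stable between runs.
    for index, client in enumerate(sorted(client_ids)):
        db_name = f"{base_name}_{index // max_per_db}"
        groups.setdefault(db_name, []).append(client)
    return groups


clients = [f"client{n:02d}" for n in range(20)]
assignment = group_clients(clients, max_per_db=10)
```

With 20 clients and a cap of 10, this yields two databases with 10 clients each, which mirrors the 10 + 10 arrangement described above.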
>>
>> I'm not seeing errors in our logs.
>>
>> Are there any per-database bottlenecks that I can address? Does
>> anybody have performance-tweaking advice outside of the stuff on the
>> wiki?
>>
>> Thank you,
>>
>> ~PeteVG
>>
>> "The problem with Internet quotations is that many are not genuine."
>> ~ Abraham Lincoln
>>
