couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Write Performance
Date Thu, 08 Jan 2009 12:36:38 GMT

On 8 Jan 2009, at 03:12, Paul Davis wrote:

> On Wed, Jan 7, 2009 at 3:47 PM, Josh Bryan <jbryan@cashnetusa.com> wrote:
>> Thanks for all the replies. I'll upgrade couch and erlang to the latest
>> and retest.  Yes, this is a one-time import, but 70 million records at
>> 50-60 writes a second doesn't mean a day, it means 2 weeks or more.  I
>> don't mind throwing extra hardware at the problem, but I just want to
>> make sure I'm throwing extra hardware in the right place and using
>> existing hardware as best as I can.  If writes to all DBs are serialized
>> in a single thread, then if I partition the data into two DBs and fire
>> up two copies of couch, I should be able to make use of another
>> processor on the same machine?
>
> Each DB should get its own updater process, I believe, so yes, this
> should lead to a speedup.

Depending on how smartly Erlang distributes the DB writer processes over
CPUs you might not even need to run two instances.
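
A rough sketch of the partitioning idea discussed in this thread, assuming
documents carry an `_id` and are sent through the _bulk_docs interface
mentioned below. The database names, the batch size, and the helper names
are hypothetical, and the actual HTTP POST to /<db>/_bulk_docs is left out:

```python
import json
import zlib


def pick_db(doc_id, db_names):
    """Map a document id to one database name, stably across runs."""
    return db_names[zlib.crc32(doc_id.encode("utf-8")) % len(db_names)]


def bulk_payloads(docs, db_names, batch_size=1000):
    """Yield (db_name, json_body) pairs, one per _bulk_docs request.

    Documents are buffered per database and flushed whenever a buffer
    reaches batch_size, so the full data set never sits in memory.
    """
    pending = {name: [] for name in db_names}
    for doc in docs:
        name = pick_db(doc["_id"], db_names)
        pending[name].append(doc)
        if len(pending[name]) >= batch_size:
            yield name, json.dumps({"docs": pending[name]})
            pending[name] = []
    # Flush whatever is left in each buffer.
    for name, rest in pending.items():
        if rest:
            yield name, json.dumps({"docs": rest})


if __name__ == "__main__":
    # Hypothetical database names and sample documents.
    sample = ({"_id": "doc-%d" % i, "value": i} for i in range(10))
    for db, body in bulk_payloads(sample, ["archive_a", "archive_b"], 4):
        print(db, len(json.loads(body)["docs"]))
```

Each yielded body would be POSTed to the matching database, ideally from
one client per database so the per-DB writers can work in parallel.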

Cheers
Jan
--


>
>
>> I'll test this tomorrow along with the newer versions.
>>
>> Thanks,
>> Josh
>>
>> Paul Davis wrote:
>>>
>>> Erlang 5.5.5 is borked. 5.6.x should be ok.
>>>
>>> Also, yes, writes to the database are serialized in a single thread.
>>> For reference, when storing data, are you using the _bulk_docs
>>> interface?
>>>
>>> Also, in trunk the fsync calls are turned off by default now so you
>>> should notice more speedup there.
>>>
>>> Also, if these are archived records, wouldn't this be a single time
>>> cost? Faster is always better, but if it takes a day, is that a big
>>> deal?
>>>
>>> HTH
>>> Paul
>>>
>>> On Wed, Jan 7, 2009 at 2:55 PM, Josh Bryan <jbryan@cashnetusa.com> wrote:
>>>
>>>>
>>>> Chris Anderson wrote:
>>>>
>>>>>
>>>>> On Wed, Jan 7, 2009 at 4:37 PM, Josh Bryan <jbryan@cashnetusa.com> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am looking into CouchDB as a solution to store a bunch (approx 70
>>>>>> million) archived documents.  While planning for the import process, I
>>>>>> did some benchmarking to figure out how long the import will take.  I
>>>>>> get about 50-70 inserts per second on average.  However, when I looked
>>>>>> for the bottleneck, I couldn't figure it out.  I am connected to the
>>>>>> database via a fast lan and can verify that the network is not
>>>>>> saturated.  I can also verify that disk IO is not saturated.  The only
>>>>>> clue is that of the 4 cpus on the server, it seems that only one is
>>>>>> getting fully loaded.  Also, of the 5 erlang processes I can see
>>>>>> running, only one of them seems to be getting most of the cpu time.  I
>>>>>> know that erlang is built with smp enabled, so if it is cpu bound, why
>>>>>> can't it make use of the other 3 processors?
>>>>>>
>>>>>> I thought that perhaps there was some internal write lock issue per
>>>>>> database that allowed only one thread to write to a db at a time, so I
>>>>>> tried running the benchmarks while hitting multiple databases, but still
>>>>>> got the same write rate across the databases.  Is there some globally
>>>>>> shared resource in couchdb that limits all writes to a single thread?
>>>>>>
>>>>>> Thanks,
>>>>>> Josh
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Before we can help you diagnose the performance you're seeing, could
>>>>> you tell us the version of CouchDB and the version of Erlang that you
>>>>> are using? It wouldn't hurt to describe the hardware in more detail
>>>>> either.
>>>>>
>>>>>
>>>>>
>>>>
>>>> I am seeing similar results on two systems.
>>>>
>>>> System 1:
>>>> Quad core Intel(R) Xeon(R) CPU 5160  @ 3.00GHz
>>>> 2 GB ram
>>>> Linux 2.6.18-4  -- Debian Lenny
>>>> Erlang (BEAM) emulator version 5.6.3 [source] [64-bit] [smp:4]
>>>> [async-threads:0] [kernel-poll:false]
>>>> couchdb - Apache CouchDB 0.8.0-incubating
>>>>
>>>> System 2:
>>>> Intel(R) Pentium(R) D CPU 3.00GHz
>>>> 3 GB ram
>>>> Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0]
>>>> [kernel-poll:false]
>>>> couchdb - Apache CouchDB 0.9.0a724455-incubating
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>>
>>
>

