On 8 Jan 2009, at 03:12, Paul Davis wrote:
> On Wed, Jan 7, 2009 at 3:47 PM, Josh Bryan <jbryan@cashnetusa.com>
> wrote:
>> Thanks for all the replies, I'll upgrade couch and erlang to the latest
>> and retest. Yes, this is a single time import, but 70 million records at
>> 50-60 writes a second doesn't mean a day, it means 2 weeks or more. I
>> don't mind throwing extra hardware at the problem, but I just want to
>> make sure I'm throwing extra hardware in the right place and using
>> existing hardware as best as I can. If writes to all DBs are serialized
>> in a single thread, then if I partition the data into two DBs and fire up
>> two copies of couch, I should be able to make use of another processor
>> on the same machine?
>
> Each DB should get its own updater process, I believe, so yes, this
> should lead to a speedup.
Depending on how smartly Erlang distributes the DB writer processes over
CPUs, you might not even need to run two instances.
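
For illustration, here is a rough sketch of the partitioning idea combined
with the _bulk_docs interface Paul asked about. It only plans the requests:
it splits documents across databases by a stable hash of the id and groups
them into _bulk_docs request bodies of the form {"docs": [...]}. The
database names, the batch size of 1000 and the helper names are assumptions
made up for this example, not anything CouchDB prescribes.

```python
import json
import zlib


def pick_db(doc_id, db_names):
    """Map a document id to one database using a stable hash (crc32)."""
    return db_names[zlib.crc32(doc_id.encode("utf-8")) % len(db_names)]


def bulk_docs_body(batch):
    """Serialize one batch as a _bulk_docs request body: {"docs": [...]}."""
    return json.dumps({"docs": batch})


def plan_import(docs, db_names, size=1000):
    """Stream (db_name, request_body) pairs, one per full or final batch.

    `docs` can be any iterable of dicts that carry an "_id" key, so the
    whole data set never has to sit in memory at once.
    """
    buffers = {name: [] for name in db_names}
    for doc in docs:
        name = pick_db(doc["_id"], db_names)
        buffers[name].append(doc)
        if len(buffers[name]) >= size:
            yield name, bulk_docs_body(buffers[name])
            buffers[name] = []
    # Flush whatever is left over in each per-database buffer.
    for name, rest in buffers.items():
        if rest:
            yield name, bulk_docs_body(rest)


if __name__ == "__main__":
    # Hypothetical database names and toy documents, for illustration only.
    sample = ({"_id": "doc-%d" % i, "n": i} for i in range(5))
    for db, body in plan_import(sample, ["archive_a", "archive_b"], size=2):
        print(db, body)
```

Each pair would then be sent as a POST to /<db_name>/_bulk_docs, ideally
from one worker per database so that each DB's updater process stays busy.
Whether that actually spreads the load over several CPUs is exactly the
thing to measure.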
Cheers
Jan
--
>
>
>> I'll test this tomorrow along with the newer versions.
>>
>> Thanks,
>> Josh
>>
>> Paul Davis wrote:
>>>
>>> Erlang 5.5.5 is borked. 5.6.x should be ok.
>>>
>>> Also, yes, writes to the database are serialized in a single thread.
>>> For reference, when storing data, are you using the _bulk_docs
>>> interface?
>>>
>>> Also, in trunk the fsync calls are turned off by default now so you
>>> should notice more speedup there.
>>>
>>> Also, if these are archived records, wouldn't this be a single time
>>> cost? Faster is always better, but if it takes a day, is that a big
>>> deal?
>>>
>>> HTH
>>> Paul
>>>
>>> On Wed, Jan 7, 2009 at 2:55 PM, Josh Bryan <jbryan@cashnetusa.com>
>>> wrote:
>>>
>>>>
>>>> Chris Anderson wrote:
>>>>
>>>>>
>>>>> On Wed, Jan 7, 2009 at 4:37 PM, Josh Bryan <jbryan@cashnetusa.com>
>>>>> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am looking into CouchDB as a solution to store a bunch (approx 70
>>>>>> million) archived documents. While planning for the import process, I
>>>>>> did some benchmarking to figure out how long the import will take. I
>>>>>> get about 50-70 inserts per second on average. However, when I looked
>>>>>> for the bottleneck, I couldn't figure it out. I am connected to the
>>>>>> database via a fast lan and can verify that the network is not
>>>>>> saturated. I can also verify that disk IO is not saturated. The only
>>>>>> clue is that of the 4 cpus on the server, it seems that only one is
>>>>>> getting fully loaded. Also, of the 5 erlang processes I can see
>>>>>> running, only one of them seems to be getting most of the cpu time. I
>>>>>> know that erlang is built with smp enabled, so if it is cpu bound, why
>>>>>> can't it make use of the other 3 processors?
>>>>>>
>>>>>> I thought that perhaps there was some internal write lock issue per
>>>>>> database that allowed only one thread to write to a db at a time, so I
>>>>>> tried running the benchmarks while hitting multiple databases, but
>>>>>> still got the same write rate across the databases. Is there some
>>>>>> globally shared resource in couchdb that limits all writes to a single
>>>>>> thread?
>>>>>>
>>>>>> Thanks,
>>>>>> Josh
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Before we can help you diagnose the performance you're seeing, could
>>>>> you tell us the version of CouchDB and the version of Erlang that you
>>>>> are using? It wouldn't hurt to describe the hardware in more detail
>>>>> either.
>>>>>
>>>>>
>>>>>
>>>>
>>>> I am seeing similar results on two systems.
>>>>
>>>> System 1:
>>>> Quad core Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
>>>> 2 GB ram
>>>> Linux 2.6.18-4 -- Debian Lenny
>>>> Erlang (BEAM) emulator version 5.6.3 [source] [64-bit] [smp:4]
>>>> [async-threads:0] [kernel-poll:false]
>>>> couchdb - Apache CouchDB 0.8.0-incubating
>>>>
>>>> System 2:
>>>> Intel(R) Pentium(R) D CPU 3.00GHz
>>>> 3 GB ram
>>>> Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0]
>>>> [kernel-poll:false]
>>>> couchdb - Apache CouchDB 0.9.0a724455-incubating
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>>
>>
>