couchdb-user mailing list archives

From: Tim Tisdall <tisd...@gmail.com>
Subject: Re: limit on number of docs updated via _bulk_docs?
Date: Tue, 05 Jun 2012 15:19:00 GMT
Okay, the change that seemed to fix things is whether I made the
deletes to an SQLite database in transactional batches or just as
auto-committing single calls.  When I put the transactions back in,
the fsockopen connection that communicates with CouchDB fails after
11k GETs.  I have no idea how the one affects the other, but I guess
it's something at the OS level.
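
For illustration, here is a minimal sketch of the two delete modes
being compared, assuming PDO with the sqlite driver (the table and
variable names are made up, not from the actual script):

    <?php
    // Hypothetical example; the real script isn't shown in the thread.
    $db = new PDO('sqlite:/path/to/tracking.db');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $stmt = $db->prepare('DELETE FROM processed WHERE id = ?');

    // Auto-committing single calls: each delete is its own implicit
    // transaction, committed (and synced) individually.
    foreach ($doneIds as $id) {
        $stmt->execute(array($id));
    }

    // Transactional batch: one explicit commit covers the whole set.
    $db->beginTransaction();
    foreach ($doneIds as $id) {
        $stmt->execute(array($id));
    }
    $db->commit();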

-Tim

On Tue, Jun 5, 2012 at 8:43 AM, Tim Tisdall <tisdall@gmail.com> wrote:
> Um...  I seem to have fixed the problem, but I'm not sure how.  :S
> The problem wasn't related to the POSTs to _bulk_docs, though, but to
> fetching documents.  After about 11k document GETs it would suddenly
> start slowing down drastically until they started timing out...  But
> maybe the cause had nothing to do with CouchDB...  I'll let you guys
> know if I manage to figure it out.
>
> -Tim
>
> On Tue, Jun 5, 2012 at 7:55 AM, Tim Tisdall <tisdall@gmail.com> wrote:
>> Yeah, I've done other bulk updates with millions of docs and had no
>> problem either; that's why I'm not sure what's going on here.
>>
>> I forgot to mention that I'm not running the script on the same
>> documents each time.  I have about 14 million docs and I'm processing
>> a different set each time, so it's not dying on a specific document
>> each time but after it's processed any 11k of the documents.
>>
>> I'll try putting the logging on debug and see what's there... at the
>> "error" level there's absolutely nothing showing up in there.
>>
>> -Tim
>>
>> On Tue, Jun 5, 2012 at 12:14 AM, CGS <cgsmcmlxxv@gmail.com> wrote:
>>> Hi Tim,
>>>
>>> I've been successfully using _bulk_docs insertions with millions of docs
>>> and had no problem (well, I had a problem when I used os:cmd/1 + cURL
>>> because I exceeded the number of "allowed" lines, but apart from that, no
>>> problem).  I would suggest checking at which doc it breaks and seeing if
>>> there is a problem with the document format (e.g., a forgotten opening or
>>> closing curly bracket or double quote, or other invalid JSON, etc.) or
>>> whether your HDD is full.  If you can't spot the error by looking at that
>>> document, try to isolate it and do a single insertion with only that
>>> document (you will get an error back and, at the least, it will give you
>>> a hint about where the problem is).
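
A minimal sketch of that kind of single-document insertion in PHP,
assuming the curl extension (the URL, database name, and document
contents are placeholders):

    <?php
    // Hypothetical: insert the one suspect document on its own so that
    // CouchDB's error response, if any, points directly at the problem.
    $doc = json_encode(array('_id' => 'suspect-doc-id', 'some_field' => 'value'));
    $ch = curl_init('http://127.0.0.1:5984/mydb');
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $doc);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    echo $status . "\n" . $body . "\n";  // 201 on success, 4xx plus a reason on failure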
>>>
>>> That's what comes to mind for now.  If I have more ideas, I'll let you
>>> know.
>>>
>>> CGS
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jun 5, 2012 at 5:07 AM, Tim Tisdall <tisdall@gmail.com> wrote:
>>>
>>>> Hopefully someone can give me an idea on this problem, because I think
>>>> I've about exhausted my own ideas.
>>>>
>>>> I'm doing a series of document updates fairly rapidly.  I was doing
>>>> the updates via PUT and having no problems except for the DB file
>>>> size growing way too fast.  I changed things to update the database in
>>>> batches using _bulk_docs.  Now I seem to have a problem with
>>>> connections to CouchDB timing out after about 11000 doc updates.  I've
>>>> tried different batch sizes, from 5 to 500 docs, but each time the
>>>> program dies with a connection timeout after about the same number of
>>>> doc updates.
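
A minimal sketch of that kind of batched update in PHP, assuming the
curl extension (the actual script uses fsockopen, per the first
message, and isn't shown; the database name and batch contents are
placeholders):

    <?php
    // Hypothetical batch update; each doc in $docs must carry its _id and
    // current _rev, or CouchDB reports a conflict for that document.
    $payload = json_encode(array('docs' => $docs));
    $ch = curl_init('http://127.0.0.1:5984/mydb/_bulk_docs');
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $results = json_decode(curl_exec($ch), true);
    curl_close($ch);
    // $results holds one entry per document: ok/id/rev on success, or an
    // error/reason pair for each document that failed.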
>>>>
>>>> I thought it might be a problem with my code (it's in PHP, and that's
>>>> usually the problem ;) ), but I tried something that I think rules
>>>> out that possibility.  I had the script running and stopped it after
>>>> about 5000 updates, then manually restarted it right away.  The
>>>> second time, the script died after about 6000 updates.  So, still
>>>> around 11000 updates across the two different processes.
>>>>
>>>> Any thoughts, guesses, things to try, or things to test?
>>>>
>>>> -Tim
>>>>
