incubator-couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Newson <rnew...@apache.org>
Subject Re: couchdb returning empty response
Date Sun, 19 Aug 2012 12:15:40 GMT
3.9Mb isn't large enough to trigger memory issues on its own on a node with 380M of ram. Can
you use 'top' or 'atop' to see what memory consumption was like before the crash? Erlang/OTP
does usually report out of memory errors when it crashes (to stderr which doesn't hit the
.log file, iirc).

B.


On 19 Aug 2012, at 11:30, CGS wrote:

> On Sat, Aug 18, 2012 at 9:15 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> 
>> So, it's possible that couchdb is running out of memory when
>> processing a large JSON file?
> 
> 
> Definitely.
> 
> 
>> From my last example I gave, the JSON
>> file is 3.9Mb which I didn't think was too big, but I do only have
>> ~380Mb of RAM.  However, I am able to do several thousand similar
>> _bulk_doc updates of around the same size before I see the error...
>> are memory leaks possible with erlang?
> 
> 
> It looks more like a RAM limitation per process. There may be a memory
> leak, but I am not sure.
> 
> 
>> Also, why is there nothing in
>> the logs about running out of memory?  (shouldn't that be something
>> the program is able to detect?)
>> 
> 
> It seems CouchDB doesn't catch this type of warnings.
> 
> 
>> 
>> I switched over to using _bulk_doc's because the database grew way too
>> fast if I did only 1 update at a time.  I'm doing about 5000 - 200000
>> document updates each time I run my script so I've been doing the
>> updates in batches of 150.
>> 
> 
> I don't know about your requirements, but I remember a project in which I
> created a round-robin to buffer and feed the docs to CouchDB. In that
> project I had to find an optimization in between the number of slices and
> the number of docs I could store for being able to feed to CouchDB in order
> to minimize the insertion time. Maybe this idea will help you in your
> project as well.
> 
> CGS
> 
> 
> 
>> 
>> -Tim
>> 
>> On Fri, Aug 17, 2012 at 9:33 PM, CGS <cgsmcmlxxv@gmail.com> wrote:
>>> I managed to reproduce the error:
>>> 
>>> [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] OAuth Params: []
>>> [Sat, 18 Aug 2012 00:58:37 GMT] [debug] [<0.114.0>] Include Doc:
>>> <<"_design/_replicator">> {1,
>>> 
>> <<91,250,44,153,
>>> 
>> 238,254,43,46,
>>> 
>>> 180,150,45,181,
>>> 
>>> 10,163,207,212>>}
>>> [Sat, 18 Aug 2012 00:58:37 GMT] [info] [<0.32.0>] Apache CouchDB has
>>> started on http://0.0.0.0:5984/
>>> 
>>> ...and I think I identified also the problem: too long/large JSON.
>>> 
>>> Here is how to reproduce the error:
>>> 
>>> 1. CouchDB error level: debug
>>> 2. an extra-huge JSON file: echo -n "{\"docs\":[{\"key\":\"1\"}" >
>>> my_json.json && for var in $(seq 2 2000000) ; do echo -n
>>> ",{\"key\":\"${var}\"}" >> my_json.json ; done && echo -n "]}"
>>
>>> my_json.json
>>> 3. attempting to send it with curl (requires to have database "test"
>>> already existing and preferably empty):
>>> 
>>> curl -X POST http://127.0.0.7:5984/test/_bulk_docs -H 'Content-Type:
>>> application/json' -d @my_json.json > /dev/null
>>>  % Total    % Received % Xferd  Average Speed   Time    Time     Time
>>> Current
>>>                                 Dload  Upload   Total   Spent    Left
>>> Speed
>>> 100 33.2M    0     0  100 33.2M      0   856k  0:00:39  0:00:39 --:--:--
>>>  0
>>> curl: (52) Empty reply from server
>>> 
>>> Erlang shell report for the same problem:
>>> 
>>> =INFO REPORT==== 18-Aug-2012::03:12:57 ===
>>>    alarm_handler: {set,{system_memory_high_watermark,[]}}
>>> 
>>> =INFO REPORT==== 18-Aug-2012::03:12:57 ===
>>>    alarm_handler: {set,{process_memory_high_watermark,<0.149.0>}}
>>> /usr/local/lib/erlang/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has
>>> closed.Erlang has closed
>>> 
>>> Tim, try to split your JSON in smaller pieces. Bulk operations tend to
>> use
>>> a lot of memory.
>>> 
>>> The _design/_replicator error comes with multipart file set by cURL by
>>> default in such cases. Once a second piece is sent toward the server, the
>>> crash is registered. The first piece report looks like:
>>> 
>>> [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] 'POST'
>> /test/_bulk_docs
>>> {1,1} from "127.0.0.1"
>>> 
>>> I hope this info may help.
>>> 
>>> CGS
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Aug 17, 2012 at 7:30 PM, Tim Tisdall <tisdall@gmail.com> wrote:
>>> 
>>>> Okay, so it always states that _replicator line any time I manually
>>>> restart the server.  I think it's just a standard logging message when
>>>> the level is set to "debug".
>>>> 
>>>> On Fri, Aug 17, 2012 at 1:13 PM, Tim Tisdall <tisdall@gmail.com> wrote:
>>>>> No.  All my ids (except for design documents) are strings containing
>>>>> integers.  Also, none of my design documents are called anything like
>>>>> "_replicator".  The only thing with that name is in the _replicator
>>>>> database which I'm not doing anything with.
>>>>> 
>>>>> Why does it say "Include Doc"?  And what's that series of numbers
>>>>> afterwards?  That log message seems to consistently occur just before
>>>>> the log message about the server starting.  Is that just a normal
>>>>> message you get when the server restarts and you have logging set to
>>>>> "debug"?
>>>>> 
>>>>> 
>>>>> On Fri, Aug 17, 2012 at 1:03 PM, Robert Newson <rnewson@apache.org>
>>>> wrote:
>>>>>> 
>>>>>> Does app_stats_test contain a document called _design/_replicator
or
>> is
>>>> a document with that id in the body of your bulk post?
>>>>>> 
>>>>>> B.
>>>>>> 
>>>>>> On 17 Aug 2012, at 17:52, Tim Tisdall wrote:
>>>>>> 
>>>>>>> I do have UTF8 characters in the JSON, but isn't that acceptable?
 I
>>>>>>> have no problem retrieving UTF8 encoded content from the server
and
>> I
>>>>>>> have a bunch of it saved in there already too.
>>>>>>> 
>>>>>>> On Fri, Aug 17, 2012 at 10:35 AM, CGS <cgsmcmlxxv@gmail.com>
wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> Do you have somehow special characters (non-latin1 ones)
in your
>>>> JSON? That
>>>>>>>> error looks strangely close to trying to transform a list
of
>> unicode
>>>>>>>> characters into a binary. I might be wrong though.
>>>>>>>> 
>>>>>>>> CGS
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Aug 17, 2012 at 4:09 PM, Tim Tisdall <tisdall@gmail.com>
>>>> wrote:
>>>>>>>> 
>>>>>>>>> I thought I added that to the init script before when
you
>> mentioned
>>>>>>>>> it, but I checked and it was gone.  I added a "cd ~couchdb"
in
>> there
>>>>>>>>> and now I no longer get eaccess errors, but the process
still
>> crashes
>>>>>>>>> with very little information:
>>>>>>>>> 
>>>>>>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>]
'POST'
>>>>>>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
>>>>>>>>> Headers: [{'Accept',"*/*"},
>>>>>>>>>         {'Content-Length',"3902444"},
>>>>>>>>>         {'Content-Type',"application/json"},
>>>>>>>>>         {'Host',"localhost:5984"}]
>>>>>>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>]
OAuth
>> Params: []
>>>>>>>>> [Fri, 17 Aug 2012 14:02:16 GMT] [debug] [<0.115.0>]
Include Doc:
>>>>>>>>> <<"_design/_replicator">> {1,
>>>>>>>>> 
>>>>>>>>> <<91,250,44,153,
>>>>>>>>> 
>>>>>>>>> 238,254,43,46,
>>>>>>>>> 
>>>>>>>>> 180,150,45,181,
>>>>>>>>> 
>>>>>>>>> 10,163,207,212>>}
>>>>>>>>> [Fri, 17 Aug 2012 14:02:17 GMT] [info] [<0.32.0>]
Apache CouchDB
>> has
>>>>>>>>> started on http://127.0.0.1:5984/
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Someone mentioned seeing the JSON that I'm submitting...
 Wouldn't
>>>>>>>>> mal-formed JSON throw an error?
>>>>>>>>> 
>>>>>>>>> -Tim
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Aug 17, 2012 at 4:33 AM, Robert Newson <
>> rnewson@apache.org>
>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> I've seen couchdb start despite the eacces errors
before and
>>>> tracked it
>>>>>>>>> down to the current working directory setting. It seems
that the
>> cwd
>>>> is
>>>>>>>>> searched first, and then erlang looks elsewhere. So,
if our
>> startup
>>>> script
>>>>>>>>> doesn't change it to somewhere that the couchdb user
can read, you
>>>> get
>>>>>>>>> spurious eacces errors.
>>>>>>>>>> 
>>>>>>>>>> Don't ask me how I know this.
>>>>>>>>>> 
>>>>>>>>>> B.
>>>>>>>>>> 
>>>>>>>>>> On 16 Aug 2012, at 20:19, Tim Tisdall wrote:
>>>>>>>>>> 
>>>>>>>>>>> Paul, did you ever solve the eaccess problem
you had described
>>>> here:
>>>>>>>>>>> 
>>>>>>>>> 
>>>> 
>> http://mail-archives.apache.org/mod_mbox/couchdb-user/201106.mbox/%3C4E0B304F.5080109@lymegreen.co.uk%3E
>>>>>>>>>>> I found that post from doing Google searches
for my issue.
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Aug 14, 2012 at 11:41 PM, Paul Davis
>>>>>>>>>>> <paul.joseph.davis@gmail.com> wrote:
>>>>>>>>>>>> On Tue, Aug 14, 2012 at 9:38 PM, Tim Tisdall
<
>> tisdall@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>>>> I'm still having problems with couchdb,
but I'm trying out
>>>> different
>>>>>>>>>>>>> things to see if I can narrow down what
the problem is...
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I stopped using fsockopen() in PHP and
am using curl now to
>>>> hopefully
>>>>>>>>>>>>> be able to see more debugging info.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I get an empty response when sending
a POST to _bulk_docs.
>> From
>>>> the
>>>>>>>>>>>>> couch logs it seems like the server restarts
in the middle of
>>>>>>>>>>>>> processing the request.  Here's what
I have in my logs:  (I
>> have
>>>> no
>>>>>>>>>>>>> idea what the _replicator portion is
about there, I'm
>> currently
>>>> not
>>>>>>>>>>>>> using it)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug]
[<0.1255.0>] 'POST'
>>>>>>>>>>>>> /app_stats_test/_bulk_docs {1,0} from
"127.0.0.1"
>>>>>>>>>>>>> Headers: [{'Accept',"*/*"},
>>>>>>>>>>>>>        {'Content-Length',"2802300"},
>>>>>>>>>>>>>        {'Content-Type',"application/json"},
>>>>>>>>>>>>>        {'Host',"localhost:5984"}]
>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug]
[<0.1255.0>] OAuth
>>>> Params: []
>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [debug]
[<0.115.0>] Include
>> Doc:
>>>>>>>>>>>>> <<"_design/_replicator">>
{1,
>>>>>>>>>>>>> 
>>>>>>>>> <<91,250,44,153,
>>>>>>>>>>>>> 
>>>>>>>>> 238,254,43,46,
>>>>>>>>>>>>> 
>>>>>>>>> 180,150,45,181,
>>>>>>>>>>>>> 
>>>>>>>>> 10,163,207,212>>}
>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [info]
[<0.32.0>] Apache
>> CouchDB
>>>> has
>>>>>>>>>>>>> started on http://127.0.0.1:5984/
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In my code logs I have the following
by running curl in
>> verbose
>>>> mode:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> * About to connect() to localhost port
5984 (#0)
>>>>>>>>>>>>> *   Trying 127.0.0.1... * connected
>>>>>>>>>>>>> * Connected to localhost (127.0.0.1)
port 5984 (#0)
>>>>>>>>>>>>>> POST /app_stats_test/_bulk_docs HTTP/1.0
>>>>>>>>>>>>> Host: localhost:5984
>>>>>>>>>>>>> Accept: */*
>>>>>>>>>>>>> Content-Type: application/json
>>>>>>>>>>>>> Content-Length: 2802300
>>>>>>>>>>>>> 
>>>>>>>>>>>>> * Empty reply from server
>>>>>>>>>>>>> * Connection #0 to host localhost left
intact
>>>>>>>>>>>>> curl error: 52 : Empty reply from server
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I also tried using HTTP/1.1 and I get
an empty response after
>>>>>>>>>>>>> receiving only a "100 Continue", but
the end result appears
>> the
>>>> same.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -Tim
>>>>>>>>>>>> 
>>>>>>>>>>>> If you have a request that triggers this,
a good way to catch
>> it
>>>> is
>>>>>>>>> like such:
>>>>>>>>>>>> 
>>>>>>>>>>>>  $ /usr/local/bin/couchdb # or however you
start it
>>>>>>>>>>>>  $ ps ax | grep beam.smp # Get the pid of
couchdb
>>>>>>>>>>>>  $ gdb
>>>>>>>>>>>>     (gdb) attach $pid # Where $pid was just
found with ps.
>> Might
>>>>>>>>>>>> throw up an access prompt
>>>>>>>>>>>>     (gdb) continue
>>>>>>>>>>>>     # At this point, run the command that
makes couchdb reboot
>>>> in a
>>>>>>>>>>>>     # different console. If it happens you
should see Gdb
>> notice
>>>> the
>>>>>>>>>>>>     # error. Then the following:
>>>>>>>>>>>>     (gdb) t a a bt
>>>>>>>>>>>> 
>>>>>>>>>>>> And that should spew out a bunch of stack
traces. If you can
>> get
>>>> that
>>>>>>>>>>>> we should be able to fairly specifically
narrow down the issue.
>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> 


Mime
View raw message