incubator-couchdb-user mailing list archives

From Tim Tisdall <tisd...@gmail.com>
Subject Re: couchdb returning empty response
Date Sat, 18 Aug 2012 19:15:14 GMT
So, it's possible that CouchDB is running out of memory when
processing a large JSON file?  In the last example I gave, the JSON
file is 3.9 MB, which I didn't think was too big, but I do only have
~380 MB of RAM.  However, I am able to do several thousand similar
_bulk_docs updates of around the same size before I see the error...
Are memory leaks possible with Erlang?  Also, why is there nothing in
the logs about running out of memory?  (Shouldn't that be something
the program is able to detect?)

I switched over to using _bulk_docs because the database grew way too
fast if I did only 1 update at a time.  I'm doing about 5,000 to 200,000
document updates each time I run my script, so I've been doing the
updates in batches of 150.
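
For illustration, here is a minimal sketch of capping each batch by
encoded size as well as by document count, so that no single
_bulk_docs request body grows very large.  The function name
bulk_payloads and the max_bytes value are assumptions made up for this
example (they are not CouchDB settings); only the 150-document count
comes from the message above.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def bulk_payloads(
    docs: Iterable[Dict[str, Any]],
    max_bytes: int = 1_000_000,  # assumed cap per request body, not a CouchDB limit
    max_docs: int = 150,         # batch size mentioned in the message
) -> Iterator[str]:
    """Yield JSON bodies of the form {"docs":[...]} for POSTing to _bulk_docs.

    A new body is started whenever adding the next document would exceed
    max_docs documents or max_bytes of encoded document data. A single
    document larger than max_bytes is still sent, alone in its own body.
    """
    batch: List[str] = []
    size = 0
    for doc in docs:
        encoded = json.dumps(doc)
        doc_size = len(encoded.encode("utf-8"))
        # Flush the current batch before it would grow past either limit.
        if batch and (len(batch) >= max_docs or size + doc_size > max_bytes):
            yield '{"docs":[' + ",".join(batch) + "]}"
            batch, size = [], 0
        batch.append(encoded)
        size += doc_size
    if batch:
        yield '{"docs":[' + ",".join(batch) + "]}"
```

Each yielded string would then be sent as the body of one POST to
/<db>/_bulk_docs with Content-Type: application/json.  Whether a size
cap like this avoids the crash described in this thread is untested;
it only bounds how much JSON the server has to parse per request.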

-Tim

On Fri, Aug 17, 2012 at 9:33 PM, CGS <cgsmcmlxxv@gmail.com> wrote:
> I managed to reproduce the error:
>
> [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] OAuth Params: []
> [Sat, 18 Aug 2012 00:58:37 GMT] [debug] [<0.114.0>] Include Doc:
> <<"_design/_replicator">> {1,
>     <<91,250,44,153,
>       238,254,43,46,
>       180,150,45,181,
>       10,163,207,212>>}
> [Sat, 18 Aug 2012 00:58:37 GMT] [info] [<0.32.0>] Apache CouchDB has
> started on http://0.0.0.0:5984/
>
> ...and I think I also identified the problem: the JSON is too long/large.
>
> Here is how to reproduce the error:
>
> 1. CouchDB error level: debug
> 2. an extra-huge JSON file: echo -n "{\"docs\":[{\"key\":\"1\"}" >
> my_json.json && for var in $(seq 2 2000000) ; do echo -n
> ",{\"key\":\"${var}\"}" >> my_json.json ; done && echo -n "]}" >>
> my_json.json
> 3. attempting to send it with curl (requires that the database "test"
> already exists, preferably empty):
>
> curl -X POST http://127.0.0.7:5984/test/_bulk_docs \
>   -H 'Content-Type: application/json' -d @my_json.json > /dev/null
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100 33.2M    0     0  100 33.2M      0   856k  0:00:39  0:00:39 --:--:--     0
> curl: (52) Empty reply from server
>
> Erlang shell report for the same problem:
>
> =INFO REPORT==== 18-Aug-2012::03:12:57 ===
>     alarm_handler: {set,{system_memory_high_watermark,[]}}
>
> =INFO REPORT==== 18-Aug-2012::03:12:57 ===
>     alarm_handler: {set,{process_memory_high_watermark,<0.149.0>}}
> /usr/local/lib/erlang/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has
> closed.Erlang has closed
>
> Tim, try to split your JSON in smaller pieces. Bulk operations tend to use
> a lot of memory.
>
> The _design/_replicator error comes with multipart file set by cURL by
> default in such cases. Once a second piece is sent toward the server, the
> crash is registered. The first piece report looks like:
>
> [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] 'POST' /test/_bulk_docs
> {1,1} from "127.0.0.1"
>
> I hope this info may help.
>
> CGS
>
>
>
>
>
>
> On Fri, Aug 17, 2012 at 7:30 PM, Tim Tisdall <tisdall@gmail.com> wrote:
>
>> Okay, so it always states that _replicator line any time I manually
>> restart the server.  I think it's just a standard logging message when
>> the level is set to "debug".
>>
>> On Fri, Aug 17, 2012 at 1:13 PM, Tim Tisdall <tisdall@gmail.com> wrote:
>> > No.  All my ids (except for design documents) are strings containing
>> > integers.  Also, none of my design documents are called anything like
>> > "_replicator".  The only thing with that name is in the _replicator
>> > database which I'm not doing anything with.
>> >
>> > Why does it say "Include Doc"?  And what's that series of numbers
>> > afterwards?  That log message seems to consistently occur just before
>> > the log message about the server starting.  Is that just a normal
>> > message you get when the server restarts and you have logging set to
>> > "debug"?
>> >
>> >
>> > On Fri, Aug 17, 2012 at 1:03 PM, Robert Newson <rnewson@apache.org> wrote:
>> >>
>> >> Does app_stats_test contain a document called _design/_replicator or is
>> >> a document with that id in the body of your bulk post?
>> >>
>> >> B.
>> >>
>> >> On 17 Aug 2012, at 17:52, Tim Tisdall wrote:
>> >>
>> >>> I do have UTF8 characters in the JSON, but isn't that acceptable?  I
>> >>> have no problem retrieving UTF8 encoded content from the server and I
>> >>> have a bunch of it saved in there already too.
>> >>>
>> >>> On Fri, Aug 17, 2012 at 10:35 AM, CGS <cgsmcmlxxv@gmail.com> wrote:
>> >>>> Hi,
>> >>>>
>> >>>> Do you somehow have special characters (non-latin1 ones) in your JSON?
>> >>>> That error looks strangely close to trying to transform a list of unicode
>> >>>> characters into a binary. I might be wrong though.
>> >>>>
>> >>>> CGS
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Fri, Aug 17, 2012 at 4:09 PM, Tim Tisdall <tisdall@gmail.com> wrote:
>> >>>>
>> >>>>> I thought I added that to the init script before when you mentioned
>> >>>>> it, but I checked and it was gone.  I added a "cd ~couchdb" in there
>> >>>>> and now I no longer get eacces errors, but the process still crashes
>> >>>>> with very little information:
>> >>>>>
>> >>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] 'POST'
>> >>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
>> >>>>> Headers: [{'Accept',"*/*"},
>> >>>>>          {'Content-Length',"3902444"},
>> >>>>>          {'Content-Type',"application/json"},
>> >>>>>          {'Host',"localhost:5984"}]
>> >>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] OAuth Params: []
>> >>>>> [Fri, 17 Aug 2012 14:02:16 GMT] [debug] [<0.115.0>] Include Doc:
>> >>>>> <<"_design/_replicator">> {1,
>> >>>>>     <<91,250,44,153,
>> >>>>>       238,254,43,46,
>> >>>>>       180,150,45,181,
>> >>>>>       10,163,207,212>>}
>> >>>>> [Fri, 17 Aug 2012 14:02:17 GMT] [info] [<0.32.0>] Apache CouchDB has
>> >>>>> started on http://127.0.0.1:5984/
>> >>>>>
>> >>>>>
>> >>>>> Someone mentioned seeing the JSON that I'm submitting...  Wouldn't
>> >>>>> mal-formed JSON throw an error?
>> >>>>>
>> >>>>> -Tim
>> >>>>>
>> >>>>>
>> >>>>> On Fri, Aug 17, 2012 at 4:33 AM, Robert Newson <rnewson@apache.org> wrote:
>> >>>>>>
>> >>>>>> I've seen couchdb start despite the eacces errors before and tracked it
>> >>>>>> down to the current working directory setting. It seems that the cwd is
>> >>>>>> searched first, and then erlang looks elsewhere. So, if our startup script
>> >>>>>> doesn't change it to somewhere that the couchdb user can read, you get
>> >>>>>> spurious eacces errors.
>> >>>>>>
>> >>>>>> Don't ask me how I know this.
>> >>>>>>
>> >>>>>> B.
>> >>>>>>
>> >>>>>> On 16 Aug 2012, at 20:19, Tim Tisdall wrote:
>> >>>>>>
>> >>>>>>> Paul, did you ever solve the eacces problem you had described here:
>> >>>>>>> http://mail-archives.apache.org/mod_mbox/couchdb-user/201106.mbox/%3C4E0B304F.5080109@lymegreen.co.uk%3E
>> >>>>>>> I found that post from doing Google searches for my issue.
>> >>>>>>>
>> >>>>>>> On Tue, Aug 14, 2012 at 11:41 PM, Paul Davis
>> >>>>>>> <paul.joseph.davis@gmail.com> wrote:
>> >>>>>>>> On Tue, Aug 14, 2012 at 9:38 PM, Tim Tisdall <tisdall@gmail.com> wrote:
>> >>>>>>>>> I'm still having problems with couchdb, but I'm trying out different
>> >>>>>>>>> things to see if I can narrow down what the problem is...
>> >>>>>>>>>
>> >>>>>>>>> I stopped using fsockopen() in PHP and am using curl now to hopefully
>> >>>>>>>>> be able to see more debugging info.
>> >>>>>>>>>
>> >>>>>>>>> I get an empty response when sending a POST to _bulk_docs.  From the
>> >>>>>>>>> couch logs it seems like the server restarts in the middle of
>> >>>>>>>>> processing the request.  Here's what I have in my logs:  (I have no
>> >>>>>>>>> idea what the _replicator portion is about there, I'm currently not
>> >>>>>>>>> using it)
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] 'POST'
>> >>>>>>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
>> >>>>>>>>> Headers: [{'Accept',"*/*"},
>> >>>>>>>>>         {'Content-Length',"2802300"},
>> >>>>>>>>>         {'Content-Type',"application/json"},
>> >>>>>>>>>         {'Host',"localhost:5984"}]
>> >>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] OAuth Params: []
>> >>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [debug] [<0.115.0>] Include Doc:
>> >>>>>>>>> <<"_design/_replicator">> {1,
>> >>>>>>>>>     <<91,250,44,153,
>> >>>>>>>>>       238,254,43,46,
>> >>>>>>>>>       180,150,45,181,
>> >>>>>>>>>       10,163,207,212>>}
>> >>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [info] [<0.32.0>] Apache CouchDB has
>> >>>>>>>>> started on http://127.0.0.1:5984/
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> In my code logs I have the following by running curl in verbose mode:
>> >>>>>>>>>
>> >>>>>>>>> * About to connect() to localhost port 5984 (#0)
>> >>>>>>>>> *   Trying 127.0.0.1... * connected
>> >>>>>>>>> * Connected to localhost (127.0.0.1) port 5984 (#0)
>> >>>>>>>>> > POST /app_stats_test/_bulk_docs HTTP/1.0
>> >>>>>>>>> Host: localhost:5984
>> >>>>>>>>> Accept: */*
>> >>>>>>>>> Content-Type: application/json
>> >>>>>>>>> Content-Length: 2802300
>> >>>>>>>>>
>> >>>>>>>>> * Empty reply from server
>> >>>>>>>>> * Connection #0 to host localhost left intact
>> >>>>>>>>> curl error: 52 : Empty reply from server
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> I also tried using HTTP/1.1 and I get an empty response after
>> >>>>>>>>> receiving only a "100 Continue", but the end result appears the same.
>> >>>>>>>>>
>> >>>>>>>>> -Tim
>> >>>>>>>>
>> >>>>>>>> If you have a request that triggers this, a good way to catch it is
>> >>>>>>>> like such:
>> >>>>>>>>
>> >>>>>>>>   $ /usr/local/bin/couchdb # or however you start it
>> >>>>>>>>   $ ps ax | grep beam.smp # Get the pid of couchdb
>> >>>>>>>>   $ gdb
>> >>>>>>>>      (gdb) attach $pid # Where $pid was just found with ps. Might
>> >>>>>>>>      # throw up an access prompt
>> >>>>>>>>      (gdb) continue
>> >>>>>>>>      # At this point, run the command that makes couchdb reboot in a
>> >>>>>>>>      # different console. If it happens you should see Gdb notice the
>> >>>>>>>>      # error. Then the following:
>> >>>>>>>>      (gdb) t a a bt
>> >>>>>>>>
>> >>>>>>>> And that should spew out a bunch of stack traces. If you can get that
>> >>>>>>>> we should be able to fairly specifically narrow down the issue.
>> >>>>>>
>> >>>>>
>> >>
>>
