incubator-couchdb-user mailing list archives

From CGS <cgsmcml...@gmail.com>
Subject Re: couchdb returning empty response
Date Sun, 19 Aug 2012 20:48:27 GMT
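couchdb -k = kill the background CouchDB process so that it gets respawned.
That is the command heart executes once the Erlang VM has closed.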
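
Since the crashes in the thread quoted below follow large _bulk_docs
requests, here is a minimal sketch of splitting the updates into smaller
requests on the client side. It is only an illustration: the names
batch_docs and bulk_payload are hypothetical, the 150-document limit is
taken from the batch size mentioned in the thread, and the 1 MB byte
limit is an arbitrary assumption that would need tuning for a node with
~380Mb of RAM.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def batch_docs(docs: Iterable[Dict[str, Any]],
               max_docs: int = 150,
               max_bytes: int = 1_000_000) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of docs bounded by count and by approximate JSON size.

    Both limits are assumptions for illustration. A single document
    larger than max_bytes is still yielded, alone in its own batch.
    """
    batch: List[Dict[str, Any]] = []
    size = 0
    for doc in docs:
        doc_size = len(json.dumps(doc).encode("utf-8"))
        # Flush the current batch before it would exceed either limit.
        if batch and (len(batch) >= max_docs or size + doc_size > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch


def bulk_payload(batch: List[Dict[str, Any]]) -> bytes:
    """Wrap one batch in the {"docs": [...]} body that _bulk_docs expects."""
    return json.dumps({"docs": batch}).encode("utf-8")
```

Each payload would then be sent as its own POST to /<db>/_bulk_docs with
Content-Type: application/json, so that no single request carries more
than the chosen limits. The size check counts only the documents
themselves, so the real request body is slightly larger.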



On Sun, Aug 19, 2012 at 10:00 PM, Tim Tisdall <tisdall@gmail.com> wrote:

> stderr shows this when I hit an empty response:
>
> heart_beat_kill_pid = 17700
> heart_beat_timeout = 11
> Killed
> heart: Sun Aug 19 18:23:54 2012: Erlang has closed.
> heart: Sun Aug 19 18:23:55 2012: Executed "/usr/local/bin/couchdb -k".
> Terminating.
> heart_beat_kill_pid = 18390
> heart_beat_timeout = 11
> Killed
> heart: Sun Aug 19 18:35:18 2012: Erlang has closed.
> heart: Sun Aug 19 18:35:18 2012: Executed "/usr/local/bin/couchdb -k".
> Terminating.
> heart_beat_kill_pid = 18438
> heart_beat_timeout = 11
>
>
> So, it looks like the OS is killing the process because it's running
> out of memory.  I can see in syslog that the oom-killer is killing
> processes at exactly the same time.  What's strange, though, is
> there's no mention of oom-killer killing couchdb.  There's only
> mentions of other processes being killed.
>
>
> On Sun, Aug 19, 2012 at 8:15 AM, Robert Newson <rnewson@apache.org> wrote:
> > 3.9Mb isn't large enough to trigger memory issues on its own on a node
> > with 380M of ram. Can you use 'top' or 'atop' to see what memory
> > consumption was like before the crash? Erlang/OTP does usually report out
> > of memory errors when it crashes (to stderr which doesn't hit the .log
> > file, iirc).
> >
> > B.
> >
> >
> > On 19 Aug 2012, at 11:30, CGS wrote:
> >
> >> On Sat, Aug 18, 2012 at 9:15 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >>
> >>> So, it's possible that couchdb is running out of memory when
> >>> processing a large JSON file?
> >>
> >>
> >> Definitely.
> >>
> >>
> >>> From my last example I gave, the JSON
> >>> file is 3.9Mb which I didn't think was too big, but I do only have
> >>> ~380Mb of RAM.  However, I am able to do several thousand similar
> >>> _bulk_doc updates of around the same size before I see the error...
> >>> are memory leaks possible with erlang?
> >>
> >>
> >> It looks more like a RAM limitation per process. There may be a memory
> >> leak, but I am not sure.
> >>
> >>
> >>> Also, why is there nothing in
> >>> the logs about running out of memory?  (shouldn't that be something
> >>> the program is able to detect?)
> >>>
> >>
> >> It seems CouchDB doesn't catch this type of warning.
> >>
> >>
> >>>
> >>> I switched over to using _bulk_doc's because the database grew way too
> >>> fast if I did only 1 update at a time.  I'm doing about 5000 - 200000
> >>> document updates each time I run my script so I've been doing the
> >>> updates in batches of 150.
> >>>
> >>
> >> I don't know about your requirements, but I remember a project in which I
> >> created a round-robin to buffer and feed the docs to CouchDB. In that
> >> project I had to find the optimum between the number of slices and the
> >> number of docs I could store before feeding them to CouchDB, in order to
> >> minimize the insertion time. Maybe this idea will help you in your
> >> project as well.
> >>
> >> CGS
> >>
> >>
> >>
> >>>
> >>> -Tim
> >>>
> >>> On Fri, Aug 17, 2012 at 9:33 PM, CGS <cgsmcmlxxv@gmail.com> wrote:
> >>>> I managed to reproduce the error:
> >>>>
> >>>> [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] OAuth Params: []
> >>>> [Sat, 18 Aug 2012 00:58:37 GMT] [debug] [<0.114.0>] Include Doc:
> >>>> <<"_design/_replicator">> {1,
> >>>>     <<91,250,44,153,
> >>>>       238,254,43,46,
> >>>>       180,150,45,181,
> >>>>       10,163,207,212>>}
> >>>> [Sat, 18 Aug 2012 00:58:37 GMT] [info] [<0.32.0>] Apache CouchDB has
> >>>> started on http://0.0.0.0:5984/
> >>>>
> >>>> ...and I think I also identified the problem: the JSON is too long/large.
> >>>>
> >>>> Here is how to reproduce the error:
> >>>>
> >>>> 1. CouchDB log level: debug
> >>>> 2. an extra-huge JSON file:
> >>>>
> >>>> echo -n "{\"docs\":[{\"key\":\"1\"}" > my_json.json && for var in $(seq 2 2000000) ; do echo -n ",{\"key\":\"${var}\"}" >> my_json.json ; done && echo -n "]}" >> my_json.json
> >>>>
> >>>> 3. attempting to send it with curl (requires the database "test" to
> >>>> already exist and preferably be empty):
> >>>>
> >>>> curl -X POST http://127.0.0.7:5984/test/_bulk_docs -H 'Content-Type: application/json' -d @my_json.json > /dev/null
> >>>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> >>>>                                  Dload  Upload   Total   Spent    Left  Speed
> >>>> 100 33.2M    0     0  100 33.2M      0   856k  0:00:39  0:00:39 --:--:--     0
> >>>> curl: (52) Empty reply from server
> >>>>
> >>>> Erlang shell report for the same problem:
> >>>>
> >>>> =INFO REPORT==== 18-Aug-2012::03:12:57 ===
> >>>>    alarm_handler: {set,{system_memory_high_watermark,[]}}
> >>>>
> >>>> =INFO REPORT==== 18-Aug-2012::03:12:57 ===
> >>>>    alarm_handler: {set,{process_memory_high_watermark,<0.149.0>}}
> >>>> /usr/local/lib/erlang/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has
> >>>> closed.Erlang has closed
> >>>>
> >>>> Tim, try to split your JSON into smaller pieces. Bulk operations tend to
> >>>> use a lot of memory.
> >>>>
> >>>> The _design/_replicator error comes with multipart file set by cURL by
> >>>> default in such cases. Once a second piece is sent toward the server, the
> >>>> crash is registered. The first piece report looks like:
> >>>>
> >>>> [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] 'POST' /test/_bulk_docs
> >>>> {1,1} from "127.0.0.1"
> >>>>
> >>>> I hope this info may help.
> >>>>
> >>>> CGS
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Aug 17, 2012 at 7:30 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >>>>
> >>>>> Okay, so it always states that _replicator line any time I manually
> >>>>> restart the server.  I think it's just a standard logging message when
> >>>>> the level is set to "debug".
> >>>>>
> >>>>> On Fri, Aug 17, 2012 at 1:13 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >>>>>> No.  All my ids (except for design documents) are strings containing
> >>>>>> integers.  Also, none of my design documents are called anything like
> >>>>>> "_replicator".  The only thing with that name is in the _replicator
> >>>>>> database which I'm not doing anything with.
> >>>>>>
> >>>>>> Why does it say "Include Doc"?  And what's that series of numbers
> >>>>>> afterwards?  That log message seems to consistently occur just before
> >>>>>> the log message about the server starting.  Is that just a normal
> >>>>>> message you get when the server restarts and you have logging set to
> >>>>>> "debug"?
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Aug 17, 2012 at 1:03 PM, Robert Newson <rnewson@apache.org> wrote:
> >>>>>>>
> >>>>>>> Does app_stats_test contain a document called _design/_replicator or is
> >>>>>>> a document with that id in the body of your bulk post?
> >>>>>>>
> >>>>>>> B.
> >>>>>>>
> >>>>>>> On 17 Aug 2012, at 17:52, Tim Tisdall wrote:
> >>>>>>>
> >>>>>>>> I do have UTF8 characters in the JSON, but isn't that acceptable?  I
> >>>>>>>> have no problem retrieving UTF8 encoded content from the server and I
> >>>>>>>> have a bunch of it saved in there already too.
> >>>>>>>>
> >>>>>>>> On Fri, Aug 17, 2012 at 10:35 AM, CGS <cgsmcmlxxv@gmail.com> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> Do you happen to have special characters (non-latin1 ones) in your
> >>>>>>>>> JSON? That error looks strangely close to trying to transform a list
> >>>>>>>>> of unicode characters into a binary. I might be wrong though.
> >>>>>>>>>
> >>>>>>>>> CGS
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Fri, Aug 17, 2012 at 4:09 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> I thought I added that to the init script before when you mentioned
> >>>>>>>>>> it, but I checked and it was gone.  I added a "cd ~couchdb" in there
> >>>>>>>>>> and now I no longer get eacces errors, but the process still crashes
> >>>>>>>>>> with very little information:
> >>>>>>>>>>
> >>>>>>>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] 'POST'
> >>>>>>>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
> >>>>>>>>>> Headers: [{'Accept',"*/*"},
> >>>>>>>>>>         {'Content-Length',"3902444"},
> >>>>>>>>>>         {'Content-Type',"application/json"},
> >>>>>>>>>>         {'Host',"localhost:5984"}]
> >>>>>>>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] OAuth Params: []
> >>>>>>>>>> [Fri, 17 Aug 2012 14:02:16 GMT] [debug] [<0.115.0>] Include Doc:
> >>>>>>>>>> <<"_design/_replicator">> {1,
> >>>>>>>>>>     <<91,250,44,153,
> >>>>>>>>>>       238,254,43,46,
> >>>>>>>>>>       180,150,45,181,
> >>>>>>>>>>       10,163,207,212>>}
> >>>>>>>>>> [Fri, 17 Aug 2012 14:02:17 GMT] [info] [<0.32.0>] Apache CouchDB has
> >>>>>>>>>> started on http://127.0.0.1:5984/
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Someone mentioned seeing the JSON that I'm submitting...  Wouldn't
> >>>>>>>>>> malformed JSON throw an error?
> >>>>>>>>>>
> >>>>>>>>>> -Tim
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Aug 17, 2012 at 4:33 AM, Robert Newson <rnewson@apache.org> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> I've seen couchdb start despite the eacces errors before and tracked it
> >>>>>>>>>>> down to the current working directory setting. It seems that the cwd is
> >>>>>>>>>>> searched first, and then erlang looks elsewhere. So, if our startup script
> >>>>>>>>>>> doesn't change it to somewhere that the couchdb user can read, you get
> >>>>>>>>>>> spurious eacces errors.
> >>>>>>>>>>>
> >>>>>>>>>>> Don't ask me how I know this.
> >>>>>>>>>>>
> >>>>>>>>>>> B.
> >>>>>>>>>>>
> >>>>>>>>>>> On 16 Aug 2012, at 20:19, Tim Tisdall wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Paul, did you ever solve the eacces problem you had described here:
> >>>>>>>>>>>> http://mail-archives.apache.org/mod_mbox/couchdb-user/201106.mbox/%3C4E0B304F.5080109@lymegreen.co.uk%3E
> >>>>>>>>>>>> I found that post from doing Google searches for my issue.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Aug 14, 2012 at 11:41 PM, Paul Davis
> >>>>>>>>>>>> <paul.joseph.davis@gmail.com> wrote:
> >>>>>>>>>>>>> On Tue, Aug 14, 2012 at 9:38 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >>>>>>>>>>>>>> I'm still having problems with couchdb, but I'm trying out different
> >>>>>>>>>>>>>> things to see if I can narrow down what the problem is...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I stopped using fsockopen() in PHP and am using curl now to hopefully
> >>>>>>>>>>>>>> be able to see more debugging info.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I get an empty response when sending a POST to _bulk_docs.  From the
> >>>>>>>>>>>>>> couch logs it seems like the server restarts in the middle of
> >>>>>>>>>>>>>> processing the request.  Here's what I have in my logs:  (I have no
> >>>>>>>>>>>>>> idea what the _replicator portion is about there, I'm currently not
> >>>>>>>>>>>>>> using it)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] 'POST'
> >>>>>>>>>>>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
> >>>>>>>>>>>>>> Headers: [{'Accept',"*/*"},
> >>>>>>>>>>>>>>        {'Content-Length',"2802300"},
> >>>>>>>>>>>>>>        {'Content-Type',"application/json"},
> >>>>>>>>>>>>>>        {'Host',"localhost:5984"}]
> >>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] OAuth Params: []
> >>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [debug] [<0.115.0>] Include Doc:
> >>>>>>>>>>>>>> <<"_design/_replicator">> {1,
> >>>>>>>>>>>>>>     <<91,250,44,153,
> >>>>>>>>>>>>>>       238,254,43,46,
> >>>>>>>>>>>>>>       180,150,45,181,
> >>>>>>>>>>>>>>       10,163,207,212>>}
> >>>>>>>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [info] [<0.32.0>] Apache CouchDB has
> >>>>>>>>>>>>>> started on http://127.0.0.1:5984/
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In my code logs I have the following by running curl in verbose mode:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> * About to connect() to localhost port 5984 (#0)
> >>>>>>>>>>>>>> *   Trying 127.0.0.1... * connected
> >>>>>>>>>>>>>> * Connected to localhost (127.0.0.1) port 5984 (#0)
> >>>>>>>>>>>>>> > POST /app_stats_test/_bulk_docs HTTP/1.0
> >>>>>>>>>>>>>> Host: localhost:5984
> >>>>>>>>>>>>>> Accept: */*
> >>>>>>>>>>>>>> Content-Type: application/json
> >>>>>>>>>>>>>> Content-Length: 2802300
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> * Empty reply from server
> >>>>>>>>>>>>>> * Connection #0 to host localhost left intact
> >>>>>>>>>>>>>> curl error: 52 : Empty reply from server
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I also tried using HTTP/1.1 and I get an empty response after
> >>>>>>>>>>>>>> receiving only a "100 Continue", but the end result appears the same.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -Tim
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If you have a request that triggers this, a good way to catch it is
> >>>>>>>>>>>>> like such:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  $ /usr/local/bin/couchdb # or however you start it
> >>>>>>>>>>>>>  $ ps ax | grep beam.smp # Get the pid of couchdb
> >>>>>>>>>>>>>  $ gdb
> >>>>>>>>>>>>>     (gdb) attach $pid # Where $pid was just found with ps. Might
> >>>>>>>>>>>>>                       # throw up an access prompt
> >>>>>>>>>>>>>     (gdb) continue
> >>>>>>>>>>>>>     # At this point, run the command that makes couchdb reboot in a
> >>>>>>>>>>>>>     # different console. If it happens you should see Gdb notice the
> >>>>>>>>>>>>>     # error. Then the following:
> >>>>>>>>>>>>>     (gdb) t a a bt
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> And that should spew out a bunch of stack traces. If you can get that
> >>>>>>>>>>>>> we should be able to fairly specifically narrow down the issue.
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >
>
