couchdb-user mailing list archives

From CGS <cgsmcml...@gmail.com>
Subject Re: couchdb returning empty response
Date Sun, 19 Aug 2012 10:30:42 GMT
On Sat, Aug 18, 2012 at 9:15 PM, Tim Tisdall <tisdall@gmail.com> wrote:

> So, it's possible that couchdb is running out of memory when
> processing a large JSON file?


Definitely.


> From my last example I gave, the JSON
> file is 3.9Mb which I didn't think was too big, but I do only have
> ~380Mb of RAM.  However, I am able to do several thousand similar
> _bulk_doc updates of around the same size before I see the error...
> are memory leaks possible with erlang?


It looks more like a RAM limitation per process. There may be a memory
leak, but I am not sure.


> Also, why is there nothing in
> the logs about running out of memory?  (shouldn't that be something
> the program is able to detect?)
>

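It seems CouchDB doesn't catch this type of warning: in my reproduction the
memory alarms showed up only in the Erlang shell report, not in the CouchDB log.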


>
> I switched over to using _bulk_doc's because the database grew way too
> fast if I did only 1 update at a time.  I'm doing about 5000 - 200000
> document updates each time I run my script so I've been doing the
> updates in batches of 150.
>

I don't know your requirements, but I remember a project in which I built a
round-robin buffer to feed the docs to CouchDB. In that project I had to find
a balance between the number of slices and the number of docs held per slice
in order to minimize the insertion time. Maybe this idea will help you in
your project as well.
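
As a rough illustration of the buffering idea, here is a minimal Python sketch. It is a simplified single buffer, not the round-robin arrangement from that project, and every name in it (BulkBuffer, feed, the send callback, the max_docs and max_bytes limits) is hypothetical. It assumes send is some function of your own that POSTs one payload to _bulk_docs. The buffer flushes when either the doc count or the approximate encoded size reaches its cap, so no single request grows too large.

```python
import json
from typing import Callable, Iterable, List


class BulkBuffer:
    """Accumulate docs and flush them as {"docs": [...]} payloads."""

    def __init__(self, send: Callable[[bytes], None],
                 max_docs: int = 150, max_bytes: int = 1_000_000) -> None:
        self.send = send            # hypothetical callback that POSTs one payload
        self.max_docs = max_docs    # assumed cap on docs per request
        self.max_bytes = max_bytes  # assumed cap on encoded doc bytes per request
        self._docs: List[dict] = []
        self._size = 0              # approximate encoded size of buffered docs

    def add(self, doc: dict) -> None:
        encoded_len = len(json.dumps(doc).encode("utf-8"))
        # Flush first if this doc would push the batch over the byte cap.
        if self._docs and self._size + encoded_len > self.max_bytes:
            self.flush()
        self._docs.append(doc)
        self._size += encoded_len
        # Flush as soon as the doc-count cap is reached.
        if len(self._docs) >= self.max_docs:
            self.flush()

    def flush(self) -> None:
        if not self._docs:
            return
        self.send(json.dumps({"docs": self._docs}).encode("utf-8"))
        self._docs = []
        self._size = 0


def feed(docs: Iterable[dict], send: Callable[[bytes], None],
         max_docs: int = 150, max_bytes: int = 1_000_000) -> int:
    """Push every doc through a buffer; return the number of payloads sent."""
    sent = 0

    def counting_send(payload: bytes) -> None:
        nonlocal sent
        send(payload)
        sent += 1

    buf = BulkBuffer(counting_send, max_docs, max_bytes)
    for doc in docs:
        buf.add(doc)
    buf.flush()  # send whatever is left over
    return sent
```

The two limits are the knobs to tune: smaller batches keep memory use per request low, larger ones reduce the number of round trips, so the best values have to be measured on the machine in question.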

CGS



>
> -Tim
>
> On Fri, Aug 17, 2012 at 9:33 PM, CGS <cgsmcmlxxv@gmail.com> wrote:
> > I managed to reproduce the error:
> >
> > [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] OAuth Params: []
> > [Sat, 18 Aug 2012 00:58:37 GMT] [debug] [<0.114.0>] Include Doc:
> > <<"_design/_replicator">> {1,<<91,250,44,153,238,254,43,46,180,150,45,181,10,163,207,212>>}
> > [Sat, 18 Aug 2012 00:58:37 GMT] [info] [<0.32.0>] Apache CouchDB has
> > started on http://0.0.0.0:5984/
> >
> > ...and I think I identified also the problem: too long/large JSON.
> >
> > Here is how to reproduce the error:
> >
> > 1. CouchDB error level: debug
> > 2. an extra-huge JSON file: echo -n "{\"docs\":[{\"key\":\"1\"}" >
> > my_json.json && for var in $(seq 2 2000000) ; do echo -n
> > ",{\"key\":\"${var}\"}" >> my_json.json ; done && echo -n "]}" >>
> > my_json.json
> > 3. attempting to send it with curl (requires the database "test" to
> > exist already, preferably empty):
> >
> > curl -X POST http://127.0.0.7:5984/test/_bulk_docs -H 'Content-Type: application/json' -d @my_json.json > /dev/null
> >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> >                                  Dload  Upload   Total   Spent    Left  Speed
> > 100 33.2M    0     0  100 33.2M      0   856k  0:00:39  0:00:39 --:--:--     0
> > curl: (52) Empty reply from server
> >
> > Erlang shell report for the same problem:
> >
> > =INFO REPORT==== 18-Aug-2012::03:12:57 ===
> >     alarm_handler: {set,{system_memory_high_watermark,[]}}
> >
> > =INFO REPORT==== 18-Aug-2012::03:12:57 ===
> >     alarm_handler: {set,{process_memory_high_watermark,<0.149.0>}}
> > /usr/local/lib/erlang/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has
> > closed.Erlang has closed
> >
> > Tim, try to split your JSON into smaller pieces. Bulk operations tend to
> > use a lot of memory.
> >
> > The _design/_replicator error comes with multipart file set by cURL by
> > default in such cases. Once a second piece is sent toward the server, the
> > crash is registered. The first piece report looks like:
> >
> > [Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] 'POST' /test/_bulk_docs
> > {1,1} from "127.0.0.1"
> >
> > I hope this info may help.
> >
> > CGS
> >
> >
> >
> >
> >
> >
> > On Fri, Aug 17, 2012 at 7:30 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >
> >> Okay, so it always states that _replicator line any time I manually
> >> restart the server.  I think it's just a standard logging message when
> >> the level is set to "debug".
> >>
> >> On Fri, Aug 17, 2012 at 1:13 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >> > No.  All my ids (except for design documents) are strings containing
> >> > integers.  Also, none of my design documents are called anything like
> >> > "_replicator".  The only thing with that name is in the _replicator
> >> > database which I'm not doing anything with.
> >> >
> >> > Why does it say "Include Doc"?  And what's that series of numbers
> >> > afterwards?  That log message seems to consistently occur just before
> >> > the log message about the server starting.  Is that just a normal
> >> > message you get when the server restarts and you have logging set to
> >> > "debug"?
> >> >
> >> >
> >> > On Fri, Aug 17, 2012 at 1:03 PM, Robert Newson <rnewson@apache.org> wrote:
> >> >>
> >> >> Does app_stats_test contain a document called _design/_replicator or is
> >> >> a document with that id in the body of your bulk post?
> >> >>
> >> >> B.
> >> >>
> >> >> On 17 Aug 2012, at 17:52, Tim Tisdall wrote:
> >> >>
> >> >>> I do have UTF8 characters in the JSON, but isn't that acceptable?  I
> >> >>> have no problem retrieving UTF8 encoded content from the server and I
> >> >>> have a bunch of it saved in there already too.
> >> >>>
> >> >>> On Fri, Aug 17, 2012 at 10:35 AM, CGS <cgsmcmlxxv@gmail.com> wrote:
> >> >>>> Hi,
> >> >>>>
> >> >>>> Do you by any chance have special characters (non-latin1 ones) in your
> >> >>>> JSON? That error looks strangely close to trying to transform a list of
> >> >>>> unicode characters into a binary. I might be wrong though.
> >> >>>>
> >> >>>> CGS
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> On Fri, Aug 17, 2012 at 4:09 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >> >>>>
> >> >>>>> I thought I added that to the init script before when you mentioned
> >> >>>>> it, but I checked and it was gone.  I added a "cd ~couchdb" in there
> >> >>>>> and now I no longer get eacces errors, but the process still crashes
> >> >>>>> with very little information:
> >> >>>>>
> >> >>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] 'POST'
> >> >>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
> >> >>>>> Headers: [{'Accept',"*/*"},
> >> >>>>>          {'Content-Length',"3902444"},
> >> >>>>>          {'Content-Type',"application/json"},
> >> >>>>>          {'Host',"localhost:5984"}]
> >> >>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] OAuth Params: []
> >> >>>>> [Fri, 17 Aug 2012 14:02:16 GMT] [debug] [<0.115.0>] Include Doc:
> >> >>>>> <<"_design/_replicator">> {1,<<91,250,44,153,238,254,43,46,180,150,45,181,10,163,207,212>>}
> >> >>>>> [Fri, 17 Aug 2012 14:02:17 GMT] [info] [<0.32.0>] Apache CouchDB has
> >> >>>>> started on http://127.0.0.1:5984/
> >> >>>>>
> >> >>>>>
> >> >>>>> Someone mentioned seeing the JSON that I'm submitting...  Wouldn't
> >> >>>>> mal-formed JSON throw an error?
> >> >>>>>
> >> >>>>> -Tim
> >> >>>>>
> >> >>>>>
> >> >>>>> On Fri, Aug 17, 2012 at 4:33 AM, Robert Newson <rnewson@apache.org> wrote:
> >> >>>>>>
> >> >>>>>> I've seen couchdb start despite the eacces errors before and tracked it
> >> >>>>>> down to the current working directory setting. It seems that the cwd is
> >> >>>>>> searched first, and then erlang looks elsewhere. So, if our startup script
> >> >>>>>> doesn't change it to somewhere that the couchdb user can read, you get
> >> >>>>>> spurious eacces errors.
> >> >>>>>>
> >> >>>>>> Don't ask me how I know this.
> >> >>>>>>
> >> >>>>>> B.
> >> >>>>>>
> >> >>>>>> On 16 Aug 2012, at 20:19, Tim Tisdall wrote:
> >> >>>>>>
> >> >>>>>>> Paul, did you ever solve the eacces problem you had described here:
> >> >>>>>>> http://mail-archives.apache.org/mod_mbox/couchdb-user/201106.mbox/%3C4E0B304F.5080109@lymegreen.co.uk%3E
> >> >>>>>>> I found that post from doing Google searches for my issue.
> >> >>>>>>>
> >> >>>>>>> On Tue, Aug 14, 2012 at 11:41 PM, Paul Davis
> >> >>>>>>> <paul.joseph.davis@gmail.com> wrote:
> >> >>>>>>>> On Tue, Aug 14, 2012 at 9:38 PM, Tim Tisdall <tisdall@gmail.com> wrote:
> >> >>>>>>>>> I'm still having problems with couchdb, but I'm trying out different
> >> >>>>>>>>> things to see if I can narrow down what the problem is...
> >> >>>>>>>>>
> >> >>>>>>>>> I stopped using fsockopen() in PHP and am using curl now to hopefully
> >> >>>>>>>>> be able to see more debugging info.
> >> >>>>>>>>>
> >> >>>>>>>>> I get an empty response when sending a POST to _bulk_docs.  From the
> >> >>>>>>>>> couch logs it seems like the server restarts in the middle of
> >> >>>>>>>>> processing the request.  Here's what I have in my logs:  (I have no
> >> >>>>>>>>> idea what the _replicator portion is about there, I'm currently not
> >> >>>>>>>>> using it)
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] 'POST'
> >> >>>>>>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
> >> >>>>>>>>> Headers: [{'Accept',"*/*"},
> >> >>>>>>>>>         {'Content-Length',"2802300"},
> >> >>>>>>>>>         {'Content-Type',"application/json"},
> >> >>>>>>>>>         {'Host',"localhost:5984"}]
> >> >>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] OAuth Params: []
> >> >>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [debug] [<0.115.0>] Include Doc:
> >> >>>>>>>>> <<"_design/_replicator">> {1,<<91,250,44,153,238,254,43,46,180,150,45,181,10,163,207,212>>}
> >> >>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [info] [<0.32.0>] Apache CouchDB has
> >> >>>>>>>>> started on http://127.0.0.1:5984/
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> In my code logs I have the following by running curl in verbose mode:
> >> >>>>>>>>>
> >> >>>>>>>>> * About to connect() to localhost port 5984 (#0)
> >> >>>>>>>>> *   Trying 127.0.0.1... * connected
> >> >>>>>>>>> * Connected to localhost (127.0.0.1) port 5984 (#0)
> >> >>>>>>>>>> POST /app_stats_test/_bulk_docs HTTP/1.0
> >> >>>>>>>>> Host: localhost:5984
> >> >>>>>>>>> Accept: */*
> >> >>>>>>>>> Content-Type: application/json
> >> >>>>>>>>> Content-Length: 2802300
> >> >>>>>>>>>
> >> >>>>>>>>> * Empty reply from server
> >> >>>>>>>>> * Connection #0 to host localhost left intact
> >> >>>>>>>>> curl error: 52 : Empty reply from server
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> I also tried using HTTP/1.1 and I get an empty response after
> >> >>>>>>>>> receiving only a "100 Continue", but the end result appears the same.
> >> >>>>>>>>>
> >> >>>>>>>>> -Tim
> >> >>>>>>>>
> >> >>>>>>>> If you have a request that triggers this, a good way to catch it is
> >> >>>>>>>> like such:
> >> >>>>>>>>
> >> >>>>>>>>   $ /usr/local/bin/couchdb # or however you start it
> >> >>>>>>>>   $ ps ax | grep beam.smp # Get the pid of couchdb
> >> >>>>>>>>   $ gdb
> >> >>>>>>>>      (gdb) attach $pid # Where $pid was just found with ps. Might
> >> >>>>>>>>      # throw up an access prompt
> >> >>>>>>>>      (gdb) continue
> >> >>>>>>>>      # At this point, run the command that makes couchdb reboot in a
> >> >>>>>>>>      # different console. If it happens you should see Gdb notice the
> >> >>>>>>>>      # error. Then the following:
> >> >>>>>>>>      (gdb) t a a bt
> >> >>>>>>>>
> >> >>>>>>>> And that should spew out a bunch of stack traces. If you can get that
> >> >>>>>>>> we should be able to fairly specifically narrow down the issue.
> >> >>>>>>
> >> >>>>>
> >> >>
> >>
>
