couchdb-user mailing list archives

From Joan Touzet <woh...@apache.org>
Subject Re: Crashing due to memory use
Date Tue, 31 Jan 2017 21:00:43 GMT
Hi Tayven,

> Jan,

Joan, actually. Jan is also on this thread. :)

A few things stand out here. I'm going to heavily trim your
emails for clarity.

----- Original Message -----

> At the time of crash the kernel is reporting that beam.smp is
> consuming 62G of memory + 32G of swap.

This is unusual. And you don't see similar high memory use for
couchjs processes?

I recommend reducing or disabling swap for best performance. 

Another option is to adjust the kernel's OOM killer score for
beam.smp relative to other processes, such as couchjs. This is done
through the tunable /proc/<pid>/oom_score_adj: set the value
strongly negative for beam.smp (the range is -1000 to +1000) and
mildly higher for couchjs. Documentation for this is at

https://www.kernel.org/doc/Documentation/filesystems/proc.txt

in section 3.1.
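
As a rough sketch of the above (the function name and the PROC_ROOT
override are my own, purely for illustration, and the scores are
examples rather than recommended values), something like this would
write the adjustment for a set of PIDs:

```shell
#!/bin/sh
# Write an OOM score adjustment for one or more PIDs.
# PROC_ROOT is a hypothetical override for dry runs; it defaults to /proc.
set_oom_score_adj() {
    score=$1
    shift
    for pid in "$@"; do
        # Each process exposes its own oom_score_adj file under /proc/<pid>/.
        printf '%s\n' "$score" > "${PROC_ROOT:-/proc}/$pid/oom_score_adj" || return 1
    done
}

# Illustrative use, run as root (example scores, not recommendations):
#   set_oom_score_adj -800 $(pgrep -x beam.smp)
#   set_oom_score_adj 200 $(pgrep -x couchjs)
```

Lowering a score normally needs root, and the value belongs to the
running process, so it has to be reapplied after a restart.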

> In Local.ini the changes from the base file are:
[snip]

>  max_connections = 1024

Presuming this is in your [httpd] section, it won't have much
effect, since this only affects the old interface (running on
port 5986).

Using netstat, how many active connections do you have open on a
server when beam.smp is eating lots of RAM? On which ports? A
summary report would be useful.
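
For the summary, a minimal sketch along these lines should do (the
function name is made up here, and it assumes the usual Linux
`netstat -tan` column layout, with the local address in column 4 and
the state in column 6):

```shell
#!/bin/sh
# Summarise `netstat -tan` output read from stdin: prints
# "<local port> <count>" for ESTABLISHED TCP connections, sorted by port.
# The port is taken as the text after the last ":" of the local address,
# which also covers IPv6-style addresses.
summarize_connections() {
    awk '$6 == "ESTABLISHED" {
             n = split($4, parts, ":")
             count[parts[n]]++
         }
         END { for (port in count) print port, count[port] }' | sort -n
}

# Typical use on the server:
#   netstat -tan | summarize_connections
```

Running it while beam.smp is large, and again while it is not, gives
two reports that can be compared side by side.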

>  max_dbs_open = 500

How many databases do you actually have on the machine? You
mention two databases, but it's unclear to me how many are actually
resident. This includes any backup copies of your database you may
have in place visible to CouchDB.
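
One quick way to get that number, as a sketch only (the function name
and the localhost URL are assumptions about your setup), is to count
the names CouchDB returns from GET /_all_dbs:

```shell
#!/bin/sh
# Count the names in a JSON array of strings read from stdin,
# e.g. the response body of CouchDB's GET /_all_dbs.
# This is a quick count, not a JSON parser: it counts quoted strings.
count_dbs() {
    grep -o '"[^"]*"' | wc -l | tr -d ' '
}

# Typical use (host and port are assumptions):
#   curl -s http://localhost:5984/_all_dbs | count_dbs
```

This counts databases as CouchDB lists them. On a clustered install
each database is split into shard files, so the number of files open
on a node can differ from this figure.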

>  nodejs = /usr/local/bin/node /home/couchdb/couchdb/share/server/main.js

Apache CouchDB considers this view server experimental. You run it
at your own risk. That said, if it were at fault, I'd expect to see
nodejs processes consuming more RAM and CPU than beam.smp itself.
Also, you'd have to be declaring your view's language as nodejs
instead of javascript, which you're not doing per your sample
design document.

> The memory leak happens when we kick off a new view.

Is this a view you've ever run in production on Cloudant, or
something new you're trying only on your local instance? Is this
view perhaps using the experimental nodejs view server?

-----

Are you launching couchdb with any special flags being passed to
the Erlang erl process besides ERL_MAX_PORTS?

Note that in recent versions of Erlang, ERL_MAX_PORTS has been 
replaced by the +Q flag. ERL_MAX_PORTS has no effect on these
newer versions. Check the documentation for your specific version
of Erlang.

Recommendation:

If you're going to be running a big cluster on your own, read Fred
Hebert's great free book, Stuff Goes Bad: Erlang in Anger,

  http://www.erlang-in-anger.com/

and pay special attention to chapters 4, 5 & 7. Specifically, if
you can get on the node during periods of high memory usage with a
remsh:

$ erl -setcookie <cookie> -name tayven@localhost \
  -remsh couchdb@localhost -hidden

and at least monitor the output of:

  1> erlang:memory().
  2> ets:i().

and, if you add Fred's recon to your install,

  3> recon:proc_count(memory, 3).
  4> recon:proc_count(binary_memory, 3).

we'll know more.

We don't have a smoking gun yet, but hopefully with more data, we
can help you narrow in on one.

-Joan
