incubator-couchdb-user mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: Help understanding crash log
Date Thu, 01 May 2014 18:31:32 GMT
Sure, here are a few rules of thumb:

* 1 process per inbound TCP connection
* 4 processes per open DB (up to [couchdb] max_dbs_open DBs will be kept open simultaneously)
* 3 processes per open view group (I might be off by one or two here)

More Erlang processes require more RAM, so don't go crazy.
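
A minimal sketch of turning these rules of thumb into a number, assuming a POSIX shell. The `estimate_procs` helper, the sample workload, and the 2x headroom factor are illustrative assumptions, not anything CouchDB or Erlang provides:

```shell
#!/bin/sh
# Rough estimate of the Erlang processes a CouchDB node needs, using the
# rules of thumb above: 1 per inbound TCP connection, 4 per open DB,
# 3 per open view group.
# estimate_procs is a hypothetical helper, not part of CouchDB.
estimate_procs() {
    connections=$1
    open_dbs=$2
    view_groups=$3
    echo $(( connections + 4 * open_dbs + 3 * view_groups ))
}

# Hypothetical workload: 1000 connections, 5000 open DBs, 10000 view groups.
needed=$(estimate_procs 1000 5000 10000)
echo "estimated processes: $needed"

# The 2x headroom factor is an assumption, not a documented recommendation.
echo "candidate +P value: $(( needed * 2 ))"
```

Treat the result as a lower bound when picking a value for +P; the per-DB and per-view-group counts are approximate, as noted above.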

Adam

On May 1, 2014, at 12:24 PM, Herman Chan <hermanccw@gmail.com> wrote:

> Thanks Adam,
> 
> We just tried that and it seems to hold up.  Just wondering if there is some kind of formula on what to set ERL_FLAGS to?
> 
> Herman
> On 2014-05-01, at 10:51 AM, Adam Kocoloski <kocolosk@apache.org> wrote:
> 
>> Hi Herman, I think those are just the view groups shutting down after the parent DB crashed because you ran out of processes.
>> 
>> You can increase the maximum number of processes via the ERL_FLAGS environment variable, e.g.
>> 
>>> $ ERL_FLAGS="+P 512000" erl
>>> Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]
>>> 
>>> Eshell V5.8.2  (abort with ^G)
>>> 1> erlang:system_info(process_limit).
>>> 512000
>> 
>> The default is 256k; assuming you've got enough RAM, you can bump that up to 1M with impunity. Regards,
>> 
>> Adam
>> 
>> On May 1, 2014, at 10:43 AM, Herman Chan <hermanccw@gmail.com> wrote:
>> 
>>> We do have 1000+ connections to the db, which we are trying to dial down.  However, even with fewer connections we hit the crash again; this time I was able to get a better log.  You are right that we are hitting some limit.
>>> 
>>> Before the crash, the log shows that couch is still trying to open indexes after a reboot that we did.  Once it crashes, the log starts printing "Index shutdown by monitor".  Is there any limit parameter that we can increase?
>>> 
>>> [Thu, 01 May 2014 14:28:04 GMT] [error] [emulator] Too many processes
>>> [Thu, 01 May 2014 14:28:04 GMT] [error] [emulator] Error in process <0.3672.477> with exit value: {system_limit,[{erlang,spawn_opt,[proc_lib,init_p,[<0.3672.477>,[],gen,init_it,[
>>> gen_server,<0.3672.477>,<0.3672.477>,couch_db,{<<42 bytes>>,"/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch",<0.21556.480>,[{user_ctx,{user_ctx,null,
>>> [<<6 bytes>>],undefined... 
>>> 
>>> 
>>> [Thu, 01 May 2014 14:28:04 GMT] [error] [<0.21556.480>] ** Generic server <0.21556.480> terminating
>>> ** Last message in was {'EXIT',<0.3672.477>,
>>>                      {system_limit,
>>>                       [{erlang,spawn_opt,
>>>                         [proc_lib,init_p,
>>>                          [<0.3672.477>,[],gen,init_it,
>>>                           [gen_server,<0.3672.477>,<0.3672.477>,couch_db,
>>>                            {<<"group_370c0635-e593-45ed-ac96-75e6b318cb35">>,
>>>                             "/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch",
>>>                             <0.21556.480>,
>>>                             [{user_ctx,
>>>                               {user_ctx,null,[<<"_admin">>],undefined}}]},
>>>                            []]],
>>>                          [link]]},
>>>                        {proc_lib,start_link,5},
>>>                        {couch_db,start_link,3},
>>>                        {couch_server,'-open_async/5-fun-0-',4}]}}
>>> ** When Server state == {file,
>>>                          {file_descriptor,prim_file,
>>>                              {#Port<0.898531>,307709}},
>>>                          1261681}
>>> ** Reason for termination == 
>>> ** {system_limit,
>>>     [{erlang,spawn_opt,
>>>          [proc_lib,init_p,
>>>           [<0.3672.477>,[],gen,init_it,
>>>            [gen_server,<0.3672.477>,<0.3672.477>,couch_db,
>>>             {<<"group_370c0635-e593-45ed-ac96-75e6b318cb35">>,
>>>              "/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch",
>>>              <0.21556.480>,
>>>              [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}]},
>>>             []]],
>>>           [link]]},
>>>      {proc_lib,start_link,5},
>>>      {couch_db,start_link,3},
>>>      {couch_server,'-open_async/5-fun-0-',4}]}
>>> 
>>> [Thu, 01 May 2014 14:28:04 GMT] [error] [<0.21556.480>] {error_report,<0.31.0>,
>>>                       {<0.21556.480>,crash_report,
>>>                        [[{initial_call,{couch_file,init,['Argument__1']}},
>>>                          {pid,<0.21556.480>},
>>>                          {registered_name,[]},
>>>                          {error_info,
>>>                           {exit,
>>>                            {system_limit,
>>>                             [{erlang,spawn_opt,
>>>                               [proc_lib,init_p,
>>>                                [<0.3672.477>,[],gen,init_it,
>>>                                 [gen_server,<0.3672.477>,<0.3672.477>,
>>>                                  couch_db,
>>>                                  {<<"group_370c0635-e593-45ed-ac96-75e6b318cb35">>,
>>>                                   "/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch",
>>>                                   <0.21556.480>,
>>>                                   [{user_ctx,
>>>                                     {user_ctx,null,
>>>                                      [<<"_admin">>],
>>>                                      undefined}}]},
>>>                                  []]],
>>>                                [link]]},
>>>                              {proc_lib,start_link,5},
>>>                              {couch_db,start_link,3},
>>>                              {couch_server,'-open_async/5-fun-0-',4}]},
>>>                            [{gen_server,terminate,6},
>>>                             {proc_lib,init_p_do_apply,3}]}},
>>>                          {ancestors,[<0.3672.477>]},
>>>                          {messages,[]},
>>>                          {links,[]},
>>>                          {dictionary,[]},
>>>                          {trap_exit,true},
>>>                          {status,running},
>>>                          {heap_size,610},
>>>                          {stack_size,24},
>>>                          {reductions,973}],
>>>                         []]}}
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20971.87>] Index shutdown by monitor notice for db: group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4883.35>] Index shutdown by monitor notice for db: group_15ccf331-257d-4b54-b457-997d342816b9 idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4892.35>] Index shutdown by monitor notice for db: group_15ccf331-257d-4b54-b457-997d342816b9 idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12040.33>] Index shutdown by monitor notice for db: group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20971.87>] Closing index for db: group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: _design/filters sig: "3e823c2a4383ac0c18d4e574135a5b08"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12032.33>] Index shutdown by monitor notice for db: group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4892.35>] Closing index for db: group_15ccf331-257d-4b54-b457-997d342816b9 idx: _design/filters sig: "3e823c2a4383ac0c18d4e574135a5b08"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4292.4>] Index shutdown by monitor notice for db: group_ae50933f-de22-4879-9624-b760106060b3 idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4285.4>] Index shutdown by monitor notice for db: group_ae50933f-de22-4879-9624-b760106060b3 idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20956.87>] Index shutdown by monitor notice for db: group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4883.35>] Closing index for db: group_15ccf331-257d-4b54-b457-997d342816b9 idx: _design/hub sig: "4f6edcabc4b7a6357b714e1391ed93ac"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12040.33>] Closing index for db: group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: _design/filters sig: "3e823c2a4383ac0c18d4e574135a5b08"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18850.44>] Index shutdown by monitor notice for db: group_721d99a3-2257-48d0-8a1e-89294874d06e idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18842.44>] Index shutdown by monitor notice for db: group_721d99a3-2257-48d0-8a1e-89294874d06e idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12032.33>] Closing index for db: group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: _design/hub sig: "4f6edcabc4b7a6357b714e1391ed93ac"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.27768.43>] Index shutdown by monitor notice for db: group_56a0df90-c79e-4863-ae71-2bde3cb0d801 idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.27775.43>] Index shutdown by monitor notice for db: group_56a0df90-c79e-4863-ae71-2bde3cb0d801 idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4292.4>] Closing index for db: group_ae50933f-de22-4879-9624-b760106060b3 idx: _design/filters sig: "3e823c2a4383ac0c18d4e574135a5b08"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4285.4>] Closing index for db: group_ae50933f-de22-4879-9624-b760106060b3 idx: _design/hub sig: "4f6edcabc4b7a6357b714e1391ed93ac"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.6010.43>] Index shutdown by monitor notice for db: group_7f082ae6-f41d-4a14-a836-2360303b2e9a idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.6003.43>] Index shutdown by monitor notice for db: group_7f082ae6-f41d-4a14-a836-2360303b2e9a idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20956.87>] Closing index for db: group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: _design/hub sig: "4f6edcabc4b7a6357b714e1391ed93ac"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.5933.42>] Index shutdown by monitor notice for db: group_8c49d7e8-b61e-41e5-a220-11df59b9cce4 idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.5940.42>] Index shutdown by monitor notice for db: group_8c49d7e8-b61e-41e5-a220-11df59b9cce4 idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18842.44>] Closing index for db: group_721d99a3-2257-48d0-8a1e-89294874d06e idx: _design/hub sig: "4f6edcabc4b7a6357b714e1391ed93ac"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.17529.33>] Index shutdown by monitor notice for db: group_98ff493c-63e8-4714-9940-ccea514d4b1d idx: _design/hub
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18850.44>] Closing index for db: group_721d99a3-2257-48d0-8a1e-89294874d06e idx: _design/filters sig: "3e823c2a4383ac0c18d4e574135a5b08"
>>> reason: normal
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.17536.33>] Index shutdown by monitor notice for db: group_98ff493c-63e8-4714-9940-ccea514d4b1d idx: _design/filters
>>> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.27768.43>] Closing index for db: group_56a0df90-c79e-4863-ae71-2bde3cb0d801 idx: _design/hub sig: "4f6edcabc4b7a6357b714e1391ed93ac"
>>> 
>>> On 2014-05-01, at 9:18 AM, Adam Kocoloski <kocolosk@apache.org> wrote:
>>> 
>>>> On May 1, 2014, at 8:47 AM, Interactive Blueprints <p.van.der.eems@interactiveblueprints.nl> wrote:
>>>> 
>>>>> 2014-05-01 13:14 GMT+02:00 Herman Chan <hermanccw@gmail.com>:
>>>>>> Thanks Adam,
>>>>>> 
>>>>>> It seems like it is happening again, with more info this time.  It looks like I am hitting some sort of system limit, can anyone point out where to look next?
>>>>> 
>>>>> Just guessing here..
>>>>> It could be that you hit the max open file limit of your system.
>>>>> With "ulimit -a" you can see the limits on your system.
>>>>> Usually the max open file limit is somewhere around 1024.
>>>>> I noticed that couchdb loves to have a lot of files open simultaneously.
>>>>> 
>>>>> In the same shell you start couchdb from, right before you start couchdb,
>>>>> you can do a "ulimit -n 4096" (or another large value); this should
>>>>> give couchdb the ability to open more files.
>>>>> 
>>>>> Hope this helps.
>>>>> 
>>>>> Pieter van der Eems
>>>>> Interactive Blueprints
>>>> 
>>>> That's a good thought Pieter, though typically in that case you'll see an 'emfile' error in the logs. This particular system_limit error (with {erlang, spawn_link, ...} following it) occurs when the Erlang VM has reached the maximum number of processes it's allowed to spawn. Judging from the *long* list of processes linked to couch_httpd in this stacktrace I'd say Herman's client is improperly leaving connections open. Herman, did you intend to have 1000s of open TCP connections on this server? Regards,
>>>> 
>>>> Adam
>>> 
>> 
> 

