incubator-couchdb-dev mailing list archives

From "Sander Dijkhuis (JIRA)" <j...@apache.org>
Subject [jira] [Created] (COUCHDB-1757) CouchDB 1.3.0rc3 crashes when _replicator contains a lot of docs
Date Tue, 02 Apr 2013 18:01:19 GMT
Sander Dijkhuis created COUCHDB-1757:
----------------------------------------

             Summary: CouchDB 1.3.0rc3 crashes when _replicator contains a lot of docs
                 Key: COUCHDB-1757
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1757
             Project: CouchDB
          Issue Type: Bug
          Components: Database Core
            Reporter: Sander Dijkhuis


I’m deploying an experimental game based on CouchDB with one user per database. For access
control, I’m using several _replicator docs per user:
- one filtered replication from the shared db to the user db,
- one unfiltered replication from the user db to the shared db,
- two replications using doc_ids per ‘friendship’ (to share both profiles).
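
As a rough sketch of the setup listed above, the `_replicator` docs could be built like this. The `source`, `target`, `continuous`, `filter`, `query_params` and `doc_ids` fields are standard `_replicator` fields, and the `lunacy:to:USERNAME` id and `lunacy/user/USERNAME` db name follow the log excerpt further down. Everything else is an assumption for illustration: the filter name, the `query_params` payload, the `from` and `friend` doc ids, the profile doc id, and the choice to replicate profiles directly between user dbs.

```python
import json

SHARED_DB = "lunacy"  # shared db name as it appears in the log excerpt


def user_db(username):
    # User db naming follows the `lunacy/user/USERNAME` pattern from the log.
    return f"{SHARED_DB}/user/{username}"


def user_replications(username, filter_name="server/for_user"):
    """Two docs per user: filtered shared -> user, unfiltered user -> shared.

    `filter_name`, `query_params` and the `from` id are hypothetical.
    """
    return [
        {
            "_id": f"{SHARED_DB}:to:{username}",
            "source": SHARED_DB,
            "target": user_db(username),
            "filter": filter_name,
            "query_params": {"user": username},
            "continuous": True,
        },
        {
            "_id": f"{SHARED_DB}:from:{username}",
            "source": user_db(username),
            "target": SHARED_DB,
            "continuous": True,
        },
    ]


def friendship_replications(user_a, user_b):
    """Two doc_ids replications per friendship, one in each direction."""

    def one_way(src, dst):
        return {
            "_id": f"friend:{src}:to:{dst}",  # hypothetical id scheme
            "source": user_db(src),           # assumed: user db to user db
            "target": user_db(dst),
            "doc_ids": [f"profile:{src}"],    # hypothetical profile doc id
            "continuous": True,
        }

    return [one_way(user_a, user_b), one_way(user_b, user_a)]


if __name__ == "__main__":
    docs = user_replications("USERNAME") + friendship_replications("A", "B")
    print(json.dumps(docs, indent=2))
```

Each user adds two docs and each friendship adds two more, so the number of continuous replications grows with both the user count and the friendship count.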

At the moment, this results in 420 continuous replications running. CouchDB 1.3.0rc3 on Ubuntu
crashes a couple of seconds after starting; it does not crash when I temporarily remove
the _replicator database. With 1.3.0rc1, CouchDB would crash after a few minutes to
a few hours.

Some details from the crash report are below. I have filtered the output for privacy, to avoid
repetition, and to hide the _design doc that’s shown in the log. Let me know if you need more
detail or if I should share one of the _design functions used.

Am I abusing the replication system, or can I change a setting to allow for longer timeouts?

--

First, I get something like this for each _replicator doc:
{code}
[info] [<0.5368.0>] Replication `"5529b4bdb9c5bdc15b558bd7588511d9+continuous"` is using:
	4 worker processes
	a worker batch size of 500
	20 HTTP connections
	a connection timeout of 30000 milliseconds
	10 retries per request
	socket options are: [{keepalive,true},{nodelay,false}]
	source start sequence 6908
[info] [<0.5368.0>] Document `lunacy:to:USERNAME` triggered replication `5529b4bdb9c5bdc15b558bd7588511d9+continuous`
[info] [<0.1213.0>] starting new replication `5529b4bdb9c5bdc15b558bd7588511d9+continuous` at <0.5368.0> (`lunacy` -> `lunacy/user/USERNAME`)
{code}
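
To put the per-replication figures from the log above next to the 420 replications, here is a small back-of-the-envelope calculation. It assumes every replication uses the same logged values (4 worker processes, 20 HTTP connections) and treats the results as upper bounds, not measured usage.

```python
REPLICATIONS = 420                      # from the report
WORKERS_PER_REPLICATION = 4             # "4 worker processes" in the log
HTTP_CONNECTIONS_PER_REPLICATION = 20   # "20 HTTP connections" in the log


def upper_bounds(replications, workers, connections):
    """Return worst-case totals if every replication uses its full allowance."""
    return {
        "worker_processes": replications * workers,
        "http_connections": replications * connections,
    }


if __name__ == "__main__":
    totals = upper_bounds(
        REPLICATIONS, WORKERS_PER_REPLICATION, HTTP_CONNECTIONS_PER_REPLICATION
    )
    print(totals)  # {'worker_processes': 1680, 'http_connections': 8400}
```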
Then:
{code}
[error] [<0.5408.0>] OS Process died with status: 137
[error] [<0.5408.0>] ** Generic server <0.5408.0> terminating 
** Last message in was {#Port<0.2740>,{exit_status,137}}
** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
                                 #Port<0.2740>,
                                 #Fun<couch_os_process.2.132569728>,
                                 #Fun<couch_os_process.3.35601548>,5000}
** Reason for termination == 
** {exit_status,137}
{code}
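
For reading the `exit_status` above: on Unix-like systems a status above 128 conventionally means the process was terminated by signal `status - 128`. The helper below is a minimal sketch under that assumption (the function name is made up for illustration, and the signal names come from the platform's `signal` module); it says nothing about who sent the signal.

```python
import signal


def describe_exit_status(status):
    """Decode an exit status using the 128 + signal-number convention."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


if __name__ == "__main__":
    print(describe_exit_status(137))
```

On Linux, 137 decodes to signal 9 (SIGKILL), which suggests the couchjs processes were killed from outside and did not exit on their own.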
Followed by:
{code}
=ERROR REPORT==== 2-Apr-2013::19:18:20 ===
** Generic server <0.5408.0> terminating 
** Last message in was {#Port<0.2740>,{exit_status,137}}
** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
                                 #Port<0.2740>,
                                 #Fun<couch_os_process.2.132569728>,
                                 #Fun<couch_os_process.3.35601548>,5000}
** Reason for termination == 
** {exit_status,137}
[error] [<0.5408.0>] {error_report,<0.31.0>,
                         {<0.5408.0>,crash_report,
                          [[{initial_call,
                                {couch_os_process,init,['Argument__1']}},
                            {pid,<0.5408.0>},
                            {registered_name,[]},
                            {error_info,
                                {exit,
                                    {exit_status,137},
                                    [{gen_server,terminate,6},
                                     {proc_lib,init_p_do_apply,3}]}},
                            {ancestors,
                                [couch_query_servers,couch_secondary_services,
                                 couch_server_sup,<0.32.0>]},
                            {messages,[]},
                            {links,[<0.111.0>,<0.5339.0>]},
                            {dictionary,[]},
                            {trap_exit,false},
                            {status,running},
                            {heap_size,1597},
                            {stack_size,24},
                            {reductions,1197}],
                           [{neighbour,
                                [{pid,<0.5345.0>},
                                 {registered_name,[]},
                                 {initial_call,
                                     {couch_event_sup,init,['Argument__1']}},
                                 {current_function,{gen_server,loop,6}},
                                 {ancestors,[<0.5339.0>]},
                                 {messages,[]},
                                 {links,[<0.5339.0>,<0.89.0>]},
                                 {dictionary,[]},
                                 {trap_exit,false},
                                 {status,waiting},
                                 {heap_size,987},
                                 {stack_size,9},
                                 {reductions,32}]},
                            {neighbour,
                                [{pid,<0.5339.0>},
                                 {registered_name,[]},
                                 {initial_call,{erlang,apply,2}},
                                 {current_function,{gen,do_call,4}},
                                 {ancestors,[]},
                                 {messages,[]},
                                 {links,[<0.5345.0>,<0.5408.0>,<0.5335.0>]},
                                 {dictionary,[]},
                                 {trap_exit,false},
                                 {status,waiting},
                                 {heap_size,6765},
                                 {stack_size,104},
                                 {reductions,1988}]}]]}}

=CRASH REPORT==== 2-Apr-2013::19:18:21 ===
  crasher:
    initial call: couch_os_process:init/1
    pid: <0.5408.0>
    registered_name: []
    exception exit: {exit_status,137}
      in function  gen_server:terminate/6
    ancestors: [couch_query_servers,couch_secondary_services,
                  couch_server_sup,<0.32.0>]
    messages: []
    links: [<0.111.0>,<0.5339.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1597
    stack_size: 24
    reductions: 1197
  neighbours:
    neighbour: [{pid,<0.5345.0>},
                  {registered_name,[]},
                  {initial_call,{couch_event_sup,init,['Argument__1']}},
                  {current_function,{gen_server,loop,6}},
                  {ancestors,[<0.5339.0>]},
                  {messages,[]},
                  {links,[<0.5339.0>,<0.89.0>]},
                  {dictionary,[]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,987},
                  {stack_size,9},
                  {reductions,32}]
    neighbour: [{pid,<0.5339.0>},
                  {registered_name,[]},
                  {initial_call,{erlang,apply,2}},
                  {current_function,{gen,do_call,4}},
                  {ancestors,[]},
                  {messages,[]},
                  {links,[<0.5345.0>,<0.5408.0>,<0.5335.0>]},
                  {dictionary,[]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,6765},
                  {stack_size,104},
                  {reductions,1988}]
[error] [<0.5335.0>] ChangesReader process died with reason: {exit_status,137}
[error] [<0.111.0>] OS Process Error <0.5412.0> :: {os_process_error,
                                                    "OS process timed out."}
[error] [<0.5387.0>] OS Process died with status: 137
[error] [<0.5385.0>] OS Process died with status: 137
[error] [<0.5335.0>] Replication `f7ecf7f435811899c912619f899f24b4+continuous` (`lunacy` -> `lunacy/user/USERNAME`) failed: changes_reader_died
[error] [<0.5258.0>] ChangesReader process died with reason: shutdown
[error] [<0.5387.0>] ** Generic server <0.5387.0> terminating 
** Last message in was {#Port<0.2730>,{exit_status,137}}
** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
                                 #Port<0.2730>,
                                 #Fun<couch_os_process.2.132569728>,
                                 #Fun<couch_os_process.3.35601548>,5000}
** Reason for termination == 
** {exit_status,137}

=ERROR REPORT==== 2-Apr-2013::19:18:21 ===
** Generic server <0.5387.0> terminating 
** Last message in was {#Port<0.2730>,{exit_status,137}}
** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
                                 #Port<0.2730>,
                                 #Fun<couch_os_process.2.132569728>,
                                 #Fun<couch_os_process.3.35601548>,5000}
** Reason for termination == 
** {exit_status,137}
[error] [<0.5385.0>] ** Generic server <0.5385.0> terminating 
** Last message in was {#Port<0.2729>,{exit_status,137}}
** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
                                 #Port<0.2729>,
                                 #Fun<couch_os_process.2.132569728>,
                                 #Fun<couch_os_process.3.35601548>,5000}
** Reason for termination == 
** {exit_status,137}

=ERROR REPORT==== 2-Apr-2013::19:18:21 ===
** Generic server <0.5385.0> terminating 
** Last message in was {#Port<0.2729>,{exit_status,137}}
** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
                                 #Port<0.2729>,
                                 #Fun<couch_os_process.2.132569728>,
                                 #Fun<couch_os_process.3.35601548>,5000}
** Reason for termination == 
** {exit_status,137}
[error] [<0.5385.0>] {error_report,<0.31.0>,
                         {<0.5385.0>,crash_report,
                          [[{initial_call,
                                {couch_os_process,init,['Argument__1']}},
                            {pid,<0.5385.0>},
                            {registered_name,[]},
                            {error_info,
                                {exit,
                                    {exit_status,137},
                                    [{gen_server,terminate,6},
                                     {proc_lib,init_p_do_apply,3}]}},
                            {ancestors,
                                [couch_query_servers,couch_secondary_services,
                                 couch_server_sup,<0.32.0>]},
                            {messages,[]},
                            {links,[<0.111.0>,<0.5207.0>]},
                            {dictionary,[]},
                            {trap_exit,false},
                            {status,running},
                            {heap_size,1597},
                            {stack_size,24},
                            {reductions,1205}],
                           [{neighbour,
                                [{pid,<0.5213.0>},
                                 {registered_name,[]},
                                 {initial_call,
                                     {couch_event_sup,init,['Argument__1']}},
                                 {current_function,{gen_server,loop,6}},
                                 {ancestors,[<0.5207.0>]},
                                 {messages,[]},
                                 {links,[<0.5207.0>,<0.89.0>]},
                                 {dictionary,[]},
                                 {trap_exit,false},
                                 {status,waiting},
                                 {heap_size,987},
                                 {stack_size,9},
                                 {reductions,32}]},
                            {neighbour,
                                [{pid,<0.5207.0>},
                                 {registered_name,[]},
                                 {initial_call,{erlang,apply,2}},
                                 {current_function,{gen,do_call,4}},
                                 {ancestors,[]},
                                 {messages,[]},
                                 {links,[<0.5213.0>,<0.5385.0>,<0.5203.0>]},
                                 {dictionary,[]},
                                 {trap_exit,false},
                                 {status,waiting},
                                 {heap_size,6765},
                                 {stack_size,104},
                                 {reductions,1988}]}]]}}

=CRASH REPORT==== 2-Apr-2013::19:18:22 ===
  crasher:
    initial call: couch_os_process:init/1
    pid: <0.5385.0>
    registered_name: []
    exception exit: {exit_status,137}
      in function  gen_server:terminate/6
    ancestors: [couch_query_servers,couch_secondary_services,
                  couch_server_sup,<0.32.0>]
    messages: []
    links: [<0.111.0>,<0.5207.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1597
    stack_size: 24
    reductions: 1205
  neighbours:
    neighbour: [{pid,<0.5213.0>},
                  {registered_name,[]},
                  {initial_call,{couch_event_sup,init,['Argument__1']}},
                  {current_function,{gen_server,loop,6}},
                  {ancestors,[<0.5207.0>]},
                  {messages,[]},
                  {links,[<0.5207.0>,<0.89.0>]},
                  {dictionary,[]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,987},
                  {stack_size,9},
                  {reductions,32}]
    neighbour: [{pid,<0.5207.0>},
                  {registered_name,[]},
                  {initial_call,{erlang,apply,2}},
                  {current_function,{gen,do_call,4}},
                  {ancestors,[]},
                  {messages,[]},
                  {links,[<0.5213.0>,<0.5385.0>,<0.5203.0>]},
                  {dictionary,[]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,6765},
                  {stack_size,104},
                  {reductions,1988}]
[error] [<0.5387.0>] {error_report,<0.31.0>,
                         {<0.5387.0>,crash_report,
                          [[{initial_call,
                                {couch_os_process,init,['Argument__1']}},
                            {pid,<0.5387.0>},
                            {registered_name,[]},
                            {error_info,
                                {exit,
                                    {exit_status,137},
                                    [{gen_server,terminate,6},
                                     {proc_lib,init_p_do_apply,3}]}},
                            {ancestors,
                                [couch_query_servers,couch_secondary_services,
                                 couch_server_sup,<0.32.0>]},
                            {messages,[]},
                            {links,[<0.111.0>,<0.5218.0>]},
                            {dictionary,[]},
                            {trap_exit,false},
                            {status,running},
                            {heap_size,1597},
                            {stack_size,24},
                            {reductions,1205}],
                           [{neighbour,
                                [{pid,<0.5224.0>},
                                 {registered_name,[]},
                                 {initial_call,
                                     {couch_event_sup,init,['Argument__1']}},
                                 {current_function,{gen_server,loop,6}},
                                 {ancestors,[<0.5218.0>]},
                                 {messages,[]},
                                 {links,[<0.5218.0>,<0.89.0>]},
                                 {dictionary,[]},
                                 {trap_exit,false},
                                 {status,waiting},
                                 {heap_size,987},
                                 {stack_size,9},
                                 {reductions,32}]},
                            {neighbour,
                                [{pid,<0.5218.0>},
                                 {registered_name,[]},
                                 {initial_call,{erlang,apply,2}},
                                 {current_function,{gen,do_call,4}},
                                 {ancestors,[]},
                                 {messages,[]},
                                 {links,[<0.5224.0>,<0.5387.0>,<0.5214.0>]},
                                 {dictionary,[]},
                                 {trap_exit,false},
                                 {status,waiting},
                                 {heap_size,6765},
                                 {stack_size,104},
                                 {reductions,1947}]}]]}}

=CRASH REPORT==== 2-Apr-2013::19:18:24 ===
  crasher:
    initial call: couch_os_process:init/1
    pid: <0.5387.0>
    registered_name: []
    exception exit: {exit_status,137}
      in function  gen_server:terminate/6
    ancestors: [couch_query_servers,couch_secondary_services,
                  couch_server_sup,<0.32.0>]
    messages: []
    links: [<0.111.0>,<0.5218.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1597
    stack_size: 24
    reductions: 1205
  neighbours:
    neighbour: [{pid,<0.5224.0>},
                  {registered_name,[]},
                  {initial_call,{couch_event_sup,init,['Argument__1']}},
                  {current_function,{gen_server,loop,6}},
                  {ancestors,[<0.5218.0>]},
                  {messages,[]},
                  {links,[<0.5218.0>,<0.89.0>]},
                  {dictionary,[]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,987},
                  {stack_size,9},
                  {reductions,32}]
    neighbour: [{pid,<0.5218.0>},
                  {registered_name,[]},
                  {initial_call,{erlang,apply,2}},
                  {current_function,{gen,do_call,4}},
                  {ancestors,[]},
                  {messages,[]},
                  {links,[<0.5224.0>,<0.5387.0>,<0.5214.0>]},
                  {dictionary,[]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,6765},
                  {stack_size,104},
                  {reductions,1947}]
[error] [<0.5302.0>] ChangesReader process died with reason: shutdown
[error] [<0.5192.0>] ChangesReader process died with reason: shutdown
[error] [<0.5203.0>] ChangesReader process died with reason: {exit_status,137}
[error] [<0.5214.0>] ChangesReader process died with reason: {exit_status,137}
[error] [<0.3692.0>] ChangesReader process died with reason: shutdown
[error] [<0.5258.0>] Replication `3d6539a2a9e3201a6eacd0b7db4c7dd3+continuous` (`lunacy` -> `lunacy/user/USERNAME`) failed: changes_reader_died
[error] [<0.5170.0>] ChangesReader process died with reason: shutdown
[error] [<0.5236.0>] ChangesReader process died with reason: shutdown
[error] [<0.5280.0>] ChangesReader process died with reason: shutdown
[error] [<0.5225.0>] ChangesReader process died with reason: shutdown
[error] [<0.5324.0>] ChangesReader process died with reason: shutdown
[error] [<0.5291.0>] ChangesReader process died with reason: shutdown
[error] [<0.5313.0>] ChangesReader process died with reason: shutdown
[error] [<0.5181.0>] ChangesReader process died with reason: shutdown
[error] [<0.5269.0>] ChangesReader process died with reason: shutdown
[error] [<0.111.0>] ** Generic server couch_query_servers terminating 
** Last message in was {get_proc,{doc,<<"_design/server">>,
                                      {31,
                                       [<<2,129,73,127,145,177,85,156,51,70,79,
                                          122,210,226,20,220>>, (ET CETERA)


                                      [],false,[]},
                                 {<<"_design/server">>,
                                  <<"31-0281497f91b1559c33464f7ad2e214dc">>}}
** When Server state == {qserver,32811,41005,45102,36908,[],
                                 {[{<<"reduce_limit">>,true},
                                   {<<"timeout">>,5000}]}}
** Reason for termination == 
** {bad_return_value,{os_process_error,"OS process timed out."}}
{code}
And finally:
{code}
                          {'$gen_call',
                           {<0.3696.0>,#Ref<0.0.0.31225>},
                           {unlink_proc,<0.3714.0>}},
                          {'$gen_call',
                           {<0.5174.0>,#Ref<0.0.0.31231>},
                           {unlink_proc,<0.5379.0>}},
                          {'$gen_call',
                           {<0.5185.0>,#Ref<0.0.0.31237>},
                           {unlink_proc,<0.5381.0>}},
                          {'$gen_call',
                           {<0.5196.0>,#Ref<0.0.0.31243>},
                           {unlink_proc,<0.5383.0>}},
                          {'$gen_call',
                           {<0.5207.0>,#Ref<0.0.0.31249>},
                           {unlink_proc,<0.5385.0>}},
                          {'$gen_call',
                           {<0.5218.0>,#Ref<0.0.0.31255>},
                           {unlink_proc,<0.5387.0>}},
                          {'$gen_call',
                           {<0.5229.0>,#Ref<0.0.0.31261>},
                           {unlink_proc,<0.5389.0>}},
                          {'$gen_call',
                           {<0.5240.0>,#Ref<0.0.0.31267>},
                           {unlink_proc,<0.5391.0>}},
                          {'$gen_call',
                           {<0.5262.0>,#Ref<0.0.0.31273>},
                           {unlink_proc,<0.5393.0>}},
                          {'$gen_call',
                           {<0.5273.0>,#Ref<0.0.0.31299>},
                           {unlink_proc,<0.5395.0>}},
                          {'$gen_call',
                           {<0.5284.0>,#Ref<0.0.0.31305>},
                           {unlink_proc,<0.5398.0>}},
                          {'$gen_call',
                           {<0.5295.0>,#Ref<0.0.0.31311>},
                           {unlink_proc,<0.5400.0>}},
                          {'$gen_call',
                           {<0.5306.0>,#Ref<0.0.0.31317>},
                           {unlink_proc,<0.5402.0>}},
                          {'$gen_call',
                           {<0.5317.0>,#Ref<0.0.0.31323>},
                           {unlink_proc,<0.5404.0>}},
                          {'$gen_call',
                           {<0.5328.0>,#Ref<0.0.0.31329>},
                           {unlink_proc,<0.5406.0>}},
                          {'$gen_call',
                           {<0.5339.0>,#Ref<0.0.0.31359>},
                           {unlink_proc,<0.5408.0>}},
                          {'EXIT',<0.5408.0>,{exit_status,137}},
                          {'DOWN',#Ref<0.0.0.31331>,process,<0.5408.0>,
                           {exit_status,137}},
                          {'EXIT',<0.5412.0>,normal},
                          {'DOWN',#Ref<0.0.0.31360>,process,<0.5412.0>,normal},
                          {'DOWN',#Ref<0.0.0.31269>,process,<0.5393.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.21467>,process,<0.3714.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31313>,process,<0.5402.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31239>,process,<0.5383.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31245>,process,<0.5385.0>,
                           {exit_status,137}},
                          {'EXIT',<0.5387.0>,{exit_status,137}},
                          {'DOWN',#Ref<0.0.0.31251>,process,<0.5387.0>,
                           {exit_status,137}},
                          {'DOWN',#Ref<0.0.0.31227>,process,<0.5379.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31263>,process,<0.5391.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31257>,process,<0.5389.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31301>,process,<0.5398.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31325>,process,<0.5406.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31319>,process,<0.5404.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31307>,process,<0.5400.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31233>,process,<0.5381.0>,
                           shutdown},
                          {'DOWN',#Ref<0.0.0.31275>,process,<0.5395.0>,
                           shutdown}]},
                        {links,[<0.94.0>]},
                        {dictionary,[]},
                        {trap_exit,true},
                        {status,running},
                        {heap_size,17711},
                        {stack_size,24},
                        {reductions,7801}],
                       []]}}
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
