incubator-couchdb-user mailing list archives

From Hristo Deshev <hri...@deshev.com>
Subject CouchDB 1.1.1 mysteriously crashing under heavy load
Date Thu, 08 Dec 2011 21:41:02 GMT
Hi everyone,

I moved some data from an Amazon EC2 small instance to a large one and, in
the process, upgraded from CouchDB 1.1.0 to CouchDB 1.1.1. I also went with
Erlang R14B04 instead of R14B03 (hurray for commando updates!), and now my
CouchDB instance sometimes seems to die under heavy load. By "dying" I mean
that the beam process seems to stay in memory, but the HTTP server is gone
and no requests get served. For now I "fix" this by stopping and restarting
the process.
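
Since the symptom is a live beam process that no longer answers HTTP, a
process check alone would not catch it. Below is a minimal sketch of an
HTTP-level probe, assuming CouchDB listens on its default port 5984 on
localhost. The function name, the URL and the 5-second timeout are
illustrative assumptions, not part of my actual setup.

```python
import http.client
import json
import urllib.request


def couchdb_is_responsive(base_url="http://127.0.0.1:5984/",
                          timeout_s=5.0,
                          opener=urllib.request.urlopen):
    """Return True if GET base_url yields CouchDB's welcome JSON in time.

    `opener` is injectable so the check can be exercised without a server;
    by default it is urllib.request.urlopen.
    """
    try:
        with opener(base_url, timeout=timeout_s) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, socket timeout, truncated or non-JSON reply:
        # all of these count as "not serving requests".
        return False
    # The CouchDB root resource answers with an object containing "couchdb".
    return isinstance(body, dict) and "couchdb" in body
```

A cron job or supervisor script could call this every minute and restart
the service only after several consecutive failures, so that a single slow
reply under load does not trigger a restart.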

Here are some details on my setup. The server is running a 64-bit Ubuntu
Server (Oneiric) Amazon EC2 image on a large instance with 2 CPU cores and
7.5 GB RAM. I build both Erlang and CouchDB from source. I collect log
entries and bulk insert them in batches of up to 200 documents. I also run
couchdb-lucene on the same host, and I *think* most of the crashes happen
when couchdb-lucene is running a tough query and hogging the CPU or the
disk. I have some largish databases (~50 million documents, ~25 GB on
disk). I plan on splitting them into smaller ones, which I hope will get me
more responsive file access and faster full-text index searches. I think my
Lucene indexes may be getting too large for that machine's memory, so it
can't serve them well. I frequently get "OS process has timed out" errors
when trying to query those indexes. Anyway, that shouldn't be crashing the
core CouchDB process, right?
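
To make the write pattern concrete, here is a minimal sketch of how log
entries can be grouped into batches of at most 200 and serialized as
request bodies for CouchDB's `_bulk_docs` endpoint, which takes a JSON
object with a `docs` array. The helper names are hypothetical and this is
not my actual loader; the field names are the ones visible in the log
excerpt further down.

```python
import json
from itertools import islice


def batches(docs, max_size=200):
    """Yield consecutive lists of at most max_size documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, max_size))
        if not chunk:
            return
        yield chunk


def bulk_docs_body(chunk):
    """Serialize one batch as the JSON body for POST /<db>/_bulk_docs."""
    return json.dumps({"docs": chunk}).encode("utf-8")


# Illustrative entries shaped like the documents in the log excerpt.
entries = [
    {"host": "Host1", "time": 1323375464000 + i, "text": "Some text",
     "level": 0, "source": "source1", "type": "Entry1"}
    for i in range(450)
]
bodies = [bulk_docs_body(chunk) for chunk in batches(entries)]
```

Each body would then be sent with `Content-Type: application/json`. Capping
the batch size bounds how much work a single `update_docs` call hands to
the database updater process.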

I am pasting what I think is the relevant portion of the CouchDB log file
below, hoping somebody can decipher it. Am I correct in thinking that the
"** Reason for termination == ** {timeout," part means the process is
crashing because writing to or reading from a file timed out? Any help is
greatly appreciated.

Best,
Hristo

===============

[Thu, 08 Dec 2011 20:17:16 GMT] [error] [<0.78.0>] {error_report,<0.31.0>,
                       {<0.78.0>,supervisor_report,
                        [{supervisor,{local,couch_server_sup}},
                         {errorContext,child_terminated},
                         {reason,shutdown},
                         {offender,
                             [{pid,<0.86.0>},
                              {name,couch_secondary_services},
                              {mfargs,
                                  {couch_server_sup,start_secondary_services,
                                      []}},
                              {restart_type,permanent},
                              {shutdown,infinity},
                              {child_type,supervisor}]}]}}
[Thu, 08 Dec 2011 20:17:21 GMT] [error] [<0.407.0>] ** Generic server <0.407.0> terminating
** Last message in was delayed_commit
** When Server state == {db,<0.406.0>,<0.407.0>,nil,<<"1323371423957954">>,
                            <0.404.0>,<0.408.0>,
                            {db_header,5,204982,0,
                                {199491055,{204980,0}},
                                {199498140,204980},
                                {111685732,[]},
                                0,nil,nil,1000},
                            204982,
                            {btree,<0.404.0>,
                                {199513565,{205011,0}},
                                #Fun<couch_db_updater.10.19222179>,
                                #Fun<couch_db_updater.11.21515767>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.12.93888648>},
                            {btree,<0.404.0>,
                                {199518784,205011},
                                #Fun<couch_db_updater.13.40165027>,
                                #Fun<couch_db_updater.14.82810239>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.15.104121193>},
                            {btree,<0.404.0>,
                                {111685732,[]},
                                #Fun<couch_btree.0.23070627>,
                                #Fun<couch_btree.1.117278773>,
                                #Fun<couch_btree.2.112258129>,nil},
                            205013,
                            <<"database1">>,
                            "/data/couchdb/data/database1.couch",
                            [],[],nil,
                            {user_ctx,null,[],undefined},
                            #Ref<0.0.30.131014>,1000,
                            [before_header,after_header,on_file_open],
                            false}
** Reason for termination ==
** {timeout,
       {gen_server,call,
           [<0.406.0>,
            {db_updated,
                {db,<0.406.0>,<0.407.0>,nil,<<"1323371423957954">>,<0.404.0>,
                    <0.408.0>,
                    {db_header,5,205013,0,
                        {199513565,{205011,0}},
                        {199518784,205011},
                        {111685732,[]},
                        0,nil,nil,1000},
                    205013,
                    {btree,<0.404.0>,
                        {199513565,{205011,0}},
                        #Fun<couch_db_updater.10.19222179>,
                        #Fun<couch_db_updater.11.21515767>,
                        #Fun<couch_btree.5.112258129>,
                        #Fun<couch_db_updater.12.93888648>},
                    {btree,<0.404.0>,
                        {199518784,205011},
                        #Fun<couch_db_updater.13.40165027>,
                        #Fun<couch_db_updater.14.82810239>,
                        #Fun<couch_btree.5.112258129>,
                        #Fun<couch_db_updater.15.104121193>},
                    {btree,<0.404.0>,
                        {111685732,[]},
                        #Fun<couch_btree.0.23070627>,
                        #Fun<couch_btree.1.117278773>,
                        #Fun<couch_btree.2.112258129>,nil},
                    205013,
                    <<"database1">>,
                    "/data/couchdb/data/database1.couch",
                    [],[],nil,
                    {user_ctx,null,[],undefined},
                    nil,1000,
                    [before_header,after_header,on_file_open],
                    false}}]}}

[Thu, 08 Dec 2011 20:17:21 GMT] [error] [<0.407.0>] {error_report,<0.31.0>,
                     {<0.407.0>,crash_report,
                      [[{initial_call,{couch_db_updater,init,['Argument__1']}},
                        {pid,<0.407.0>},
                        {registered_name,[]},
                        {error_info,
                         {exit,
                          {timeout,
                           {gen_server,call,
                            [<0.406.0>,
                             {db_updated,
                              {db,<0.406.0>,<0.407.0>,nil,
                               <<"1323371423957954">>,<0.404.0>,<0.408.0>,
                               {db_header,5,205013,0,
                                {199513565,{205011,0}},
                                {199518784,205011},
                                {111685732,[]},
                                0,nil,nil,1000},
                               205013,
                               {btree,<0.404.0>,
                                {199513565,{205011,0}},
                                #Fun<couch_db_updater.10.19222179>,
                                #Fun<couch_db_updater.11.21515767>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.12.93888648>},
                               {btree,<0.404.0>,
                                {199518784,205011},
                                #Fun<couch_db_updater.13.40165027>,
                                #Fun<couch_db_updater.14.82810239>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.15.104121193>},
                               {btree,<0.404.0>,
                                {111685732,[]},
                                #Fun<couch_btree.0.23070627>,
                                #Fun<couch_btree.1.117278773>,
                                #Fun<couch_btree.2.112258129>,nil},
                               205013,
                               <<"database1">>,
                               "/data/couchdb/data/database1.couch",
                               [],[],nil,
                               {user_ctx,null,[],undefined},
                               nil,1000,
                               [before_header,after_header,on_file_open],
                               false}}]}},
                          [{gen_server,terminate,6},
                           {proc_lib,init_p_do_apply,3}]}},
                        {ancestors,[<0.406.0>,<0.403.0>]},
                        {messages,[{'EXIT',<0.406.0>,shutdown}]},
                        {links,[]},
                        {dictionary,[]},
                        {trap_exit,true},
                        {status,running},
                        {heap_size,28657},
                        {stack_size,24},
                        {reductions,4487709}],
                       []]}}
[Thu, 08 Dec 2011 20:17:22 GMT] [error] [<0.178.0>] ** Generic server <0.178.0> terminating
** Last message in was {update_docs,<0.2027.0>,
                           [[{doc,<<"55e776b94547442ab17b82bd1a059843">>,
                                 {1,
                                  [<<102,77,172,235,192,72,84,223,58,68,105,
                                     199,153,147,196,81>>]},
                                 {[{<<"host">>,<<"Host1">>},
                                   {<<"time">>,1323375464000},
                                   {<<"text">>,
                                    <<"Some text">>},
                                   {<<"level">>,0},
                                   {<<"source">>,<<"source1">>},
                                   {<<"type">>,<<"Entry1">>}]},
                                 [],false,[]}],

...
[[A BUNCH OF DOCS HERE]]
...


                                 {[{<<"host">>,<<"Host1">>},
                                   {<<"time">>,1323375467000},
                                   {<<"text">>,
                                    <<"Some text">>},
                                   {<<"level">>,0},
                                   {<<"source">>,<<"source1">>},
                                   {<<"type">>,<<"Entry1">>}]},
                                 [],false,[]}]],
                           [],false,false}
** When Server state == {db,<0.177.0>,<0.178.0>,nil,<<"1323371411352029">>,
                            <0.175.0>,<0.179.0>,
                            {db_header,5,13636863,0,
                                {6776455960,{13636861,0}},
                                {6776479023,13636861},
                                {1039786,[]},
                                0,nil,nil,1000},
                            13636863,
                            {btree,<0.175.0>,
                                {6776455960,{13636861,0}},
                                #Fun<couch_db_updater.10.19222179>,
                                #Fun<couch_db_updater.11.21515767>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.12.93888648>},
                            {btree,<0.175.0>,
                                {6776479023,13636861},
                                #Fun<couch_db_updater.13.40165027>,
                                #Fun<couch_db_updater.14.82810239>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.15.104121193>},
                            {btree,<0.175.0>,
                                {1039786,[]},
                                #Fun<couch_btree.0.23070627>,
                                #Fun<couch_btree.1.117278773>,
                                #Fun<couch_btree.2.112258129>,nil},
                            13636863,
                            <<"database2">>,
                            "/data/couchdb/data/database2.couch",
                            [],[],nil,
                            {user_ctx,null,[],undefined},
                            nil,1000,
                            [before_header,after_header,on_file_open],
                            false}
** Reason for termination ==
** {timeout,
       {gen_server,call,
           [<0.177.0>,
            {db_updated,
                {db,<0.177.0>,<0.178.0>,nil,<<"1323371411352029">>,<0.175.0>,
                    <0.179.0>,
                    {db_header,5,13636863,0,
                        {6776455960,{13636861,0}},
                        {6776479023,13636861},
                        {1039786,[]},
                        0,nil,nil,1000},
                    13636863,
                    {btree,<0.175.0>,
                        {6776557909,{13637061,0}},
                        #Fun<couch_db_updater.10.19222179>,
                        #Fun<couch_db_updater.11.21515767>,
                        #Fun<couch_btree.5.112258129>,
                        #Fun<couch_db_updater.12.93888648>},
                    {btree,<0.175.0>,
                        {6776580448,13637061},
                        #Fun<couch_db_updater.13.40165027>,
                        #Fun<couch_db_updater.14.82810239>,
                        #Fun<couch_btree.5.112258129>,
                        #Fun<couch_db_updater.15.104121193>},
                    {btree,<0.175.0>,
                        {1039786,[]},
                        #Fun<couch_btree.0.23070627>,
                        #Fun<couch_btree.1.117278773>,
                        #Fun<couch_btree.2.112258129>,nil},
                    13637063,
                    <<"database2">>,
                    "/data/couchdb/data/database2.couch",
                    [],[],nil,
                    {user_ctx,null,[],undefined},
                    #Ref<0.0.30.133811>,1000,
                    [before_header,after_header,on_file_open],
                    false}}]}}

[Thu, 08 Dec 2011 20:17:22 GMT] [error] [<0.178.0>] {error_report,<0.31.0>,
                     {<0.178.0>,crash_report,
                      [[{initial_call,{couch_db_updater,init,['Argument__1']}},
                        {pid,<0.178.0>},
                        {registered_name,[]},
                        {error_info,
                         {exit,
                          {timeout,
                           {gen_server,call,
                            [<0.177.0>,
                             {db_updated,
                              {db,<0.177.0>,<0.178.0>,nil,
                               <<"1323371411352029">>,<0.175.0>,<0.179.0>,
                               {db_header,5,13636863,0,
                                {6776455960,{13636861,0}},
                                {6776479023,13636861},
                                {1039786,[]},
                                0,nil,nil,1000},
                               13636863,
                               {btree,<0.175.0>,
                                {6776557909,{13637061,0}},
                                #Fun<couch_db_updater.10.19222179>,
                                #Fun<couch_db_updater.11.21515767>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.12.93888648>},
                               {btree,<0.175.0>,
                                {6776580448,13637061},
                                #Fun<couch_db_updater.13.40165027>,
                                #Fun<couch_db_updater.14.82810239>,
                                #Fun<couch_btree.5.112258129>,
                                #Fun<couch_db_updater.15.104121193>},
                               {btree,<0.175.0>,
                                {1039786,[]},
                                #Fun<couch_btree.0.23070627>,
                                #Fun<couch_btree.1.117278773>,
                                #Fun<couch_btree.2.112258129>,nil},
                               13637063,
                               <<"database2">>,
                               "/data/couchdb/data/database2.couch",
                               [],[],nil,
                               {user_ctx,null,[],undefined},
                               #Ref<0.0.30.133811>,1000,
                               [before_header,after_header,on_file_open],
                               false}}]}},
                          [{gen_server,terminate,6},
                           {proc_lib,init_p_do_apply,3}]}},
                        {ancestors,[<0.177.0>,<0.174.0>]},
                        {messages,
                         [{'EXIT',<0.177.0>,shutdown},delayed_commit]},
                        {links,[]},
                        {dictionary,[]},
                        {trap_exit,true},
                        {status,running},
                        {heap_size,121393},
                        {stack_size,24},
                        {reductions,83311172}],
                       []]}}
