couchdb-dev mailing list archives

From "Simon Eisenmann (JIRA)" <j...@apache.org>
Subject [jira] [Reopened] (COUCHDB-536) CouchDB HTTP server stops accepting connections
Date Fri, 08 Jul 2011 07:43:16 GMT

     [ https://issues.apache.org/jira/browse/COUCHDB-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Eisenmann reopened COUCHDB-536:
-------------------------------------

    Skill Level: Committers Level (Medium to Hard)

All right, I got this issue again on one of the nodes in the cluster. The software is now CouchDB
1.1.0 with Erlang R14B02.

After a couple of hours of replicating from 3 other nodes, with constant changes on the local node,
it stops accepting HTTP connections (see error below).

I checked with netstat and saw lots of connections using the CouchDB port.
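
To make the netstat check repeatable, the connections can be tallied by TCP state. The following is a minimal Python sketch, assuming the usual six-column `netstat -tan` layout on Linux; the function name count_states and the port 5984 in the usage note are illustrative assumptions, since the report does not state which port is in use.

```python
from collections import Counter


def count_states(netstat_output: str, port: int) -> Counter:
    """Tally TCP states for sockets whose local address ends in :port.

    Assumes rows shaped like `netstat -tan` output:
    proto recv-q send-q local-address foreign-address state
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner and header lines, and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port):
            states[fields[5]] += 1
    return states
```

It could be fed the text of `netstat -tan` captured on the affected node, for example count_states(output, 5984). A steadily growing count in a single state such as CLOSE_WAIT would suggest sockets that are not being closed, though that is only one possible reading.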

It only happens on one node in the cluster, though. I will keep monitoring whether it happens every
day. I had a similar issue (replication hung at some point) but thought it was related
to stunnel, as there was no trace in the CouchDB log. Yesterday I switched to native CouchDB
SSL, and now there is this trace.

[Fri, 08 Jul 2011 04:06:22 GMT] [error] [<0.10266.14>] {error_report,<0.31.0>,
                                     {<0.10266.14>,std_error,
                                      [{application,mochiweb},
                                       "Accept failed error",
                                       "{error,enfile}"]}}
[Fri, 08 Jul 2011 04:06:22 GMT] [error] [<0.10266.14>] {error_report,<0.31.0>,
                           {<0.10266.14>,crash_report,
                            [[{initial_call,
                                  {mochiweb_acceptor,init,
                                      ['Argument__1','Argument__2',
                                       'Argument__3']}},
                              {pid,<0.10266.14>},
                              {registered_name,[]},
                              {error_info,
                                  {exit,
                                      {error,accept_failed},
                                      [{mochiweb_acceptor,init,3},
                                       {proc_lib,init_p_do_apply,3}]}},
                              {ancestors,
                                  [https,couch_secondary_services,
                                   couch_server_sup,<0.32.0>]},
                              {messages,[]},
                              {links,[<0.136.0>]},
                              {dictionary,[]},
                              {trap_exit,false},
                              {status,running},
                              {heap_size,233},
                              {stack_size,24},
                              {reductions,372}],
                             []]}}
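
The enfile atom in the trace appears to correspond to the POSIX ENFILE error (system-wide file table overflow), which is distinct from the per-process EMFILE. As a rough aid for telling the two apart on a Linux host, the Python sketch below prints both errno descriptions, the RLIMIT_NOFILE limits of the current process, and the system-wide counters from /proc/sys/fs/file-nr. The helper names parse_file_nr and describe are hypothetical, and the limits shown are those of the Python process itself, not of the CouchDB process.

```python
import errno
import os
import resource


def parse_file_nr(text: str) -> dict:
    """Parse /proc/sys/fs/file-nr: allocated, unused, and maximum file handles."""
    allocated, unused, maximum = (int(x) for x in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def describe(code: int) -> str:
    """Return the symbolic errno name followed by the platform's message."""
    return f"{errno.errorcode[code]}: {os.strerror(code)}"


if __name__ == "__main__":
    print(describe(errno.ENFILE))  # system-wide limit
    print(describe(errno.EMFILE))  # per-process limit

    # Limits of this Python process only (assumption: run as the same user
    # and from the same environment as the server to get comparable values).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"RLIMIT_NOFILE soft={soft} hard={hard}")

    try:
        with open("/proc/sys/fs/file-nr") as f:
            print(parse_file_nr(f.read()))
    except OSError as exc:
        # The file only exists on Linux.
        print(f"could not read /proc/sys/fs/file-nr: {exc}")
```

If the allocated count is close to the maximum when the error occurs, the system-wide limit is the likely bottleneck; otherwise the per-process limit of the Erlang VM may be worth checking.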


> CouchDB HTTP server stops accepting connections
> -----------------------------------------------
>
>                 Key: COUCHDB-536
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-536
>             Project: CouchDB
>          Issue Type: Bug
>          Components: HTTP Interface
>    Affects Versions: 0.10
>         Environment: Ubuntu Linux 8.04 32bit and 64bit with Erlang R13B01
>            Reporter: Simon Eisenmann
>            Priority: Critical
>
> Having 3 Couches all replicating a couple of databases to each other (pull replication
> with an update notification process), the HTTP service on any of the Couches stops working at
> some point (when running for a couple of hours with constant changes on all databases and servers).
> This is the error when a new HTTP request comes in:
> =ERROR REPORT==== 19-Oct-2009::10:18:55 ===
>     application: mochiweb
>     "Accept failed error"
>     "{error,enfile}"
> [error] [<0.21619.12>] {error_report,<0.24.0>,
>     {<0.21619.12>,crash_report,
>      [[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
>        {pid,<0.21619.12>},
>        {registered_name,[]},
>        {error_info,
>            {exit,
>                {error,accept_failed},
>                [{mochiweb_socket_server,acceptor_loop,1},
>                 {proc_lib,init_p_do_apply,3}]}},
>        {ancestors,
>            [couch_httpd,couch_secondary_services,couch_server_sup,<0.1.0>]},
>        {messages,[]},
>        {links,[<0.66.0>]},
>        {dictionary,[]},
>        {trap_exit,false},
>        {status,running},
>        {heap_size,233},
>        {stack_size,24},
>        {reductions,202}],
>       []]}}
> [error] [<0.66.0>] {error_report,<0.24.0>,
>     {<0.66.0>,std_error,
>      {mochiweb_socket_server,225,{acceptor_error,{error,accept_failed}}}}}
> To me this looks like it runs out of threads or sockets to handle the new connection,
> or something similar.
> I also see in this setup that if I put in lots of changes in a short time, at some point
> the replication process hangs (never finishes), and restarting the same replication
> is not possible and results in a timeout.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
