incubator-couchdb-dev mailing list archives

From "Robert Newson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1212) Newly created user accounts cannot sign-in after _user database crashes
Date Sun, 03 Jul 2011 18:41:22 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059255#comment-13059255 ]

Robert Newson commented on COUCHDB-1212:
----------------------------------------

Jan,

Don't worry about it. I don't know how much the CouchDB source code included in CouchBase
Server differs from stock, though I believe the difference is very small (perhaps nonexistent).

As to the proposed patch, Filipe is suggesting that your system slows down enough during
compaction (it must read 4.5 GB of data, not always sequentially, in addition to writing
all live data to a new file) that you hit timeouts in places where we never anticipated them.
A gen_server call defaults to a 5-second timeout, so it seems plausible that your GET to _users
took longer than that. We have a history of extending the timeout on any gen_server:call to
infinity if it has to consult the disk, as we cannot predict how slowly a disk will respond.
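
To make the timeout behaviour above concrete, here is a minimal sketch in Python rather than Erlang. It is only an analogy and not CouchDB code: the names `call`, `slow_disk_server` and `CallTimeout` are hypothetical, and the only fact borrowed from Erlang is the 5-second default of gen_server:call. A caller that gives up after a fixed timeout fails whenever the server's "disk" work takes longer, while a caller that waits indefinitely (the equivalent of passing infinity) still gets its reply.

```python
import queue
import threading
import time

# Mirrors the 5000 ms default of Erlang's gen_server:call/2.
DEFAULT_TIMEOUT = 5.0


class CallTimeout(Exception):
    """Raised when the server does not reply within the caller's timeout."""


def call(server_inbox, request, timeout=DEFAULT_TIMEOUT):
    """Send a request and block until the reply arrives.

    timeout=None waits indefinitely, like passing `infinity` in Erlang.
    """
    reply_box = queue.Queue(maxsize=1)
    server_inbox.put((request, reply_box))
    try:
        return reply_box.get(timeout=timeout)
    except queue.Empty:
        # The server is still busy; the caller gives up, as in the crash log.
        raise CallTimeout(request) from None


def slow_disk_server(inbox, delay):
    """Hypothetical server loop: each request costs `delay` seconds of disk time."""
    while True:
        request, reply_box = inbox.get()
        if request == "stop":
            reply_box.put("stopped")
            return
        time.sleep(delay)  # stands in for disk I/O slowed down by compaction
        reply_box.put(("ok", request))


if __name__ == "__main__":
    inbox = queue.Queue()
    threading.Thread(target=slow_disk_server, args=(inbox, 0.3), daemon=True).start()

    try:
        call(inbox, "open_ref_count", timeout=0.05)  # shorter than the disk delay
    except CallTimeout as exc:
        print("timed out waiting for:", exc)

    print(call(inbox, "open_ref_count", timeout=None))  # waits as long as needed
    call(inbox, "stop", timeout=None)
```

The delays are scaled down so the demonstration finishes quickly. In the quoted crash report the same pattern is visible: the outer call to couch_server was made with infinity, but the inner open_ref_count call that it depended on timed out.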

I'm confident that the bug you report is real and present in core CouchDB (and therefore in
CouchBase Server, which has a small or zero delta from CouchDB). It may or may not be present
in other products like BigCouch which have a larger delta, which is why it's important to
file tickets appropriately. Anyone using CouchDB in their product will monitor this issue
tracker in addition to monitoring the repository itself for fixes.

 

> Newly created user accounts cannot sign-in after _user database crashes 
> ------------------------------------------------------------------------
>
>                 Key: COUCHDB-1212
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1212
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core, HTTP Interface
>    Affects Versions: 1.0.2
>         Environment: Ubuntu 10.10, Erlang R14B02 (erts-5.8.3)
>            Reporter: Jan van den Berg
>            Priority: Critical
>              Labels: _users, authentication
>         Attachments: couchdb-1212.patch
>
>
> We have one (4.5 GB) couch database and we use the (default) _users database to store
> user accounts for a website. Once a week we need to restart couchdb because newly signed-up
> user accounts cannot log in any more. They get an HTTP status code 401 from the _session HTTP
> interface. We update and compact the database three times a day.
> This is the stack trace I see in the couch database log prior to when these issues occur.
> ----------- couch.log ---------------
> [Wed, 29 Jun 2011 22:02:46 GMT] [info] [<0.117.0>] Starting compaction for db "fbm"
> [Wed, 29 Jun 2011 22:02:46 GMT] [info] [<0.5753.79>] 127.0.0.1 - - 'POST' /fbm/_compact 202
> [Wed, 29 Jun 2011 22:02:46 GMT] [info] [<0.5770.79>] 127.0.0.1 - - 'POST' /fbm/_view_cleanup 202
> [Wed, 29 Jun 2011 22:10:19 GMT] [info] [<0.5773.79>] 86.9.246.184 - - 'GET' /_session 200
> [Wed, 29 Jun 2011 22:24:39 GMT] [info] [<0.6236.79>] 85.28.105.161 - - 'GET' /_session 200
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.84.0>] ** Generic server couch_server terminating
> ** Last message in was {open,<<"fbm">>,
>                              [{user_ctx,{user_ctx,null,[],undefined}}]}
> ** When Server state == {server,"/opt/couchbase-server/var/lib/couchdb",
>                             {re_pattern,0,0,
>                                 <<69,82,67,80,116,0,0,0,16,0,0,0,1,0,0,0,0,0,
>                                   0,0,0,0,0,0,40,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>                                   0,93,0,72,25,77,0,0,0,0,0,0,0,0,0,0,0,0,254,
>                                   255,255,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>                                   77,0,0,0,0,16,171,255,3,0,0,0,128,254,255,
>                                   255,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,69,26,
>                                   84,0,72,0>>},
>                             100,2,"Sat, 18 Jun 2011 14:00:44 GMT"}
> ** Reason for termination == 
> ** {timeout,{gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.84.0>] {error_report,<0.31.0>,
>     {<0.84.0>,crash_report,
>      [[{initial_call,{couch_server,init,['Argument__1']}},
>        {pid,<0.84.0>},
>        {registered_name,couch_server},
>        {error_info,
>            {exit,
>                {timeout,
>                    {gen_server,call,
>                        [<0.116.0>,{open_ref_count,<0.10417.79>}]}},
>                [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,[couch_primary_services,couch_server_sup,<0.32.0>]},
>        {messages,[]},
>        {links,[<0.91.0>,<0.483.0>,<0.116.0>,<0.79.0>]},
>        {dictionary,[]},
>        {trap_exit,true},
>        {status,running},
>        {heap_size,6765},
>        {stack_size,24},
>        {reductions,206710598}],
>       []]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.79.0>] {error_report,<0.31.0>,
>     {<0.79.0>,supervisor_report,
>      [{supervisor,{local,couch_primary_services}},
>       {errorContext,child_terminated},
>       {reason,
>           {timeout,
>               {gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}},
>       {offender,
>           [{pid,<0.84.0>},
>            {name,couch_server},
>            {mfargs,{couch_server,sup_start_link,[]}},
>            {restart_type,permanent},
>            {shutdown,1000},
>            {child_type,worker}]}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.91.0>] ** Generic server <0.91.0> terminating
> ** Last message in was {'EXIT',<0.84.0>,
>                            {timeout,
>                                {gen_server,call,
>                                    [<0.116.0>,
>                                     {open_ref_count,<0.10417.79>}]}}}
> ** When Server state == {db,<0.91.0>,<0.92.0>,nil,<<"1308405644393791">>,
>                             <0.90.0>,<0.94.0>,
>                             {db_header,5,91,0,
>                                 {378285,{30,9}},
>                                 {380466,39},
>                                 nil,0,nil,nil,1000},
>                             91,
>                             {btree,<0.90.0>,
>                                 {378285,{30,9}},
>                                 #Fun<couch_db_updater.7.10053969>,
>                                 #Fun<couch_db_updater.8.35220795>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.9.107593676>},
>                             {btree,<0.90.0>,
>                                 {380466,39},
>                                 #Fun<couch_db_updater.10.30996817>,
>                                 #Fun<couch_db_updater.11.96515267>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.12.117826253>},
>                             {btree,<0.90.0>,nil,#Fun<couch_btree.0.83553141>,
>                                 #Fun<couch_btree.1.30790806>,
>                                 #Fun<couch_btree.2.124754102>,nil},
>                             91,<<"_users">>,
>                             "/opt/couchbase-server/var/lib/couchdb/_users.couch",
>                             [#Fun<couch_doc.7.50754398>],
>                             [],nil,
>                             {user_ctx,null,[],undefined},
>                             nil,1000,
>                             [before_header,after_header,on_file_open],
>                             true}
> ** Reason for termination == 
> ** {timeout,{gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.91.0>] {error_report,<0.31.0>,
>     {<0.91.0>,crash_report,
>      [[{initial_call,{couch_db,init,['Argument__1']}},
>        {pid,<0.91.0>},
>        {registered_name,[]},
>        {error_info,
>            {exit,
>                {timeout,
>                    {gen_server,call,
>                        [<0.116.0>,{open_ref_count,<0.10417.79>}]}},
>                [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,[<0.89.0>]},
>        {messages,[]},
>        {links,[]},
>        {dictionary,[]},
>        {trap_exit,true},
>        {status,running},
>        {heap_size,610},
>        {stack_size,24},
>        {reductions,8797798}],
>       []]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [info] [<0.300.0>] Shutting down view group server, monitored db is closing.
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.10417.79>] Uncaught error in HTTP request: {exit,
>                                  {{timeout,
>                                    {gen_server,call,
>                                     [<0.116.0>,
>                                      {open_ref_count,<0.10417.79>}]}},
>                                   {gen_server,call,
>                                    [couch_server,
>                                     {open,<<"fbm">>,
>                                      [{user_ctx,
>                                        {user_ctx,null,[],undefined}}]},
>                                     infinity]}}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.483.0>] ** Generic server <0.483.0> terminating
> ** Last message in was {'EXIT',<0.84.0>,
>                            {timeout,
>                                {gen_server,call,
>                                    [<0.116.0>,
>                                     {open_ref_count,<0.10417.79>}]}}}
> ** When Server state == {db,<0.483.0>,<0.484.0>,nil,<<"1308405937993370">>,
>                             <0.4643.19>,<0.4645.19>,
>                             {db_header,5,890453,0,
>                                 {3279126950,{752003,0}},
>                                 {3279118313,752003},
>                                 {3279132318,[]},
>                                 0,nil,3279127184,1000},
>                             890453,
>                             {btree,<0.4643.19>,
>                                 {3279126950,{752003,0}},
>                                 #Fun<couch_db_updater.7.10053969>,
>                                 #Fun<couch_db_updater.8.35220795>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.9.107593676>},
>                             {btree,<0.4643.19>,
>                                 {3279118313,752003},
>                                 #Fun<couch_db_updater.10.30996817>,
>                                 #Fun<couch_db_updater.11.96515267>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.12.117826253>},
>                             {btree,<0.4643.19>,
>                                 {3279132318,[]},
>                                 #Fun<couch_btree.0.83553141>,
>                                 #Fun<couch_btree.1.30790806>,
>                                 #Fun<couch_btree.2.124754102>,nil},
>                             890453,<<"fbm_full">>,
>                             "/opt/couchbase-server/var/lib/couchdb/fbm_full.couch",
>                             [#Fun<couch_doc.7.50754398>],
>                             [{<<"admins">>,
>                               {[{<<"names">>,[]},
>                                 {<<"roles">>,[<<"import">>]}]}},
>                              {<<"readers">>,
>                               {[{<<"names">>,[]},{<<"roles">>,[]}]}}],
>                             3279127184,
>                             {user_ctx,null,[],undefined},
>                             nil,1000,
>                             [before_header,after_header,on_file_open],
>                             false}
> ** Reason for termination == 
> ** {timeout,{gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.483.0>] {error_report,<0.31.0>,
>     {<0.483.0>,crash_report,
>      [[{initial_call,{couch_db,init,['Argument__1']}},
>        {pid,<0.483.0>},
>        {registered_name,[]},
>        {error_info,
>            {exit,
>                {timeout,
>                    {gen_server,call,
>                        [<0.116.0>,{open_ref_count,<0.10417.79>}]}},
>                [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,[<0.480.0>]},
>        {messages,[]},
>        {links,[]},
>        {dictionary,[]},
>        {trap_exit,true},
>        {status,running},
>        {heap_size,6765},
>        {stack_size,24},
>        {reductions,1389}],
>       []]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [info] [<0.2984.19>] Shutting down view group server, monitored db is closing.
> [Wed, 29 Jun 2011 22:25:06 GMT] [info] [<0.10417.79>] Stacktrace: [{gen_server,call,3},
>              {couch_server,open,2},
>              {couch_db,open,2},
>              {couch_httpd_db,do_db_req,2},
>              {couch_httpd,handle_request_int,5},
>              {mochiweb_http,headers,5},
>              {proc_lib,init_p_do_apply,3}]
> ------- end --------
> Here's the log file of me signing in as an admin, creating a new user, and trying to
> sign in as the newly created user.
> ------ couch.log -------
> [Fri, 01 Jul 2011 18:37:16 GMT] [info] [<0.20439.91>] 93.92.103.118 - - 'POST' /_session 200
> [Fri, 01 Jul 2011 18:37:16 GMT] [info] [<0.20457.91>] checkpointing view update at seq 91 for _users _design/_auth
> [Fri, 01 Jul 2011 18:37:16 GMT] [info] [<0.20439.91>] 93.92.103.118 - - 'GET' /_users/_design/_auth/_list/secure/users 200
> [Fri, 01 Jul 2011 18:38:35 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'PUT' /_users/org.couchdb.user:example@mail.com 201
> [Fri, 01 Jul 2011 18:38:35 GMT] [info] [<0.20457.91>] checkpointing view update at seq 92 for _users _design/_auth
> [Fri, 01 Jul 2011 18:38:35 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'GET' /_users/_design/_auth/_list/secure/users 200
> [Fri, 01 Jul 2011 18:38:47 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'GET' /_users/_design/_auth/_list/secure/users?key=%22org.couchdb.user:example@mail.com%22 200
> [Fri, 01 Jul 2011 18:38:47 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'PUT' /_users/org.couchdb.user:example@mail.com 201
> [Fri, 01 Jul 2011 18:39:00 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 'GET' /_session 200
> [Fri, 01 Jul 2011 18:39:01 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 'GET' /fbm/_design/api/_list/secure/competitions 200
> [Fri, 01 Jul 2011 18:39:12 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 'POST' /_session 401
> [Fri, 01 Jul 2011 18:39:22 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 'POST' /_session 401
> ------- end ---------
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
