couchdb-dev mailing list archives

From "Dave Cottlehuber (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1346) CouchDB hangs during start of view indexing
Date Fri, 07 Dec 2012 01:11:23 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526062#comment-13526062 ]

Dave Cottlehuber commented on COUCHDB-1346:
-------------------------------------------

I've done a fair few runs now, and every time we are blocked on the return from fputc. Internally,
the NT API appears unable to flush data to the pipe, which is why we end up blocking.
I'm not sure how to dig into this any further.
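
As a general illustration of how a write to a pipe ends up blocking, here is a minimal
Python sketch. It is a POSIX analogue only, not the NT API and not couchjs code: it fills
a pipe that nobody reads, using non-blocking writes so that it can report the point at
which a blocking write would have stalled. The name pipe_capacity and the 1024-byte chunk
size are assumptions made for the example.

```python
import os

def pipe_capacity(chunk=1024):
    """Fill an unread pipe with non-blocking writes; return the bytes accepted."""
    r, w = os.pipe()
    os.set_blocking(w, False)        # with a blocking fd, the final write would hang
    total = 0
    try:
        while True:
            total += os.write(w, b"x" * chunk)
    except BlockingIOError:
        pass                         # pipe is full because nothing drains the read end
    finally:
        os.close(r)
        os.close(w)
    return total

capacity = pipe_capacity()
print("pipe accepted", capacity, "bytes before a write would block")
```

The number printed depends on the operating system. The point is only that a writer
stalls once the reader stops draining the pipe, whatever the stdio buffer size is set to.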

Setup for these runs:
couch_config:set("query_server_config", "os_process_limit", "1").
plus this patch: https://www.friendpaste.com/5vB0pIKtXXRM0eij296ibX which sends
\r\n, logs all port traffic as LOG_INFO, and reduces the buffer size.

Reporting back:
- I switched the buffer size to 1024: no difference. Note that 4095 was already set in 1.2.0: https://github.com/apache/couchdb/commit/2a2f488b002f379f23d9ec9f64ed4170178b7227
- Ditto for reducing it all the way down to 64. This time I could get the Erlang shell running too;
the port and gen_server wrapper info follows:

https://www.friendpaste.com/2KfwQCQ8kFhqY3ni1hqVf2

<lots_of_data_above>

[info] [<0.4581.0>] OS Process #Port<0.103763> Output :: ["chunks",["\u000a<li>Key: 9 Value: 9 LineNo: 10</li>"]]
[info] [<0.4581.0>] OS Process #Port<0.103763> Input  :: ["list_end"]
[info] [<0.4581.0>] OS Process #Port<0.103763> Output :: ["end",["</ul><p>FirstKey: 1 LastKey: 9</p>"]]
[info] [<0.4581.0>] OS Process #Port<0.103763> Input  :: ["reset",{"reduce_limit":true,"timeout":5000}]
[info] [<0.4581.0>] OS Process #Port<0.103763> Output :: true
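
With this much port traffic in the log, a small script can help find the last exchange
before the hang. The Python sketch below parses lines of the shape shown above into
(pid, port, direction, payload) tuples. The regular expression and the names LINE, SAMPLE
and parse are assumptions made for the example; the sample lines are copied from the log
above.

```python
import json
import re

# Matches e.g.: [info] [<0.4581.0>] OS Process #Port<0.103763> Input  :: [...]
LINE = re.compile(
    r"^\[info\] \[(<[\d.]+>)\] OS Process (#Port<[\d.]+>) (Input|Output)\s+:: (.*)$"
)

SAMPLE = """\
[info] [<0.4581.0>] OS Process #Port<0.103763> Input  :: ["list_end"]
[info] [<0.4581.0>] OS Process #Port<0.103763> Input  :: ["reset",{"reduce_limit":true,"timeout":5000}]
[info] [<0.4581.0>] OS Process #Port<0.103763> Output :: true"""

def parse(log_text):
    """Return (pid, port, direction, decoded JSON payload) for each matching line."""
    events = []
    for line in log_text.splitlines():
        m = LINE.match(line)
        if m:
            pid, port, direction, payload = m.groups()
            events.append((pid, port, direction, json.loads(payload)))
    return events

events = parse(SAMPLE)
for event in events:
    print(event)
```

Lines that were hard-wrapped by the mail client would need to be rejoined before parsing,
since each payload has to be complete JSON.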

1> 
=INFO REPORT==== 7-Dec-2012::01:35:38 ===
    alarm_handler: {clear,system_memory_high_watermark}

1> erlang:process_info(pid(0,4581,0)).
[{current_function,{gen_server,loop,6}},
 {initial_call,{proc_lib,init_p,5}},
 {status,waiting},
 {message_queue_len,0},
 {messages,[]},
 {links,[#Port<0.103763>,<0.4548.0>]},
 {dictionary,[{'$ancestors',[couch_query_servers,
                             couch_secondary_services,couch_server_sup,<0.4441.0>]},
              {'$initial_call',{couch_os_process,init,1}}]},
 {trap_exit,false},
 {error_handler,error_handler},
 {priority,normal},
 {group_leader,<0.4440.0>},
 {total_heap_size,17711},
 {heap_size,6765},
 {stack_size,9},
 {reductions,160580},
 {garbage_collection,[{min_bin_vheap_size,46368},
                      {min_heap_size,233},
                      {fullsweep_after,65535},
                      {minor_gcs,73}]},
 {suspending,[]}]
2> 
=INFO REPORT==== 7-Dec-2012::01:36:38 ===
    alarm_handler: {set,{system_memory_high_watermark,[]}}

### couchjs.c  ### 

same place again:

>~*k
Callstack for Thread 1 (Thread Id: 716 (0x2cc)):
 Index  Function
--------------------------------------------------------------------------------
 1      ntdll.dll!_NtReadFile@36()
 2      KernelBase.dll!_ReadFile@20()
 3      msvcr100.dll!__read_nolock()
 4      msvcr100.dll!__read()
 5      msvcr100.dll!__filbuf()
 6      msvcr100.dll!_getc()
*7      couchjs.exe!couch_readline(JSContext * cx=0x00b482c0, _iobuf * fp=0x73a33008)
 8      couchjs.exe!readline(JSContext * cx=0x00b482c0, unsigned int argc=0, unsigned __int64 * vp=0x02690118)
 9      mozjs185-1.0.dll!6fa5f09d()
 10     [Frames below may be incorrect and/or missing, no symbols loaded for mozjs185-1.0.dll]
 11     mozjs185-1.0.dll!JS_CompareValues() + 7967 bytes
 12     mozjs185-1.0.dll!JS_GetScopeChain() + 5110 bytes
 13     mozjs185-1.0.dll!JS_ExecuteScript() + 34 bytes
 14     couchjs.exe!main(int argc=2, const char * * argv=0x02681948)
 15     couchjs.exe!__tmainCRTStartup()
 16     kernel32.dll!@BaseThreadInitThunk@12()
 17     ntdll.dll!___RtlUserThreadStart@8()
 18     ntdll.dll!__RtlUserThreadStart@8()



The Erlang Port is still up & running:

2> erlang:ports().
[#Port<0.98509>,#Port<0.98518>,#Port<0.103763>,
 #Port<0.103903>,#Port<0.103922>,#Port<0.103924>,
 #Port<0.103926>,#Port<0.100936>,#Port<0.102133>,
 #Port<0.102135>,#Port<0.100182>,#Port<0.101219>,
 #Port<0.98172>,#Port<0.98217>,#Port<0.101331>,
 #Port<0.102391>]
3> [ _,_, Port | _] = erlang:ports(). 
[#Port<0.98509>,#Port<0.98518>,#Port<0.103763>,
 #Port<0.103903>,#Port<0.103922>,#Port<0.103924>,
 #Port<0.103926>,#Port<0.100936>,#Port<0.102133>,
 #Port<0.102135>,#Port<0.100182>,#Port<0.101219>,
 #Port<0.98172>,#Port<0.98217>,#Port<0.101331>,
 #Port<0.102391>]
4> Port.
#Port<0.103763>
5> erlang:port
port_call/2      port_call/3      port_close/1     port_command/2   
port_command/3   port_connect/2   port_control/3   port_get_data/1  
port_info/1      port_info/2      port_set_data/2  port_to_list/1   
ports/0          
5> erlang:port_info(Port).
[{name,"c:/werl/OTP_SR~1/release/win32/lib/couch-1.3.0a-e64bbec-git/priv/couchspawnkillable ./couchjs.exe ../share/couchdb/server/main.js"},
 {links,[<0.4581.0>]},
 {id,103763},
 {connected,<0.4581.0>},
 {input,72311},
 {output,77607},
 {os_pid,2364}]
6> 

The data still in the buffer is 4106 bytes:

y;\n          send('\\n<li>Key: '+row.key\n          +' Value: '+row.value\n       
  +' LineNo: '+row_number+'</li>');\n        }\n        return '</ul><p>FirstKey:
'+ firstKey + ' LastKey: '+ prevKey+'</p>';\n      })","acceptSwitch":"(function (head,
req) {\n        // respondWith takes care of setting the proper headers\n        provides(\"html\",
function() {\n          send(\"HTML <ul>\");\n\n          var row, num = 0;\n      
   while (row = getRow()) {\n            num ++;\n            send('\\n<li>Key: '\n
             +row.key+' Value: '+row.value\n              +' LineNo: '+num+'</li>');\n
         }\n\n          // tail\n          return '</ul>';\n        });\n\n        provides(\"xml\",
function() {\n          send('<feed xmlns=\"http://www.w3.org/2005/Atom\">'\n      
     +'<title>Test XML Feed</title>');\n\n          while (row = getRow()) {\n
           var entry = new XML('<entry/>');\n            entry.id = row.id;\n      
     entry.title = row.key;\n            entry.content = row.value;\n            send(entry);\n
         }\n          return \"</feed>\";\n        });\n      })","qsParams":"(function
(head, req) {\n        return toJSON(req.query) + \"\\n\";\n      })","stopIter":"(function
(req) {\n        send(\"head\");\n        var row, row_number = 0;\n        while(row = getRow())
{\n          if(row_number > 2) break;\n          send(\" \" + row_number);\n         
row_number += 1;\n        };\n        return \" tail\";\n      })","stopIter2":"(function
(head, req) {\n        provides(\"html\", function() {\n          send(\"head\");\n      
   var row, row_number = 0;\n          while(row = getRow()) {\n            if(row_number
> 2) break;\n            send(\" \" + row_number);\n            row_number += 1;\n    
     };\n          return \" tail\";\n        });\n      })","tooManyGetRows":"(function ()
{\n        send(\"head\");\n        var row;\n        while(row = getRow()) {\n          send(row.key);\n
       };\n        getRow();\n        getRow();\n        getRow();\n        row = getRow();\n
       return \"after row: \"+toJSON(row);\n      })","emptyList":"(function () {\n      
 return \" \";\n      })","rowError":"(function (head, req) {\n        send(\"head\");\n 
      var row = getRow();\n        send(fooBarBam); // intentional error\n        return \"tail\";\n
     })","docReference":"(function (head, req) {\n        send(\"head\");\n        var row
= getRow();\n        send(row.doc.integer);\n        return \"tail\";\n      })","secObj":"(function
(head, req) {\n        return toJSON(req.secObj);\n      })","setHeaderAfterGotRow":"(function
(head, req) {\n        getRow();\ðñÿÿ


The last few characters look odd; in hex, the beginning of the next 4k buffer is: C3 B0 C3 B1
C3 BF C3 BF 01 0A.
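
For reference, those ten bytes can be decoded with a few lines of Python. This is only a
sketch of one possible reading: the bytes are valid UTF-8 for the characters shown at the
end of the dump, and re-encoding those characters as Latin-1 gives F0 F1 FF FF 01 0A,
which may be what was actually in memory if something on the way to the paste applied a
UTF-8 encoding step. That last part is an assumption, not something the dump proves.

```python
# The ten bytes reported at the beginning of the next 4k buffer.
raw = bytes.fromhex("C3 B0 C3 B1 C3 BF C3 BF 01 0A")

# Decoded as UTF-8 they give the four odd characters plus 0x01 and a newline.
text = raw.decode("utf-8")

# Assumption: the original data was raw bytes that got UTF-8 encoded on the way out.
latin1 = text.encode("latin-1")

print(len(raw), repr(text), latin1.hex(" "))
# 4106 bytes in the buffer minus one 4096-byte block leaves this many bytes:
print(4106 - 4096)
```

The length also lines up: 4106 bytes in total is one 4096-byte block plus exactly these
ten bytes.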


### tomorrow ###

Try to strip the couch* code out of this and build just a minimal Erlang + port wrapper around
couchjs.
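
As a rough idea of what such a stripped-down harness looks like, here is a Python sketch
rather than the Erlang port wrapper itself. It spawns a child process, sends one JSON
command per line, and reads one reply line per command. STAND_IN is a hypothetical child
that answers "true" to everything so the sketch runs anywhere; the couchjs command in the
comment is taken from the port_info output above and would replace it for a real test.
The name run_commands is likewise an assumption.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for couchjs: replies "true" to every input line.
# For a real run, substitute the command seen in port_info, e.g.
# ["./couchjs.exe", "../share/couchdb/server/main.js"].
STAND_IN = [sys.executable, "-u", "-c",
            "import sys\nfor line in sys.stdin:\n    print('true', flush=True)"]

def run_commands(argv, commands):
    """Send each command as one JSON line; collect one decoded reply per command."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            text=True, bufsize=1)
    replies = []
    try:
        for cmd in commands:
            proc.stdin.write(json.dumps(cmd) + "\n")
            proc.stdin.flush()                    # push the line into the pipe
            replies.append(json.loads(proc.stdout.readline()))
    finally:
        proc.stdin.close()                        # EOF lets the child exit
        proc.wait(timeout=10)
        proc.stdout.close()
    return replies

replies = run_commands(STAND_IN, [["reset", {"reduce_limit": True, "timeout": 5000}]])
print(replies)
```

There is deliberately no read timeout: if the child stops answering, readline() blocks,
which is the behaviour under investigation.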

Ideas welcome!

                
> CouchDB hangs during start of view indexing
> -------------------------------------------
>
>                 Key: COUCHDB-1346
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1346
>             Project: CouchDB
>          Issue Type: Bug
>          Components: View Server Support
>    Affects Versions: 1.3
>         Environment: Windows 7 Enterprise only, not able to replicate on Mac OS X.
> Erlang R14B03 + crypto patches.
> Mozilla Javascript 1.8.5
>            Reporter: Dave Cottlehuber
>            Assignee: Adam Kocoloski
>            Priority: Blocker
>              Labels: Windows
>             Fix For: 1.3
>
>
> [info] [<0.20499.0>] Opening index for db: test_suite_db idx: f4421bf4e9c9bf2acb3db91bca9e9adc sig: "d5c87ad33242b181f86be2139cbccd96"
> [info] [<0.20504.0>] Starting index update for db: test_suite_db idx: f4421bf4e9c9bf2acb3db91bca9e9adc
> [info] [<0.20334.0>] 172.16.40.1 - - POST /test_suite_db/_temp_view 500
> [info] [<0.20513.0>] 172.16.40.1 - - GET /_utils/couch_tests.html?script/couch_tests.js 200
> [info] [<0.20514.0>] 172.16.40.1 - - GET /_utils/index.html 200
> [info] [<0.20060.0>] 172.16.40.1 - - DELETE /test_suite_db_a/ 200
> [info] [<0.20407.0>] 172.16.40.1 - - GET /test_suite_reports/ 404
> [info] [<0.20058.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20071.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20069.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20484.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20364.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20062.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20388.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20345.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20072.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20059.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20061.0>] 172.16.40.1 - - DELETE /test_suite_db/ 404
> [info] [<0.20472.0>] 172.16.40.1 - - DELETE /test_suite_db/ 200
> [error] [<0.20050.0>] ** Generic server couch_index_server terminating 
> ** Last message in was {'$gen_cast',{reset_indexes,<<"test_suite_db">>}}
> ** When Server state == {st,"../var/lib/couchdb"}
> ** Reason for termination == 
> ** {{case_clause,{error,eacces}},
>     [{couch_file,'-nuke_dir/2-fun-0-',3},
>      {lists,foreach,2},
>      {couch_file,nuke_dir,2},
>      {couch_index_server,handle_cast,2},
>      {gen_server,handle_msg,5},
>      {proc_lib,init_p_do_apply,3}]}
> =ERROR REPORT==== 23-Nov-2011::21:17:14 ===
> ** Generic server couch_index_server terminating 
> ** Last message in was {'$gen_cast',{reset_indexes,<<"test_suite_db">>}}
> ** When Server state == {st,"../var/lib/couchdb"}
> ** Reason for termination == 
> ** {{case_clause,{error,eacces}},
>     [{couch_file,'-nuke_dir/2-fun-0-',3},
>      {lists,foreach,2},
>      {couch_file,nuke_dir,2},
>      {couch_index_server,handle_cast,2},
>      {gen_server,handle_msg,5},
>      {proc_lib,init_p_do_apply,3}]}
> [error] [<0.20050.0>] {error_report,<0.19957.0>,
>                           {<0.20050.0>,crash_report,
>                            [[{initial_call,
>                                  {couch_index_server,init,['Argument__1']}},
>                              {pid,<0.20050.0>},
>                              {registered_name,couch_index_server},
>                              {error_info,
>                                  {exit,
>                                      {{case_clause,{error,eacces}},
>                                       [{couch_file,'-nuke_dir/2-fun-0-',3},
>                                        {lists,foreach,2},
>                                        {couch_file,nuke_dir,2},
>                                        {couch_index_server,handle_cast,2},
>                                        {gen_server,handle_msg,5},
>                                        {proc_lib,init_p_do_apply,3}]},
>                                      [{gen_server,terminate,6},
>                                       {proc_lib,init_p_do_apply,3}]}},
>                              {ancestors,
>                                  [couch_secondary_services,couch_server_sup,
>                                   <0.19958.0>]},
>                              {messages,
>                                  [{'$gen_cast',
>                                       {reset_indexes,<<"test_suite_db_a">>}}]},
>                              {links,[<0.20051.0>,<0.20026.0>]},
>                              {dictionary,[]},
>                              {trap_exit,true},
>                              {status,running},
>                              {heap_size,1597},
>                              {stack_size,24},
>                              {reductions,12211}],
>                             [{neighbour,
>                                  [{pid,<0.20051.0>},
>                                   {registered_name,[]},
>                                   {initial_call,
>                                       {couch_event_sup,init,['Argument__1']}},
>                                   {current_function,{gen_server,loop,6}},
>                                   {ancestors,
>                                       [couch_index_server,
>                                        couch_secondary_services,
>                                        couch_server_sup,<0.19958.0>]},
>                                   {messages,[]},
>                                   {links,[<0.20050.0>,<0.20018.0>]},
>                                   {dictionary,[]},
>                                   {trap_exit,false},
>                                   {status,waiting},
>                                   {heap_size,233},
>                                   {stack_size,9},
>                                   {reductions,32}]}]]}}
> =CRASH REPORT==== 23-Nov-2011::21:17:14 ===
>   crasher:
>     initial call: couch_index_server:init/1
>     pid: <0.20050.0>
>     registered_name: couch_index_server
>     exception exit: {{case_clause,{error,eacces}},
>                      [{couch_file,'-nuke_dir/2-fun-0-',3},
>                       {lists,foreach,2},
>                       {couch_file,nuke_dir,2},
>                       {couch_index_server,handle_cast,2},
>                       {gen_server,handle_msg,5},
>                       {proc_lib,init_p_do_apply,3}]}
>       in function  gen_server:terminate/6
>     ancestors: [couch_secondary_services,couch_server_sup,<0.19958.0>]
>     messages: [{'$gen_cast',{reset_indexes,<<"test_suite_db_a">>}}]
>     links: [<0.20051.0>,<0.20026.0>]
>     dictionary: []
>     trap_exit: true
>     status: running
>     heap_size: 1597
>     stack_size: 24
>     reductions: 12211
>   neighbours:
>     neighbour: [{pid,<0.20051.0>},
>                   {registered_name,[]},
>                   {initial_call,{couch_event_sup,init,['Argument__1']}},
>                   {current_function,{gen_server,loop,6}},
>                   {ancestors,[couch_index_server,couch_secondary_services,
>                               couch_server_sup,<0.19958.0>]},
>                   {messages,[]},
>                   {links,[<0.20050.0>,<0.20018.0>]},
>                   {dictionary,[]},
>                   {trap_exit,false},
>                   {status,waiting},
>                   {heap_size,233},
>                   {stack_size,9},
>                   {reductions,32}]
> [error] [<0.20026.0>] {error_report,<0.19957.0>,
>                           {<0.20026.0>,supervisor_report,
>                            [{supervisor,{local,couch_secondary_services}},
>                             {errorContext,child_terminated},
>                             {reason,
>                                 {{case_clause,{error,eacces}},
>                                  [{couch_file,'-nuke_dir/2-fun-0-',3},
>                                   {lists,foreach,2},
>                                   {couch_file,nuke_dir,2},
>                                   {couch_index_server,handle_cast,2},
>                                   {gen_server,handle_msg,5},
>                                   {proc_lib,init_p_do_apply,3}]}},
>                             {offender,
>                                 [{pid,<0.20050.0>},
>                                  {name,index_server},
>                                  {mfargs,{couch_index_server,start_link,[]}},
>                                  {restart_type,permanent},
>                                  {shutdown,brutal_kill},
>                                  {child_type,worker}]}]}}
> OS process tree at this time is:
> Process information for SENDAI:
> Name                             Pid Pri Thd  Hnd      VM      WS    Priv
> Idle                               0   0   2    0       0      24       0
>   System                           4   8  79  477    3380     304     108
> explorer                        1984   8  21  664  213732   46340   21540
>   cmd                           2104   8   1   25   48132    3304    2144
>     pslist                      2776  13   1  133   63584    4976    2000
>   cmd                           2504   8   1   26   44980    3512    3012
>     werl                        2680   8  16  390  196232   40064   28628
>       win32sysinfo              1152   8   1   21   12624    2124     640
>       couchspawnkillable        1444   8   1   30   12992    2284     688
>         couchjs                 1468   8   1   39   55900    6572    4056
>       couchspawnkillable        2740   8   1   30   12992    2280     684
>         couchjs                 2756   8   1   39   55900    7108    4444
> Erlang resumes running CouchDB when couchjs procs are terminated with extreme
> prejudice. The hang still occurs after reverting fdmanana's COUCHDB-1334
> commit. This could be a race condition during invalidation of the views, and
> subsequent deletion of the related ddoc view directory prior to reindexing.
> On Windows a filesystem object cannot be deleted if there are open handles
> remaining.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
