tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Eric Wertman <wertm...@comcast.net>
Subject Re: FYI: jk 1.2.20 core on iSeries v5R3
Date Fri, 02 Feb 2007 04:58:01 GMT
I did rebuild the sources as prefork,  it seems to have stopped the 
problem.  I did also make those 2 corrections you pointed out earlier in 
the thread.

Eric Wertman wrote:
> Connector:
>
> <Connector port="8010" protocol="AJP/1.3" enableLookups="false" 
> maxThreads="800" minThreads="100" bufferSize="8192" backlog="256" />
>
> I've tried all sorts of permutations of this... None of the values 
> make much difference.   My ab test is -n 1000 -c 200.
>
> ---------------------------------------
>
> workers.properties:
>
> # PlatformServer
> worker.PlatformServer.type=ajp13
> worker.PlatformServer.host=localhost
> worker.PlatformServer.port=8010
> worker.PlatformServer.connection_pool_timeout=60
> worker.PlatformServer.socket_timeout=60
> worker.PlatformServer.retries=5
>
> I've been through a lot of different settings here too.  I've tried 
> some almost insane values for maxThreads
> ---------------------------------------
>
> httpd.conf (Using virtual servers, forgot to mention that before)
>
> <IfModule jk_module>
> JkWorkersFile conf/workers.properties
> JkShmFile     logs/mod_jk.shm
> JkShmSize     8192
> JkLogFile     logs/mod_jk.log
> JkLogLevel    debug
> </IfModule>
>
> ---------------------------------------
>
> mod_jk startup at debug level.  I only included it because it shows 
> what mod_jk thinks of my connector.
>
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (239): 
> creating worker PlatformServer
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (144): 
> about to create instance PlatformServer of ajp13
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (157): 
> about to validate and init PlatformServer
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (1971): worker PlatformServer contact is 'localhost:8010'
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2098): setting endpoint options:
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2101): keepalive:        0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2105): timeout:          60
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2109): buffer size:      0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2113): pool timeout:     60
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2117): connect timeout:  0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2121): reply timeout:    0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2125): prepost timeout:  0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2129): recovery options: 0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2133): retries:          5
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2137): max packet size:  8192
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2008): setting connection pool size to 25 with min 13
>
> ---------------------------------------
>
> Can't reproduce at debug level.  This is info (got it on the first try).
>
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1504): Unknown AJP protocol code: 41
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1842): (PlatformServer) request failed, because of server error 
> without recovery in send loop attempt=0
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
> (970): wrong message format 0x3837 from 127.0.0.1:8010
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1566): (PlatformServer) Tomcat is down or refused connection. No 
> response has been sent to the client (yet)
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
> operation attempt=0
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  
> recoverable operation attempt=1
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  mod_jk.c (2142): 
> Service error=-5 for worker=PlatformServer
> [Wed Jan 31 18:31:11 2007] [2539586:0000] [info]  mod_jk.c (401): 
> Write without start, starting with defaults
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
> (970): wrong message format 0x0400 from 127.0.0.1:8010
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1566): (PlatformServer) Tomcat is down or refused connection. No 
> response has been sent to the client (yet)
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1504): Unknown AJP protocol code: 02
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
> operation attempt=0
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  
> recoverable operation attempt=1
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1842): (PlatformServer) request failed, because of server error 
> without recovery in send loop attempt=0
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  mod_jk.c (2142): 
> Service error=-5 for worker=PlatformServer
>
> ---------------------------------------
>
> Here's the Tomcat log:
>
> Jan 31, 2007 6:31:10 PM org.apache.jk.common.HandlerRequest invoke
> INFO: Unknown message 0
> Jan 31, 2007 6:31:12 PM org.apache.jk.common.HandlerRequest invoke
> INFO: Unknown message 0
>
> ---------------------------------------
>
>
> And here's the only thing I know about dbx:
>
> dbx ../bin/httpd core
> Type 'help' for help.
> warning: The core file is not a fullcore. Some info may not be available.
> [using memory image in core]
> reading symbolic information ...
>
>
> Segmentation fault in sig_coredump at 0x10037590
> 0x10037590 (sig_coredump+0x84) 80410014        lwz   r2,0x14(r1)
> (dbx) where
> sig_coredump() at 0x10037590
> malloc_y.malloc_y(0x2000, 0x0, 0x1fe8, 0x30411700, 0xf879614d, 
> 0x7858c51, 0x17, 0x0) at 0xd03002c4
> malloc_common.malloc_common_53_36(??) at 0xd02fd8b8
> jk_pool_dyn_alloc() at 0xdc05a4c0
> jk_pool_alloc() at 0xdc05a5b4
> jk_b_set_buffer_size() at 0xdc06c0fc
> ajp_service() at 0xdc064c28
> jk_handler() at 0xdc050e60
> ap_run_handler() at 0x10003a24
> ap_invoke_handler() at 0x10004810
> ap_process_request() at 0x1002fd88
> ap_process_http_connection() at 0x1004d0dc
> ap_run_process_connection() at 0x10013dc0
> ap_process_connection() at 0x100144f0
> process_socket() at 0x1000b03c
> worker_thread() at 0x1000a8e0
> dummy_worker() at 0xdbdb8384
>
> ---------------------------------------
>
> I'll have a look and see if I can't get more information out of 
> tomcat.. there must be a way to increase the logging there.  I am 
> using mpm=worker.  If you still think it's worth the time I will 
> re-compile with prefork and take another shot.
>
> Thanks again for your help.
>
> Eric
>
> Rainer Jung wrote:
>> Hi Eric,
>>
>> maybe your observation is in fact related to Henri's, since you also 
>> get core dumps (segmentation faults), which is quite unusual. See 
>> further comments inline.
>>
>> Eric Wertman wrote:
>>> Sorry to jump in, I'm new here.  I started watching this list 
>>> because of a problem I'm having with the mod_jk 1.2.20 as well.  I'm 
>>> not getting core files, but I do have problems that I can't 
>>> reproduce at log level trace or debug.
>>
>> Apache logs segmentation faults. You might be able to produce a core 
>> (which would be very helpful) by setting the coredumpdirectory in 
>> apache and maybe tweaking your os config. One point that might help 
>> to produce cores, would be to not run apache as root.
>>
>>> I'm running it on AIX 5.3 (ml05).  I compiled the apache 2.2.4 and 
>>> apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM 
>>> 1.5 JRE.
>>>
>>> I compiled them all using the IBM  cc_r compiler.  I've actually 
>>> tried a number of different flags trying to resolve my problem, 
>>> assuming initially that I was doing something incorrectly.  The 
>>> behavior persists, though, and my compilations have all been clean.  
>>> Apache without mod_jk doesn't give me any problems.
>>>
>>> Superficially it seems to work fine.  Once I run tests with ab, it 
>>> starts to get a little ugly.  I get a number of failed requests, and 
>>> these types of errors:
>>>
>>>
>>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1504): Unknown AJP protocol code: 41
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (970): wrong message format 0x3837 from 127.0.0.1:8010
>>
>> hose mean, taht something is fundamentally wrong in the answer 
>> mopd_jk receives from tomcat. It doesn't really look like AJP/1.3. Is 
>> your MPM worker or prefork? Could you try again with prefork, in case 
>> it is worker?
>>
>> Do you get error message on the tomcat side?
>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>>> response has been sent to the client (yet)
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>>> operation attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=1
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1842): (PlatformServer) request failed, because of server error 
>>> without recovery in send loop attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>>> Service error=-5 for worker=PlatformServer
>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (401): 
>>> Write without start, starting with defaults
>>
>> Ths one is very unusual, and it is the log statement, that is the 
>> only one, that could have produced Henri's core dump, although we 
>> still do not know why. It correlates to a protocol error.
>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (970): wrong message format 0x031a from 127.0.0.1:8010
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1592): (PlatformServer) Tomcat is down or network problems. Part of 
>>> the response has already been sent to the cli
>>> ent
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>>> operation attempt=1
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1504): Unknown AJP protocol code: D8
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1842): (PlatformServer) request failed, because of server error 
>>> without recovery in send loop attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>>> Service error=-5 for worker=PlatformServer
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=2
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (915): sendfull returned -14 with errno=14
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (948): (PlatformServer) can't receive the response message from 
>>> tomcat, network problems or tomcat is down (127.0.
>>> 0.1:8010), err=-9
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>>> response has been sent to the client (yet)
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>>> operation attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1292): (PlatformServer) failed resending request body (812760852)
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=3
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=1
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (915): sendfull returned -9 with errno=9
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1201): (PlatformServer) error sending request. Will try another 
>>> pooled connection
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (915): sendfull returned -14 with errno=14
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (948): (PlatformServer) can't receive the response message from 
>>> tomcat, network problems or tomcat is down (127.0.
>>>
>>> And these in the apache error_log:
>>>
>>> [Tue Jan 30 03:02:28 2007] [info] Shared memory session cache 
>>> initialised
>>> [Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual) 
>>> servers for SSL
>>> [Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface: 
>>> mod_ssl/2.2.4, Library: OpenSSL/0.9.8d
>>> [Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix) 
>>> mod_jk/1.2.20 mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming 
>>> normal operations
>>> [Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30
>>> [Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex: 
>>> sysvsem (default: sysvsem)
>>> [Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener 
>>> thread didn't exit
>>> [Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener 
>>> thread didn't exit
>>> [Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over 
>>> scoreboard slot from 1552520 (quiescing)
>>> [Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal 
>>> Segmentation fault (11)
>>>
>>>
>>> Again,  I thought I would be able to track them down by changing the 
>>> log level to "trace", but the problem disappears at that log level.  
>>> I had assumed it was slowing the connection rate down enough to 
>>> mitigate the problem.  Not sure if any of that helps.. if you can 
>>> help me or want more info let me know.
>>
>> With JkLogLevel debug (which logs more than info but less than trace) 
>> we would see, which packets you receive from tomcat. Maybe debug 
>> still shows the problem?
>>
>> Could you please post your config (Jk* from apache, workers.roperties 
>> and Connector from tomcat).
>>
>>>
>>> Thanks!
>>>
>>> Eric
>>
>> Regards,
>>
>> Rainer
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: dev-help@tomcat.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Mime
View raw message