tomcat-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: Antwort: Re: mod_jk doesn`t distribute and failover on tomcat-error
Date Tue, 20 Sep 2011 17:12:45 GMT

Steffen,

On 9/20/2011 3:18 AM, Steffen.Scheuler@fiducia.de wrote:
>> 1. You should look into using "template" workers.
> Yes, the configuration would be cleaner, but is it a functional
> problem?

No, but it allows people to more easily debug your configuration
without having to cross-check all the settings between workers. Just
read them once and know that they are all the same.
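
For illustration, a rough sketch of what that could look like in
workers.properties. The worker name "template", the host names and the
port are placeholders I made up, not values from your configuration;
only the retry_interval value is taken from what you posted:

```
# Shared settings live on a worker that is never listed in worker.list.
worker.template.type=ajp13
worker.template.port=8009
worker.template.retry_interval=100

# Each real worker references the template and overrides only what differs.
worker.Server1.reference=worker.template
worker.Server1.host=server1.example.com
worker.Server2.reference=worker.template
worker.Server2.host=server2.example.com
```

Anything set directly on a worker takes precedence over the value it
inherits through "reference", so per-server exceptions stay visible.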

>> 2. Unless you really want to explicitly set all those properties,
>> don't set anything that is the same as a documented default.
>> There's no reason to specify all those details.
> In some cases, we set the same value as the default values to
> avoid problems with changed defaults.

Fair enough.

>> So, 10 requests were sent to Server1 during this minute? That
>> sounds reasonable, given:
>> 
>>> workerServer1.retry_interval=100
>> 
>> That means that mod_jk will try 10 times per second to reach
>> Server1 when it's in an error state.
>> 
>>> 10:22,  56 238 243 261 250 247
>>> 10:23,  10 728 742 716 740 761
>> 
>> 56 seems high, but that might be due to multiple httpd workers
>> all re-trying.
>> 
> The numbers of requests in my analysis are really processed requests
> with RC=200, not just tries (unfortunately I don't see tries at
> Loglevel error).

You might want to bump up your LogLevel while investigating, then.

>>> 10:54, 686 549 562 506 529 548
>> 
>> So, this is when Server1 becomes operational again? Did you have
>> to use the status worker to trigger mod_jk to allow it back into
>> the cluster, or did it recover on its own?
> The behavior after 10:22 seems reasonable to me: mod_jk recovered on
> its own, and there was no operational intervention until 10:37, when
> the faulted server was deactivated in status and restarted.
> 
> The problem is the behavior from 10:15 to 10:21, because no
> request was routed to the operational servers.

Woah, so the entire cluster stalled? That's clearly a problem. Or, did
you just not get any good sampling data for those time periods? I only
see errors for "Server1". Were there other errors? Where are you
getting your numbers for your table with headings "Timest. Srv1
Srv2 Srv3 Srv4 Srv5 Srv6"?

>> You might want to consider configuring mod_jk to use a
>> "ping_mode" for activation management.
> 
> Which ping_mode do you recommend trying on a server with that
> number of requests? "P" seems to be a lot of overhead...

You might try "I", but I would consult with your networking folks to
determine which strategy might be best. Honestly, 50 req/sec shouldn't
be a problem to do a prepost connection check: they are fairly fast.
But you're right, it would slow things down just a bit. What is your
expected response time for most of those requests?
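
If you want to experiment, something along these lines might be a
starting point. Treat it as a sketch: the worker name follows the
usual "worker.<name>." syntax, and both numeric values are guesses
on my part that you would need to tune, not recommendations:

```
# "I" = interval: idle connections are probed during the regular
# maintenance cycle instead of before every request.
worker.Server1.ping_mode=I
# Seconds a connection may sit idle before it is probed (assumed value).
worker.Server1.connection_ping_interval=60
# Milliseconds to wait for the CPong reply (assumed value).
worker.Server1.ping_timeout=5000

# Alternative: probe before each request ("P" = prepost).
# worker.Server1.ping_mode=P
```

The trade-off is that "I" adds no latency to individual requests but
can miss a backend that died since the last probe, while "P" catches
that at the cost of one CPing/CPong round trip per request.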

-chris
