Mailing-List: contact tomcat-dev-help@jakarta.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Tomcat Developers List" <tomcat-dev@jakarta.apache.org>
From: "Hans Schmid" <Hans.Schmid@einsurance.de>
To: "Tomcat-Dev" <tomcat-dev@jakarta.apache.org>
Subject: jk 1.2.4   LB bug?
Date: Wed, 9 Jul 2003 18:14:45 +0200
Message-ID: <AJEDIBIPIJCIFOOPAKLLOEKCELAA.Hans.Schmid@einsurance.de>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
Importance: Normal
In-Reply-To: <3F0C1F08.2050601@mail.more.net>

Hi,

I noticed the following with mod_jk 1.2.4, Apache 1.3.26 and
Tomcat 3.3.1a on Solaris 8 with JDK 1.4.1_03.

This may be an LB bug: the load factors do not seem to recover after a
new Tomcat starts up or after a graceful Apache restart.

Let me explain my scenario first:

I want to upgrade our webapp gracefully without losing sessions, and also
have a fail-over scenario.
Therefore we have sticky sessions enabled.

Setup:
TC 01 on Server A: running
TC 02 on Server A: not running
TC SB on Server B: running

TC 01 on Server A runs the application, TC SB on Server B is the
standby (fallback), and TC 02 on Server A is shut down at the moment.

All three Tomcats are in the same lb_worker:


worker.list=lb-worker

worker.ajp13-01.port=11009
worker.ajp13-01.host=A
worker.ajp13-01.type=ajp13
worker.ajp13-01.lbfactor=1
worker.ajp13-01.local_worker=1

worker.ajp13-02.port=11019
worker.ajp13-02.host=A
worker.ajp13-02.type=ajp13
worker.ajp13-02.lbfactor=1
worker.ajp13-02.local_worker=0

worker.ajp13-sb.port=11015
worker.ajp13-sb.host=B
worker.ajp13-sb.type=ajp13
worker.ajp13-sb.lbfactor=0
worker.ajp13-sb.local_worker=1

worker.lb-worker.type=lb
worker.lb-worker.balanced_workers=ajp13-01, ajp13-02, ajp13-sb
worker.lb-worker.local_worker_only=0


The worker List order should now be:
 1. worker.ajp13-01 lbfactor=1,local_worker=1  TC 01
 2. worker.ajp13-sb lbfactor=0,local_worker=1  TC SB
 3. worker.ajp13-02 lbfactor=1,local_worker=0  TC 02  (not running)
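
The ordering described above can be sketched as follows. This is a minimal
Python model of the expectation stated in this mail (workers with
local_worker=1 first, then the others, each group in balanced_workers
order). It is an assumption about the intended behaviour, not mod_jk's
actual implementation, and the name expected_order is made up for
illustration.

```python
# Hypothetical model of the expected worker order; NOT mod_jk's real algorithm.
WORKERS = [
    # (name, lbfactor, local_worker), in balanced_workers order
    ("ajp13-01", 1, 1),
    ("ajp13-02", 1, 0),
    ("ajp13-sb", 0, 1),
]


def expected_order(workers):
    """Local workers first, then foreign ones, keeping config order in each group."""
    # sorted() is stable, so workers with the same local_worker value
    # keep their balanced_workers order.
    ordered = sorted(workers, key=lambda w: -w[2])
    return [name for name, _lbfactor, _local in ordered]


print(expected_order(WORKERS))  # ['ajp13-01', 'ajp13-sb', 'ajp13-02']
```

Note that lbfactor plays no part in this ordering; in the model it only
matters which workers are flagged as local.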

Now all requests go to worker.ajp13-01 (TC 01), none to worker.ajp13-sb
(TC SB, lbfactor=0), and none to worker.ajp13-02 (TC 02, not running).

If Server A crashes (TC 01), all new requests go to Server B (TC SB,
worker.ajp13-sb), since this is then the only running Tomcat. FINE
This is our fail-over solution (running sessions are lost, but that is OK).

Now the webapp update Scenario:

1.) Shut down TC SB on Server B, update the webapp, start TC SB and test via
a separate HTTPConnector port without Apache.
2.) This does not affect anything in production, since lbfactor=0 on TC SB
-> no sessions arrive on TC SB.
3.) When the test was successful, our standby Tomcat SB is updated.
4.) Now upgrade the webapp on Server A TC 02, which is currently not
running.
5.) Start up TC 02 on Server A with the new version of the webapp,
immediately exchange the worker.properties with a different version and
gracefully restart Apache:

worker.list=lb-worker

worker.ajp13-01.port=11009
worker.ajp13-01.host=A
worker.ajp13-01.type=ajp13
worker.ajp13-01.lbfactor=1
worker.ajp13-01.local_worker=0     <---- put old webapp on TC 01 on the foreign worker list

worker.ajp13-02.port=11019
worker.ajp13-02.host=A
worker.ajp13-02.type=ajp13
worker.ajp13-02.lbfactor=1
worker.ajp13-02.local_worker=1     <---- put new webapp on TC 02 at the front of the local worker list

worker.ajp13-sb.port=11015
worker.ajp13-sb.host=B
worker.ajp13-sb.type=ajp13
worker.ajp13-sb.lbfactor=0
worker.ajp13-sb.local_worker=1

worker.lb-worker.type=lb
worker.lb-worker.balanced_workers=ajp13-01, ajp13-02, ajp13-sb
worker.lb-worker.local_worker_only=0

Only the two lines marked with <---- above change: the local_worker
values of TC 01 and TC 02 are swapped.
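
To double-check that nothing else differs between the two files before the
graceful restart, the two versions can be compared key by key. The sketch
below is only an illustration: parse_properties and changed_keys are
made-up helper names, the parser handles just plain key=value lines, and
the <---- annotations shown above would have to be removed from the real
file first. Only the local_worker lines are included here to keep it short.

```python
# Illustrative helper (not part of mod_jk): report keys whose values differ
# between two worker.properties texts.
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def changed_keys(old_text, new_text):
    """Map each differing key to its (old value, new value) pair."""
    old, new = parse_properties(old_text), parse_properties(new_text)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


OLD = """
worker.ajp13-01.local_worker=1
worker.ajp13-02.local_worker=0
worker.ajp13-sb.local_worker=1
"""
NEW = """
worker.ajp13-01.local_worker=0
worker.ajp13-02.local_worker=1
worker.ajp13-sb.local_worker=1
"""

# Only the TC 01 and TC 02 local_worker entries are reported as changed.
print(changed_keys(OLD, NEW))
```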

6.) Now all 3 Tomcats are running. All existing sessions still go to TC 01
(sticky sessions; we do not lose running sessions).
7.) What I expect:
TC 02 takes a while to start up.
The worker list order should now be:
 1. worker.ajp13-02 lbfactor=1,local_worker=1  TC 02
 2. worker.ajp13-sb lbfactor=0,local_worker=1  TC SB
 3. worker.ajp13-01 lbfactor=1,local_worker=0  TC 01  (old webapp)

Since TC 02 needs 3 minutes to start up (filling caches etc.), it is not
immediately available.
During this time new sessions arrive at TC SB, since it is next in the
worker list. OK, fine, this works.
Since these sessions are sticky as well, all users connecting during this
time stay on TC SB
for their whole session life. FINE

8.) As soon as TC 02 is up and running (finished all load-on-startup servlet
initialization stuff),
I would expect TC 02 to get all new sessions (number 1 in the worker
list).

This is not the case! All new sessions still arrive at TC SB.

9.) After a while (one hour) we shut down TC 01. Since no new sessions
have arrived there since our
graceful restart of Apache, all old sessions should have expired.

10.) Even now (only 2 Tomcats running, TC 02 and TC SB), and even after a
graceful restart, new sessions
arrive at TC SB.
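
The expectation in steps 7.), 8.) and 10.) can be written down as a small
Python sketch: a new session (one without a sticky route yet) should go to
the first worker in the list that is currently running. This only models
what is expected in this mail; it is not mod_jk's actual selection code,
and expected_target is a made-up name.

```python
# Hypothetical model of the expectation in this mail; NOT mod_jk's real algorithm.
def expected_target(ordered_workers, running):
    """Return the first worker in list order that is currently running."""
    for name in ordered_workers:
        if name in running:
            return name
    return None  # no worker available


# Worker list order expected after the graceful restart (step 7).
ORDER_AFTER_SWAP = ["ajp13-02", "ajp13-sb", "ajp13-01"]

# Step 7: TC 02 is still starting up, so new sessions fall through to TC SB.
print(expected_target(ORDER_AFTER_SWAP, {"ajp13-sb", "ajp13-01"}))
# Step 8: TC 02 is up; ajp13-02 is expected, but ajp13-sb was observed.
print(expected_target(ORDER_AFTER_SWAP, {"ajp13-02", "ajp13-sb", "ajp13-01"}))
# Step 10: TC 01 has been shut down; ajp13-02 is still the expected target.
print(expected_target(ORDER_AFTER_SWAP, {"ajp13-02", "ajp13-sb"}))
```

Existing sticky sessions are deliberately left out of this model; it only
covers where new sessions are expected to land.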


Conclusion:
Do I misunderstand the intended behaviour of the lbfactor and local_worker
flags?
I think that the behaviour in 8.) is wrong. 10.) is strange too.

Thanks for any suggestions, in case I am completely wrong here,
or for looking into this further.

Hans


> -----Original Message-----
> From: Glenn Nielsen [mailto:glenn@mail.more.net]
> Sent: Wednesday, 9 July 2003 15:56
> To: Tomcat Developers List
> Subject: Re: jk 1.2.25 release ?
>
>
> I was hoping to get it released this week.
>
> But I just noticed that under Apache 2 mod_jk piped logs there
> are two instances of the piped log program running for the same
> log file.  I want to track this down.
>
> I also just implemented load balancing this morning on a production
> server.  I noticed that when none of the workers for the load balancer
> were available an HTTP status code of 200 was being logged in mod_jk.log
> when request logging was enabled. So I want to look into this also.
>
> Hopefully now that I have load balancing in place with 2 tomcat servers
> instead of 1 the Missouri Lottery web site I administer will scale to
> handle the big spike in load tonight for the $240 PowerBall jackpot. :-)
>
> Regards,
>
> Glenn
>
> Henri Gomez wrote:
> > Any date ?
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org
> >
>