tomcat-dev mailing list archives

From Shinta Tjio <st...@broadjump.com>
Subject RE: Design Review for ajp13's changes: WAS problem w/ ajp13 - if Tomcat is shutdown
Date Thu, 08 Mar 2001 23:03:44 GMT
> > 3) For option (1), I have a few questions
> >
> >  - Is there a way in which data could be lost?  Specifically, as you
> > state, the send() will return without error, and then it will only get
> > the error on the following read().  Is all the data always preserved so
> > that simply retrying will work correctly?  I think most of that state
> > is in the jk_ws_service_t object -- is it possible a read pointer will
> > be advanced and data will be lost?  This may be acceptable, but I'd
> > like to understand it...
> 
> A send to a closed (or, in our case, half-closed) socket will 95%
> of the time return a positive number (#bytes sent). It's sad,
> but even if the IP stack knows the socket was closed, you'll learn
> about it only at the next read/recv, or at a select on read state!
> Checking send < 0 is not a great help here.

Yup. It's sad indeed. It has caused us some grief.
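
To make the failure mode concrete, here is a minimal sketch of "send
succeeds, read fails, retry once". It is an illustration in Python, not
mod_jk code: StaleConnection, LiveConnection and service() are
hypothetical names, and the stale connection only simulates the behaviour
described above. It also assumes the caller still holds the complete
request, which is exactly the open question about jk_ws_service_t.

```python
class StaleConnection:
    """Hypothetical stand-in for a cached socket whose peer has gone away."""

    def send(self, data):
        return len(data)  # the write appears to succeed

    def recv(self):
        raise ConnectionResetError("peer closed the connection")


class LiveConnection:
    """Hypothetical stand-in for a healthy connection."""

    def send(self, data):
        return len(data)

    def recv(self):
        return b"reply"


def service(connections, request):
    """Send the request; if the read fails, resend once on the next connection."""
    attempts = 0
    for conn in connections[:2]:  # the original attempt plus one retry
        attempts += 1
        conn.send(request)  # returns a positive count even when stale
        try:
            return conn.recv(), attempts
        except ConnectionResetError:
            continue  # the error only shows up here, on the read
    raise ConnectionError("backend unavailable after one retry")
```

With one stale and one live connection the second attempt returns the
reply; with two stale connections in a row the single retry is not enough.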
 
> > - You only retry once.  If there are a number of connections open
> > (from a single Apache process), isn't it possible that Tomcat has come
> > back up, and that the next connection obtained (from the endpoint
> > cache) will also be stale?  Would it make sense in this case to
> > trigger a shutdown of all the connections currently in the cache (and
> > then retry once)?  That would make sense if there were no other ways
> > to get an ECONNRESET error.
> 
> ECONNRESET MUSTN'T be checked like this: mod_jk code runs
> on Apache 1.3 AND 2.0, and the latter is multithreaded.

Is there a better way to do this? I want to handle only
ECONNRESET, because that's recoverable. Other errors may
not be recoverable, and there's no point in retrying them.
But then again, we could just let everything go to the retry
and fail there.
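
As a rough sketch of the policy under discussion (retry only on a
connection reset, and drop every cached connection before retrying), the
Python below may help. EndpointCache, should_retry() and on_error() are
hypothetical names, not mod_jk functions. In this sketch the error code
travels on the exception object, so no shared errno variable is consulted.

```python
import errno


class EndpointCache:
    """Hypothetical cache of idle backend connections."""

    def __init__(self, connections):
        self._idle = list(connections)

    def flush(self):
        """Close and forget every cached connection; return how many were dropped."""
        dropped = len(self._idle)
        for conn in self._idle:
            conn.close()
        self._idle.clear()
        return dropped


def should_retry(exc):
    """Assumed policy: only a connection reset is treated as recoverable."""
    return isinstance(exc, OSError) and exc.errno == errno.ECONNRESET


def on_error(exc, cache):
    """Return True if the caller should retry once on a fresh connection."""
    if not should_retry(exc):
        return False  # other errors are reported to the caller unchanged
    cache.flush()  # the remaining cached connections are probably stale too
    return True
```

Flushing the whole cache is the suggestion quoted above; whether it is
safe depends on there being no other source of ECONNRESET.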

> > - Or, more generally, just so I (and everyone) can understand, how
> > does this new code deal with the following stages:
> >
> >  1) TC and Apache both up and running
> >
> >  2) TC is shut down
> > If mod_jk is in the middle of handling a request, what happens?  There
> > was an infinite loop in the 3.2.1 code, but that's been fixed in 3.2.2
> > and 3.3.
> 
> Apache sends the data (no error) and then waits for the reply with
> recv. That is where Apache gets the error. We must resend the request
> at least once. There is a little code to reorganize in ajp13_worker.

Are you working on re-organizing the changes? Will you post
the changes when you're done?

> >  3) TC is shut down, Apache is still up.  While TC is down, requests
> > come in.  How are they handled?  Are there any loops Apache gets stuck
> > in?
> 
> Apache will determine that TC is down and try another socket. 
> But we must be careful here with load-balancing configs.

I'll try to test this with a load-balanced Tomcat setup.

> >  4) TC starts back up.  Now requests get handled smoothly again? 
> 
> Yes, but only if the socket was closed by Apache before.
> 
> > 4) For option (2):
> >
> > - If the user has Win32, you're just punting, correct?  Why is that?
> > I know nothing about Win32 socket programming, but I'm curious...  You
> > say you're testing on Win2k -- does Win2k support select(), but Win32
> > doesn't?  Does anyone know about how widely select() is supported?
> 
> select is just too time-consuming to be used on EACH request.
> We must handle the potential error without losing too many CPU cycles
> when everything is fine.
> 
> I'm -1 on using select and errno.
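
For reference, a per-request probe of the kind option (2) describes might
look roughly like the sketch below. It is Python rather than the C used in
mod_jk, and looks_stale() is a hypothetical helper. It costs one select
call on every request, plus a peek when the socket is readable, which is
the overhead being objected to.

```python
import select
import socket


def looks_stale(sock):
    """Zero-timeout probe: True if the peer appears to have closed its end."""
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending, so the connection looks usable
    try:
        # Readable with no bytes to peek at means the peer sent EOF.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except OSError:
        return True  # a reset reported on the read also means stale
```

A connection that passes the probe can still be closed by the peer before
the request is sent, so the retry path is needed either way.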

thanks,
shinta 
