trafficserver-dev mailing list archives

From John Plevyak <jplev...@acm.org>
Subject Re: TS-857, TS-934, TS-1031
Date Tue, 13 Dec 2011 21:30:06 GMT
This worked for 15 years.  This is how Traffic Server was designed and how
it has run for years at AOL with uptime measured in months under very high
load on multi-processors with weaker memory guarantees than x86 (e.g.
Alpha).

Some new bug comes along and suddenly something which has been field tested
for over a decade is now "does not work".  Ha.


> > Clear all pointers to a NetVC then call vc->do_io_close.
>
> > Period.
>
> That does not work. Period. You can crash calling vc->do_io_close if
> another thread de-allocates it at the same time. I also don't see how you
> can clear all pointers when those pointers are in two different instances of
> two different classes in two different threads. Also, how do you call
> vc->do_io_close if you've cleared all the pointers?
>

No other thread can call vc->do_io_close if they don't have the pointer to
it.  It is up to the protocol engine to ensure that all pointers are clear.
That means you need to ensure that, even if there are two different
classes in two different threads, the pointers to the NetVC are stored
under the same instance of Mutex.  That is just how multi-threaded
programming works.  A Mutex defines a region of memory which can be
accessed by only one thread at a time.  So you clear all pointers except
the last one, call vc->do_io_close(), then clear the last via vc = NULL and
you are done.
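
A minimal sketch of that sequence, for illustration only.  FakeNetVC,
Holder and close_shared_vc are hypothetical names, not Traffic Server
classes, and std::mutex stands in for the Traffic Server Mutex.  It
assumes do_io_close() releases the object, so nothing may touch it
afterwards.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a NetVC; NOT the real Traffic Server class.
struct FakeNetVC {
  int *closes; // incremented on each do_io_close(), for demonstration only
  explicit FakeNetVC(int *counter) : closes(counter) {}
  // Assumption: after this call the object is gone and must not be used.
  void do_io_close() {
    ++*closes;
    delete this;
  }
};

// Hypothetical holder of a pointer to the VC (e.g. one side of a session).
struct Holder {
  std::shared_ptr<std::mutex> mutex; // shared by every holder of this VC
  FakeNetVC *vc;
};

// Returns true if this call closed the VC, false if it was already gone.
bool close_shared_vc(Holder &a, Holder &b) {
  // Both pointers must live under the same mutex instance.
  assert(a.mutex && a.mutex == b.mutex);
  std::lock_guard<std::mutex> lock(*a.mutex);

  FakeNetVC *vc = a.vc ? a.vc : b.vc; // this local becomes the last pointer
  if (vc == nullptr) {
    return false; // someone else already cleared the pointers and closed
  }
  a.vc = nullptr; // clear all pointers except the last one
  b.vc = nullptr;
  vc->do_io_close(); // close through the last pointer
  vc = nullptr;      // clear the last pointer
  return true;
}
```

Because both holders take the same mutex before reading or clearing
their pointer, a second caller on any thread finds NULL and returns
without touching the VC.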

This is multi-threaded programming 101.  If you can't handle this then you
can't write multi-threaded code.


>
> Perhaps if you could be more specific about how to actually implement your
> recommendation, I might be able to see how to fix this problem.
>

It sounds like there is a bug in the way that the HTTP sessions are
handled.  I need to look at the stack traces and see if I can figure out
where the bug is.  If you have some way to reproduce the problem that would
be great too.

Smart pointers are a way of hiding the bugs, not a solution.  We don't want
hidden bugs and code accessing stale memory.   Buggy code should crash as
soon as possible so that the bug can be fixed and the system made stable.

john
