apr-dev mailing list archives

From Joshua Marantz <jmara...@google.com>
Subject Re: apr_memcache operation timeouts
Date Thu, 18 Oct 2012 14:46:17 GMT
Thanks Jeff, that is very helpful.  We are considering a course of action
and before doing any work toward this, I'd like to understand the pitfalls
from people that understand Apache better than us.

Here's our reality: we believe we need to incorporate memcached for
mod_pagespeed <http://modpagespeed.com> to scale effectively for very large
sites & hosting providers.  We are fairly close (we think) to releasing
this functionality as beta.  However, in such large sites, things go
wrong: machines crash, power fails, fiber gets cut, etc.  When that happens,
we want to fall back to serving partially unoptimized sites rather than
hanging the servers.

I understand the realities of backward-compatible APIs.  My expectation is
that this would take years to make it into an APR distribution we could
depend on.  We want to deploy this functionality in weeks.  The workarounds
we have tried (backgrounding the apr_memcache calls in a thread and timing
out in mainline) are complex and, even once they work 100%, will be very
unsatisfactory (resource leaks; Apache refusing to exit cleanly on
'apachectl stop') if this happens more than (say) once a month.

Our plan is to copy the patched implementation of
apr_memcache_server_connect and the static methods it calls into a new .c
file we will link into our module, naming the new entry-point something
else (apr_memcache_server_connect_with_timeout seems good).  From a CS/SE
perspective this is offensive and we admit it, but from a product quality
perspective we believe this beats freezes and complicated/imperfect
workarounds with threads.

So I have two questions for the Apache community:

   1. What are the practical problems with this approach?  Note that in any
   case a new APR rev would require editing/ifdefing our code anyway, so I
   think immunity from APR updates such as this patch being applied is not a
   distinguishing drawback.
   2. Is there an example of the correct solution to the technical problem
   Jeff highlighted: "it is apparently missing a call to adjust the socket
   timeout and to discard the connection if the timeout is reached"?  That
   sounds like a pattern that might be found elsewhere in the Apache HTTPD
   code base.
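
For illustration only, here is a minimal sketch of the "adjust the socket
timeout and discard the connection if the timeout is reached" pattern.  It
is not the Bugzilla patch and not APR code: it uses plain POSIX sockets so
that it stands alone, and the helper name recv_or_discard is hypothetical.
In APR terms the analogous calls would presumably be apr_socket_timeout_set
before the read and an APR_STATUS_IS_TIMEUP check afterwards.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of APR): read up to len bytes from *fd,
 * waiting at most timeout_ms milliseconds (must be > 0; a zero timeval
 * would mean "block forever" for SO_RCVTIMEO).
 *
 * Returns the number of bytes read (0 if the peer closed), or -1 on
 * error or timeout.  On -1 the socket is closed and *fd is set to -1,
 * so a late reply can never be mistaken for the answer to a later
 * request on a reused connection. */
static ssize_t recv_or_discard(int *fd, void *buf, size_t len, long timeout_ms)
{
    struct timeval tv;
    ssize_t n;
    int saved;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Step 1: adjust the socket's receive timeout. */
    if (setsockopt(*fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0) {
        n = recv(*fd, buf, len, 0);
        if (n >= 0) {
            return n;
        }
        /* errno is EAGAIN or EWOULDBLOCK if the timeout expired. */
    }

    /* Step 2: discard the connection rather than returning it for reuse. */
    saved = errno;
    close(*fd);
    *fd = -1;
    errno = saved;
    return -1;
}
```

A caller would treat a -1 return as "this server is unavailable for now",
skip the cache, and open a fresh connection on a later attempt.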

Thanks in advance for your help!
-Josh


On Wed, Oct 17, 2012 at 8:16 PM, Jeff Trawick <trawick@gmail.com> wrote:

> On Wed, Oct 17, 2012 at 3:36 PM, Joshua Marantz <jmarantz@google.com>
> wrote:
> > Is there a mechanism to time out individual operations?
>
> No, the socket connect timeout is hard-coded at 1 second and the
> socket I/O timeout is disabled.
>
> Bugzilla bug https://issues.apache.org/bugzilla/show_bug.cgi?id=51065
> has a patch, though it is apparently missing a call to adjust the
> socket timeout and to discard the connection if the timeout is
> reached.  More importantly, the API can only be changed in future APR
> 2.0; alternate, backwards-compatible API(s) could be added in future
> APR-Util 1.6.
>
> >
> > If memcached freezes, then it appears my calls to 'get' will block until
> > memcached wakes up.  Is there any way to set a timeout for that call?
> >
> > I can repro this in my unit tests by sending a SIGSTOP to memcached
> > before doing a 'get'?
> >
> > Here are my observations:
> >
> > apr_memcache_multgetp seems to time out in bounded time if I SIGSTOP the
> > memcached process. Yes!
> >
> > apr_memcache_getp seems to hang indefinitely if I SIGSTOP the memcached
> > process.
> >
> > apr_memcache_set seems to hang indefinitely if I SIGSTOP the memcached
> > process.
> >
> > apr_memcache_delete seems to hang indefinitely if I SIGSTOP the memcached
> > process.
> >
> > apr_memcache_stats seems to hang indefinitely if I SIGSTOP the memcached
> > process.
> >
> > That last one really sucks as I am using that to print the status of all
> > my cache shards to the log file if I detected a problem :(
> >
> >
> > Why does apr_memcache_multgetp do what I want and not the others?  Can I
> > induce the others to have reasonable timeout behavior?
> >
> > When I SIGSTOP memcached this makes it hard to even restart Apache, at
> > least with graceful-stop.
> >
> >
> > On a related note, the apr_memcache documentation
> > <http://apr.apache.org/docs/apr-util/1.4/group___a_p_r___util___m_c.html>
> > is very thin.  I'd be happy to augment it with my observations on its
> > usage and the meaning of some of the arguments if that was desired.  How
> > would I go about that?
>
> Check out APR trunk from Subversion, adjust the doxygen docs in the
> include files, build (make dox) and inspect the results, submit a
> patch to dev@apr.apache.org.
>
> >
> > -Josh
>
>
>
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
>
