apr-dev mailing list archives

From Joshua Marantz <jmara...@google.com>
Subject Re: aprmemcache question
Date Sat, 13 Oct 2012 13:53:06 GMT
Now that we've established that the TTL passed into the server-create call
is for reaping idle connections and not individual operation timeouts, I
want to ask about timing out individual operations.
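
For reference, a minimal sketch of the unit conversion discussed in this thread. It assumes the ttl argument is a 32-bit unsigned count of microseconds; ttl_usec_from_sec and USEC_PER_SEC are hypothetical names for illustration, not part of APR.

```c
#include <stdint.h>

/* Mirrors the idea of APR's APR_USEC_PER_SEC; redefined here so the
 * sketch builds without APR installed. */
#define USEC_PER_SEC 1000000ULL

/* Hypothetical helper: convert an idle-reap TTL given in seconds to
 * microseconds.  Returns 0 on success, or -1 if the result would not
 * fit in 32 bits (assumption: the ttl parameter is a 32-bit unsigned
 * value). */
static int ttl_usec_from_sec(uint32_t seconds, uint32_t *ttl_usec)
{
    uint64_t usec = (uint64_t)seconds * USEC_PER_SEC;

    if (usec > UINT32_MAX) {
        return -1;
    }
    *ttl_usec = (uint32_t)usec;
    return 0;
}
```

Under that assumption, 600 seconds converts cleanly, while anything much above 4294 seconds would overflow the 32-bit value.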

If memcached freezes, then it appears my calls to 'get' will block until
memcached wakes up.  Is there any way to set a timeout for that call?

I can repro this in my unit tests by sending a SIGSTOP to memcached before
doing a 'get'.
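
A minimal sketch of that repro and of the behavior I'm after, using plain POSIX sockets rather than apr_memcache. The forked child is only a stand-in for memcached, and stalled_read_times_out is a hypothetical name. SO_RCVTIMEO is the POSIX receive timeout; APR's counterpart for an apr_socket_t is apr_socket_timeout_set, but whether apr_memcache can be made to apply it to its own connections is exactly what I'm asking.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a read from a SIGSTOPped peer timed
 * out, 0 if data unexpectedly arrived, -1 on a setup error. */
static int stalled_read_times_out(long timeout_ms)
{
    int sv[2];
    int result = -1;
    int status;
    pid_t child;
    struct timeval tv;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return -1;
    }

    child = fork();
    if (child < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (child == 0) {
        /* Stand-in server: echo each byte back to the client. */
        char c;
        close(sv[0]);
        while (read(sv[1], &c, 1) == 1) {
            if (write(sv[1], &c, 1) != 1) {
                break;
            }
        }
        _exit(0);
    }
    close(sv[1]);

    /* Per-operation receive timeout on the client side of the socket. */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == 0) {
        char req = 'g';
        char resp;

        /* Freeze the server, and wait until it has actually stopped. */
        kill(child, SIGSTOP);
        waitpid(child, &status, WUNTRACED);

        if (write(sv[0], &req, 1) == 1) {
            ssize_t n = recv(sv[0], &resp, 1, 0);
            if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                result = 1;   /* timed out instead of blocking forever */
            } else if (n >= 0) {
                result = 0;   /* got data or EOF, which is unexpected */
            }
        }
    }

    /* SIGKILL is delivered even to a stopped process. */
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    close(sv[0]);
    return result;
}
```

Calling stalled_read_times_out(200) should return once the 200 ms timeout expires. Without the setsockopt call, the recv would block until the child received SIGCONT.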

-Josh


On Thu, Sep 27, 2012 at 4:37 PM, Joshua Marantz <jmarantz@google.com> wrote:

> This helps a lot.  I think 600 seconds seems like a fine idle-reap timeout.
>
> I need to investigate why some lookups take a second or more.  Maybe
> there's a mutex contention on my end somewhere.
>
> Thanks!
> -Josh
>
>
>
> On Thu, Sep 27, 2012 at 2:08 PM, Jeff Trawick <trawick@gmail.com> wrote:
>
>> On Thu, Sep 27, 2012 at 1:55 PM, Joshua Marantz <jmarantz@google.com>
>> wrote:
>> > That one call-site is HTTP_24/src/modules/cache/mod_socache_memcache.c,
>> > right?  That was where I stole my args from.
>>
>> no, subversion
>>
>> > As the TCP/IP layer is a lower level abstraction than the apr_memcache
>> > interface, I'm still not clear on exactly what that means.  Does a
>> > value of 600 mean that a single multiget must complete in 600
>> > microseconds otherwise it fails with APR_TIMEUP?
>>
>> ttl only affects connections which are not currently used; it does not
>> control I/O timeouts
>>
>>
>> > That might explain the behavior I saw.
>> >
>> > I've now jacked that up by x1e6 to 600 seconds and I don't see
>> > timeouts, but I'm hoping someone can bridge the gap between the
>> > socket-level explanation and the apr_memcache API call.
>> >
>> > I was assuming that apr_memcache created the TCP/IP connection when I
>> > called apr_memcache_server_create, and there even 600 seconds seems too
>> > short.  Is the functionality more like it will create connections
>> > on-demand and leave them running for N microseconds, re-using the
>> > connection for multiple requests until TTL microseconds have elapsed
>> > since creation?
>>
>> create on demand
>> reuse existing idle connections when possible
>> when performing maintenance on the idle connections, clean up any
>> which were idle for N microseconds
>>
>> If a connection is always reused before it is idle for N microseconds,
>> it will live as long as memcached allows.
>>
>> > If that's the case then I guess that every 10 minutes one of my cache
>> > lookups may have high latency to re-establish the connection, is that
>> > right?  I've been histogramming this under load and seeing some long
>> > tail requests with very high latency.  My median latency is only 143us
>> > which is great.  My 90%, 95% and 99% are all around 5ms, which is fine
>> > as well.  But I've got a fairly significant number of long-tail lookups
>> > that take hundreds of ms or even seconds to finish, and one crazy theory
>> > is that this is all reconnect cost.
>> >
>> > It would be nice if the TTL were interpreted as a maximum idle time
>> > before the connection is reaped, rather than stuttering response-time on
>> > a very active channel.
>>
>> It is.  The ttl is interpreted by the reslist layer, which won't touch
>> objects until they're returned to the list.
>>
>> >
>> > This testing is all using a single memcached running on localhost.
>> >
>> > -Josh
>> >
>> >
>> > On Thu, Sep 27, 2012 at 11:24 AM, Jeff Trawick <trawick@gmail.com>
>> > wrote:
>> >>
>> >> On Thu, Sep 27, 2012 at 11:15 AM, Joshua Marantz <jmarantz@google.com>
>> >> wrote:
>> >> > On Thu, Sep 27, 2012 at 10:58 AM, Ben Noordhuis <info@bnoordhuis.nl>
>> >> > wrote:
>> >> >>
>> >> >>   If dlsym() is called with the special handle NULL, it is
>> >> >>   interpreted as a reference to the executable or shared object from
>> >> >>   which the call is being made.  Thus a shared object can reference
>> >> >>   its own symbols.
>> >> >>
>> >> >> And that's how it works on Linux, Solaris, NetBSD and probably
>> >> >> OpenBSD as well.
>> >> >
>> >> >
>> >> > Cool, thanks.
>> >> >>
>> >> >> > Do you have a feel for the exact meaning of that TTL parameter
>> >> >> > to apr_memcache_server_create?
>> >> >>
>> >> >> You mean what units it uses? Microseconds (at least, in 2.4).
>> >> >
>> >> >
>> >> > Actually what I meant was what that value is used for in the library.
>> >> > The phrase "time to live of client connection" confuses me.  Does it
>> >> > really mean "the maximum number of seconds apr_memcache is willing to
>> >> > wait for a single operation"?  Or does it mean *both*, implying that a
>> >> > fresh TCP/IP connection is made for every new operation, but will stay
>> >> > alive for only a certain number of seconds?
>> >>
>> >> TCP/IP connections, once created, will be retained for the specified
>> >> (ttl) number of seconds.  They'll be created when needed.
>> >>
>> >> The socket connect timeout is hard-coded to 1 second, and there's no
>> >> timeout for I/O.
>> >>
>> >> >
>> >> >
>> >> > It is a little disturbing from a module-developer perspective to have
>> >> > the meaning of that parameter change by a factor of 1M between
>> >> > versions.  Would it be better to revert the recent change and instead
>> >> > change the doc to match the current behavior?
>> >>
>> >> The doc was already changed to match the behavior, but I missed that.
>> >> The caller I know of used the wrong unit, and I'll submit a patch to
>> >> fix that in the caller, as well as revert my screw-up from yesterday.
>> >>
>> >> >
>> >> > -Josh
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Born in Roswell... married an alien...
>> >> http://emptyhammock.com/
>> >
>> >
>>
>>
>>
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>>
>
>
