Now that we've established that the TTL passed into the server-create call is for reaping idle connections and not individual operation timeouts, I want to ask about timing out individual operations.

If memcached freezes, then it appears my calls to 'get' will block until memcached wakes up.  Is there any way to set a timeout for that call?

I can repro this in my unit tests by sending a SIGSTOP to memcached before doing a 'get'.
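
For illustration, here is a minimal sketch of what a per-operation read timeout looks like at the plain POSIX socket level. It is not apr_memcache code; `read_with_timeout` and its `-2` return value are hypothetical names invented for this example, and it assumes direct access to the socket's file descriptor.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of apr_memcache): wait up to timeout_ms
 * milliseconds for data on fd, then read it.
 * Returns the number of bytes read, 0 on EOF, -1 on error,
 * or -2 if nothing arrived before the timeout expired. */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    /* If a signal interrupts poll(), retry. For simplicity the retry
     * waits for the full timeout again rather than the remainder. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0)
        return -2;              /* peer stayed silent, e.g. a stopped server */
    if (rc < 0)
        return -1;              /* poll() itself failed */

    return read(fd, buf, len);  /* data or EOF is ready, so this won't block */
}
```

Called on a connection whose peer has been stopped with SIGSTOP, a helper like this would return -2 after the timeout instead of blocking indefinitely.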

-Josh


On Thu, Sep 27, 2012 at 4:37 PM, Joshua Marantz <jmarantz@google.com> wrote:
This helps a lot.  I think 600 seconds seems like a fine idle-reap timeout.

I need to investigate why some lookups take a second or more.  Maybe there's a mutex contention on my end somewhere.

Thanks!
-Josh



On Thu, Sep 27, 2012 at 2:08 PM, Jeff Trawick <trawick@gmail.com> wrote:
On Thu, Sep 27, 2012 at 1:55 PM, Joshua Marantz <jmarantz@google.com> wrote:
> That one call-site is HTTP_24/src/modules/cache/mod_socache_memcache.c,
> right?  That was where I stole my args from.

no, subversion

> As the TCP/IP layer is a lower level abstraction than the apr_memcache
> interface, I'm still not clear on exactly what that means.  Does a value of
> 600 mean that a single multiget must complete in 600 microseconds otherwise
> it fails with APR_TIMEUP?

ttl only affects connections which are not currently used; it does not
control I/O timeouts


> That might explain the behavior I saw.
>
> I've now jacked that up by x1e6  to 600 seconds and I don't see timeouts,
> but I'm hoping someone can bridge the gap between the socket-level
> explanation and the apr_memcache API call.
>
> I was assuming that apr_memcache created the TCP/IP connection when I called
> apr_memcache_server_create, and in that case even 600 seconds seems too short.  Is
> the functionality more like it will create connections on-demand and leave
> them running for N microseconds, re-using the connection for multiple
> requests until TTL microseconds have elapsed since creation?

create on demand
reuse existing idle connections when possible
when performing maintenance on the idle connections, clean up any
which were idle for N microseconds

If a connection is always reused before it is idle for N microseconds,
it will live as long as memcached allows.
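
The behavior described above can be modeled with a small sketch. This is a simplified, hypothetical model, not the apr_reslist implementation: the `idle_pool` type and `pool_*` functions are invented for illustration, connections are represented by integer ids, and maintenance is assumed to run on every acquire.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_MAX 8

/* Hypothetical model of an idle-connection list; not the apr_reslist API. */
typedef struct {
    int     conn_id[POOL_MAX];    /* ids of connections that are currently idle */
    int64_t idle_since[POOL_MAX]; /* time each one was returned, in microseconds */
    size_t  n_idle;
    int     next_id;              /* id to assign to the next new connection */
    int64_t ttl_us;               /* maximum idle time before reaping */
} idle_pool;

static void pool_init(idle_pool *p, int64_t ttl_us)
{
    p->n_idle = 0;
    p->next_id = 1;
    p->ttl_us = ttl_us;
}

/* Maintenance: drop connections idle for ttl_us or longer.
 * Returns the number of connections reaped. */
static size_t pool_reap(idle_pool *p, int64_t now_us)
{
    size_t kept = 0, reaped = 0;
    for (size_t i = 0; i < p->n_idle; i++) {
        if (now_us - p->idle_since[i] >= p->ttl_us) {
            reaped++;             /* a real pool would close the socket here */
        } else {
            p->conn_id[kept] = p->conn_id[i];
            p->idle_since[kept] = p->idle_since[i];
            kept++;
        }
    }
    p->n_idle = kept;
    return reaped;
}

/* Acquire: reuse an idle connection when possible, else create on demand. */
static int pool_acquire(idle_pool *p, int64_t now_us)
{
    pool_reap(p, now_us);
    if (p->n_idle > 0)
        return p->conn_id[--p->n_idle]; /* most recently returned connection */
    return p->next_id++;                /* "connect" a new one */
}

/* Release: the connection becomes idle and its idle clock starts now. */
static void pool_release(idle_pool *p, int conn, int64_t now_us)
{
    if (p->n_idle < POOL_MAX) {
        p->conn_id[p->n_idle] = conn;
        p->idle_since[p->n_idle] = now_us;
        p->n_idle++;
    }
    /* else: a real pool would close the surplus connection */
}
```

In this model the TTL clock only runs while a connection sits in the idle list, so a connection that is released and re-acquired within the TTL is never reaped, however old it is.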

> If that's the case then I guess that every 10 minutes one of my cache
> lookups may have high latency to re-establish the connection, is that right?
> I've been histogramming this under load and seeing some long tail requests
> with very high latency.  My median latency is only 143us which is great.
> My 90%, 95% and 99% are all around 5ms, which is fine as well.  But I've got
> a fairly significant number of long-tail lookups that take hundreds of ms or
> even seconds to finish, and one crazy theory is that this is all reconnect
> cost.
>
> It would be nice if the TTL were interpreted as a maximum idle time before
> the connection is reaped, rather than stuttering response-time on a very
> active channel.

It is.  The ttl is interpreted by the reslist layer, which won't touch
objects until they're returned to the list.

>
> This testing is all using a single memcached running on localhost.
>
> -Josh
>
>
> On Thu, Sep 27, 2012 at 11:24 AM, Jeff Trawick <trawick@gmail.com> wrote:
>>
>> On Thu, Sep 27, 2012 at 11:15 AM, Joshua Marantz <jmarantz@google.com>
>> wrote:
>> > On Thu, Sep 27, 2012 at 10:58 AM, Ben Noordhuis <info@bnoordhuis.nl>
>> > wrote:
>> >>
>> >>   If dlsym() is called with the special handle NULL, it is interpreted as a
>> >>   reference to the executable or shared object from which the call is being
>> >>   made.  Thus a shared object can reference its own symbols.
>> >>
>> >> And that's how it works on Linux, Solaris, NetBSD and probably OpenBSD as
>> >> well.
>> >
>> >
>> > Cool, thanks.
>> >>
>> >> > Do you have a feel for the exact meaning of that TTL parameter to
>> >> > apr_memcache_server_create?
>> >>
>> >> You mean what units it uses? Microseconds (at least, in 2.4).
>> >
>> >
>> > Actually what I meant was what that value is used for in the library.  The
>> > phrase "time to live of client connection" confuses me.  Does it really mean
>> > "the maximum number of seconds apr_memcache is willing to wait for a single
>> > operation"?  Or does it mean *both*, implying that a fresh TCP/IP connection
>> > is made for every new operation, but will stay alive for only a certain
>> > number of seconds?
>>
>> TCP/IP connections, once created, will be retained for the specified
>> (ttl) number of seconds.  They'll be created when needed.
>>
>> The socket connect timeout is hard-coded to 1 second, and there's no
>> timeout for I/O.
>>
>> >
>> >
>> > It is a little disturbing from a module-developer perspective to have the
>> > meaning of that parameter change by a factor of 1M between versions.  Would
>> > it be better to revert the recent change and instead change the doc to match
>> > the current behavior?
>>
>> The doc was already changed to match the behavior, but I missed that.
>> The caller I know of used the wrong unit, and I'll submit a patch to
>> fix that in the caller, as well as revert my screw-up from yesterday.
>>
>> >
>> > -Josh
>> >
>>
>>
>>
>> --
>> Born in Roswell... married an alien...
>> http://emptyhammock.com/
>
>



--
Born in Roswell... married an alien...
http://emptyhammock.com/