trafficserver-users mailing list archives

From: Leif Hedstrom <zw...@apache.org>
Subject: Re: Can ATS keep more than 500mbit/s traffic for single instance?
Date: Sun, 24 Nov 2013 17:44:56 GMT

On Nov 21, 2013, at 8:57 PM, Adam W. Dace <colonelforbin74@gmail.com> wrote:

> Also, once you've gotten past your immediate problem and are looking to deploy, my Wiki
> page may help:
> 
> https://cwiki.apache.org/confluence/display/TS/WebProxyCacheTuning


Some comments on this (thanks for collecting these tidbits!):

1. Fuzzy logic: As described here, this is not what it does at all. Fuzzy logic is there to allow
a client (by random chance) to go to origin *before* the object is stale in cache. The idea is
that, for reasonably active objects, you prefetch the object such that you always have it fresh
in cache.
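
If memory serves, these are the two knobs involved; a minimal records.config sketch, with what
I believe were the shipped defaults at the time:

    # Window (seconds) before expiry in which a request may go to origin early
    CONFIG proxy.config.http.cache.fuzz.time INT 240
    # Per-request probability of taking that early trip to origin
    CONFIG proxy.config.http.cache.fuzz.probability FLOAT 0.005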

2. Session sharing. I’m very worried about your “performance problems” with session
sharing set to “2”. In all respects, “2” should be better for performance on a reasonably
busy system. On a system with *very* few connections, setting it to “1” might be better,
but in such a setup, I’d run with a single net-thread anyways (see below, because I think
you didn’t configure that correctly).
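
Assuming the knob in question is proxy.config.http.share_server_sessions (the session-sharing
setting of this era), the recommended value would look like:

    # 2: per-thread session pools; generally the better choice on a busy system
    CONFIG proxy.config.http.share_server_sessions INT 2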

3. The “CPU cores” configuration is not what that setting means at all. It has nothing
to do with the CPUs. The default is 1.5 worker threads (net-threads) per CPU, and setting
just proxy.config.exec_thread.limit has no effect whatsoever without also setting
proxy.config.exec_thread.autoconfig to 0.
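
To actually pin the number of net-threads, both settings have to change together; a minimal
records.config sketch (the thread count here is just illustrative):

    # Turn off thread autoconfig so the explicit limit below takes effect
    CONFIG proxy.config.exec_thread.autoconfig INT 0
    # Number of worker (net) threads to run
    CONFIG proxy.config.exec_thread.limit INT 4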

I’d be curious to hear if you actually did see a difference when changing proxy.config.exec_thread.limit
as the document said, because it should have no impact. If you did indeed set the actual
number of net-threads to “1”, and noticed a difference in some behavior (performance,
stability, etc.), please file a bug on it.

4. HTTP connections: The config recommendations talk about the two pipelining configurations.
These actually have no effect on the server at all. In fact, they should be removed. See
https://issues.apache.org/jira/browse/TS-2077 for some details.

5. Background fill. This recommendation is wrong; it should be set to 0.0 for it to always
kick in. It allows the server to continue fetching / caching a large object even if the client
disconnects. This setting (with a value of 0.0) is a prerequisite for getting read-while-writer
to kick in.
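
Assuming the knobs here are proxy.config.http.background_fill_completed_threshold and the
read-while-writer toggle, a minimal records.config sketch of the combination:

    # 0.0: always continue a background fill once the client disconnects
    CONFIG proxy.config.http.background_fill_completed_threshold FLOAT 0.0
    # Let other clients read an object while it is still being written to cache
    CONFIG proxy.config.cache.enable_read_while_writer INT 1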

6. HTTP cache options. PLEASE, do not set this to 0 unless you own both the cache and the origin.
With “0”, everything is allowed to be cached unless it’s explicitly denied. This is a
direct violation of RFC2616, and in a forward proxy it will almost certainly break things where
more than one user is behind the cache. Set it to “1”, which can still break for poorly
behaved web sites (e.g. amazon.com used to break with it set to “1”, which is why we set
it to “2”).

The only times you should set this to “0” are a) if you are testing something, or b) you
own all the origin content, such that you know everything can be cached for arbitrary amounts
of time.
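
Assuming the setting being discussed is proxy.config.http.cache.required_headers (which matches
the 0/1/2 semantics above), the safe choice looks like:

    # 2: require an explicit lifetime (Expires or Cache-Control) before caching
    CONFIG proxy.config.http.cache.required_headers INT 2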

7. The description of proxy.config.http.cache.max_stale_age is not accurate. It has nothing
to do with flushing the cache; in fact, we never flush the cache. What this setting says is:
if an object is stale in cache, and you can *not* get a connection to the origin server, you
are allowed to serve the stale object from cache for this long. I would not change this from
the default, personally.
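
For reference, leaving it at what I believe is the shipped default (one week) would look like:

    # Max time (seconds) a stale object may be served when the origin is unreachable
    CONFIG proxy.config.http.cache.max_stale_age INT 604800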

8. Turning off proxy.config.http.cache.range.lookup (range requests) does not tell ATS that
it can’t use Range: requests. It completely disables Range: support for all clients. This
is almost never what you want :). Leave it on.
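
In other words, keep the default:

    # 1 (default): keep Range: support enabled
    CONFIG proxy.config.http.cache.range.lookup INT 1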

9. The proxy.config.http.cache.heuristic_max_lifetime description is mostly correct, except for
two things: 1) Nothing gets “flushed”; it merely says how long (at most) objects can be cached
under the heuristics rules. 2) It’s not a fixed setting; together with the “min” setting it
defines a range for the expire time, and the effective TTL will be a value between min and max,
based on how old the object is (based on Last-Modified). Since you set lm_factor=1.0, you
effectively set min == max == 3 months. This seems very aggressive, and counteracts how
the heuristics system is supposed to work: objects which change frequently should be cached
much shorter than those which change infrequently.
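
For comparison, the three knobs involved, with what I believe are the shipped defaults:

    # Clamp range (seconds) for heuristically computed freshness
    CONFIG proxy.config.http.cache.heuristic_min_lifetime INT 3600
    CONFIG proxy.config.http.cache.heuristic_max_lifetime INT 86400
    # Fraction of an object's age (Date - Last-Modified) used as its TTL
    CONFIG proxy.config.http.cache.heuristic_lm_factor FLOAT 0.10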

10. Your setting for proxy.config.cache.min_average_object_size seems wrong. If your average
object size is 32KB, you should set this to, ahem, 32KB :). However, to give some headroom,
my personal recommendation is to 2x the number of directory entries, so set the configuration
to 16KB.

The math is mostly correct, except the calculation for “Disk Cache Object Capacity” is in
fact the max number of directory entries the cache can hold. Each object on disk consumes
*at least* one directory entry, but can consume more (amc, what’s our current guideline
here?).
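
To make the headroom math concrete, a hypothetical sizing (the disk size here is made up purely
for illustration):

    # Hypothetical 512 GB disk cache, measured average object size of 32 KB:
    #   directory entries = cache size / min_average_object_size
    #   512 GB / 32768 bytes = ~16M entries (exact fit, no headroom)
    #   512 GB / 16384 bytes = ~32M entries (2x headroom, per the recommendation above)
    CONFIG proxy.config.cache.min_average_object_size INT 16384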

11. The text around proxy.config.cache.mutex_retry_delay is confusing. Setting this higher
would increase latency, not reduce it, at the expense of possibly consuming more CPU. If you
experience something different, I think it’d be worthwhile to file a Jira.

12. The description of proxy.config.hostdb.timeout is wrong. This setting is used in conjunction
with proxy.config.hostdb.ttl_mode. By default (0), the TTL from the DNS entry is obeyed, and
then hostdb.timeout has no meaning whatsoever. If set to 2 or 3, the timeout setting can be
used to override what the server said. In almost all cases, you should set proxy.config.hostdb.ttl_mode
to 0, and then there’s no reason to muck with hostdb.timeout.

Note that hostdb.timeout does *not* say anything about flushing or evicting entries from the
DNS cache. It can only be used to override what the TTL on the RR was.
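
In records.config terms, the sane setup would simply be:

    # 0 (default): obey the TTL from the DNS response; hostdb.timeout is ignored in this mode
    CONFIG proxy.config.hostdb.ttl_mode INT 0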



If our documentation disagrees with what I’m saying above, please file bugs against the
documentation.

Cheers,

— leif

