trafficserver-users mailing list archives

From Sudheer Vinukonda <>
Subject Re: thundering herd best practises
Date Fri, 10 Jul 2015 02:36:37 GMT
I've updated the settings and the feature description in the relevant places. Also, it looks
like these are available in 6.0.0 (and are not in 5.3.x).
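For anyone else hunting for the knobs behind the Fuzzy Revalidation section Miles quoted, these are the relevant records.config settings (a sketch; the values shown are the documented defaults as I recall them, so double-check against your docs version):

```
# Fuzzy revalidation: probabilistically revalidate an object shortly
# BEFORE it goes stale, so one client refreshes it for everyone.
# Window (seconds) before expiry in which early revalidation may happen:
CONFIG proxy.config.http.cache.fuzz.time INT 240
# Per-request probability of triggering an early revalidation in that window:
CONFIG proxy.config.http.cache.fuzz.probability FLOAT 0.005
# Lower bound on how early the fuzz window may open:
CONFIG proxy.config.http.cache.fuzz.min_time INT 0
```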



     On Thursday, July 9, 2015 10:44 AM, Miles Libbey <> wrote:

 Thanks Sudheer - I read through the comments in TS-3549, but I don't grok what we are supposed
to do in ATS 5.3.x+ to get the almost-Stale-While-Revalidate behavior configured. Seems like
this would be a great place to modify -- HTTP Proxy Caching — Apache Traffic Server 6.0.0 documentation
(and probably also need to add any new options to records.config — Apache Traffic Server 6.0.0 documentation).


     On Thursday, July 9, 2015 7:57 AM, Sudheer Vinukonda <> wrote:

 There's no way to completely avoid multiple concurrent requests to the origin without using
something like an SWR (Stale-While-Revalidate) solution. You may want to take a look at Stale-While-Revalidate-in-the-core.
ATS 5.3.x+ supports an almost-SWR-like solution with TS-3549. A complete SWR solution (in
the core ATS) is planned to be implemented with [TS-3587] Support stale-while-revalidate
in the core - ASF JIRA. There are a number of timers and other settings that are relevant
to the issues you mentioned (e.g., TS-3622).
If you absolutely do not care about latency, you may try the existing stale-while-revalidate plugin.
I've not used it myself (we have a more efficient internal version of the same plugin), but
I've heard that the plugin doesn't work as desired.
(PS: you may need to be careful: with read-while-writer, we've experienced requests taking
longer than 60 seconds without the above optimizations, which is absolutely ridiculous for any
kind of request, let alone the HLS use case.)
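For context on what SWR means here: stale-while-revalidate is the RFC 5861 Cache-Control extension, where the origin declares how long a stale copy may still be served while the cache refetches in the background. For the 10-second-playlist case it might look like this (the numbers are illustrative, not a recommendation):

```
Cache-Control: max-age=10, stale-while-revalidate=30
```

The response is fresh for 10 seconds; for the next 30 seconds the cache may answer clients immediately from the stale copy while a single background request revalidates it with the origin, which is exactly the collapsing behavior being discussed in this thread.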


     On Thursday, July 9, 2015 4:17 AM, Mateusz Zajakala <> wrote:

 Hi everyone,

I'd like to get some insight into how I can configure and fine-tune ATS to eliminate flooding
the origin server with requests on TCP_MISS, and to make sure I understand what I'm doing.

I hope this is the right place to ask :)

Case: we have an origin server serving HLS video chunks + playlists. What this means for ATS:
- we know that every request is cacheable
- expiry time for playlists is very short (10s), video chunks a little longer (this is set
by origin)
- we know the size of objects (1-2MB per video file)
- we do all of our caching in RAM 

We use ATS as reverse proxy with the following records config:
CONFIG proxy.config.http.cache.required_headers INT 0
- does this make ATS cache everything? 
CONFIG proxy.config.cache.enable_read_while_writer INT 1
- we don't want to wait until a chunk is fully served to one client; we want to serve clients in parallel
CONFIG proxy.config.http.background_fill_active_timeout INT 0
CONFIG proxy.config.http.background_fill_completed_threshold FLOAT 0.000000
- according to the docs, these allow a download into cache to finish even if the client that initiated it disconnects
CONFIG proxy.config.http.cache.max_open_read_retries INT 5
CONFIG proxy.config.http.cache.open_read_retry_time INT 100
- this is KEY - we need to have collapsed forwarding!
CONFIG proxy.config.cache.ram_cache.size INT 20G
- put everything in RAM
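One thing worth double-checking (an assumption on my part, based on the read-while-writer documentation for later ATS versions; I'm not sure how much of it applies to 3.0.4): read-while-writer is documented to require the writer to always finish and the object size to be uncapped, roughly this combination:

```
CONFIG proxy.config.cache.enable_read_while_writer INT 1
# The writer must always run to completion, hence background fill
# with no timeout and a zero completion threshold...
CONFIG proxy.config.http.background_fill_active_timeout INT 0
CONFIG proxy.config.http.background_fill_completed_threshold FLOAT 0.000000
# ...and no cap on cacheable object size (0 = unlimited)
CONFIG proxy.config.cache.max_doc_size INT 0
```

You already have the first three; max_doc_size is the one I'd verify on your build.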

All others are defaults. Now with these settings we are getting a respectable 99.1% hit ratio.
However, there are cases when increasing the number of incoming requests to ATS causes it to
flood the origin on TCP_MISS (origin responds with 200, so if-modified-since is not part of the request).

Now, I would imagine that setting max_open_read_retries + open_read_retry_time would make
ALL clients requesting a file (except the first one) wait until the first one retrieves the headers,
and that because of enable_read_while_writer they would then be served the retrieved file. However,
I'm seeing that sometimes within a window of 100ms or more there are multiple TCP_MISS entries and
origin server requests for the same file. I tried tweaking the values of the open_read timeout and
retries, but without success.
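As a back-of-envelope sanity check of those two settings (a sketch; the variable names are mine, not ATS's, and note that open_read_retry_time is in milliseconds): with the values above, a blocked reader only waits about 500 ms before giving up and going to the origin itself.

```python
# Rough model of the ATS open-read retry window (names are mine, not ATS's).
max_open_read_retries = 5      # proxy.config.http.cache.max_open_read_retries
open_read_retry_time_ms = 100  # proxy.config.http.cache.open_read_retry_time (ms)

# A reader that finds the object locked by a writer retries up to
# max_open_read_retries times, sleeping open_read_retry_time_ms between
# attempts, then falls through to the origin:
retry_window_ms = max_open_read_retries * open_read_retry_time_ms
print(retry_window_ms)  # 500

# With origin fills completing in ~10 ms, 500 ms looks ample on paper;
# misses slipping through inside that window suggest the first reader
# hasn't begun writing to cache yet when the retries run out.
fill_time_ms = 10
assert fill_time_ms < retry_window_ms
```

If the retries really are being exhausted, raising max_open_read_retries (at the cost of the extra latency you said you can tolerate) would widen that window.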

Request serving time on TCP_MISS is usually less than 10ms. We have a good link to origin.

My goal would be to have a "perfect" collapsed forwarding. I don't care about latency (I can
make client wait even 5s if necessary), but I don't want to hit origin. Is this possible?
Do I need to adjust the settings? Or is there some reason this cannot be achieved at a high
number of requests?

I would greatly appreciate any suggestions!


Ps. We are using CentOS 6 + the official EPEL 6 ATS 3.0.4 (ancient!) on a 40-core, 64-GB RAM machine
with 2x10Gbps eth. No observable load problems with >1K requests/s.


