trafficserver-users mailing list archives

From Mateusz Zajakala <zajak...@gmail.com>
Subject Re: thundering herd best practises
Date Fri, 10 Jul 2015 16:19:41 GMT
Still, your comments are very helpful and much appreciated! Your
explanation is interesting, though it runs contrary to my expectations of
"open read retry".

Docs state:
"While an object is being fetched from the origin server, subsequent
requests would wait proxy.config.http.cache.open_read_retry_time
<http://trafficserver.readthedocs.org/en/latest/reference/configuration/records.config.en.html#proxy-config-http-cache-open-read-retry-time>
milliseconds before checking if the object can be served from cache. If the
object is still being fetched, the subsequent requests will retry
proxy.config.http.cache.max_open_read_retries
<http://trafficserver.readthedocs.org/en/latest/reference/configuration/records.config.en.html#proxy-config-http-cache-max-open-read-retries>
times."

So I'd expect the second Txn to see that there is a write lock (so the
object is being fetched) and WAIT - not go to origin. You say, however,
that the second Txn will succeed in obtaining the read lock (because the
"dirent" is available - what is a dirent?). This could explain the leakage,
but then I don't understand under what circumstances "open_read_retry"
would kick in (if at all)...

On Fri, Jul 10, 2015 at 6:07 PM, Sudheer Vinukonda <sudheerv@yahoo-inc.com>
wrote:

> Here's my understanding based on what I've noticed in my code reading and
> tests:
>
> When a request is received, the Txn (transaction) associated with it
> first tries a cache open read (basically, a simple lookup for the dirent).
> If the open read fails (on a cache miss), the Txn tries an open write
> (basically, it takes the write lock for the object) and goes on to the
> origin to download the object. At this point the dirent for the object is
> created and the write lock is held by this Txn.
>
> If a second request comes in at this point, the Txn associated with it
> tries an open read, and it doesn't fail (since the dirent is already
> available). However, at that point the object in cache is not yet in a
> state for read-while-writer to kick in. Without the write lock, the Txn
> then simply disables the cache and goes to the origin. The logic for a
> stale cache object is more or less similar.
>
> This is where the new feature "open_write_fail_action" comes into play, to
> either return an error or a stale copy (if one is available). We haven't
> experimented with max_open_write_retries, and perhaps that might make
> things better too.
>
>
> Thanks,
>
> Sudheer
>
> *Disclaimer: I'm *not* an expert on ATS cache internals, so, I could well
> be stating something that may not be entirely accurate.*
>
>
>
>
>   On Friday, July 10, 2015 8:37 AM, Mateusz Zajakala <zajakala@gmail.com>
> wrote:
>
>
> Thanks Sudheer!
>
> However, I'm still not sure about what happens under the hood. Let's say
> we have 2 clients requesting a file for the first time.
>
> 1) client 1, TCP_MISS, go to origin
> 2) very soon after - client 2, TCP_MISS. Now, if 1) already managed to get
> the headers, then we can serve the file (read-while-writer). But if NOT,
> then open read retry should kick in, so we wait retries x timeout (I tried
> setting it to as much as 20 x 200 ms; the exact settings are spelled out
> below this list). During this time 1) should finish downloading the file,
> or at least get the headers to allow read-while-writer.
> 3) the same scenario as in 2) should apply to any other incoming client
> requests for the same file.
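>
> For concreteness, the retry settings from 2) in records.config form, with
> the most aggressive values I tried (20 retries x 200 ms = a worst-case
> wait of 4 seconds before a request stops waiting):
>
> CONFIG proxy.config.http.cache.max_open_read_retries INT 20
> CONFIG proxy.config.http.cache.open_read_retry_time INT 200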
>
> Is this not the expected behaviour? Maybe I'm missing something, but it
> seems that after one connection starts retrieving origin data, the others
> should not repeat this. However, with very high loads I still see leakage
> of requests to origin, and I'm not sure how exactly this happens.
>
> Could it happen because client 2 arrives after client 1, but still before
> client 1 managed to open its session to origin, so "open read retry" does
> not kick in? I have no idea how synchronization is done between multiple
> requests for the same file, but I imagine one of them has to start reading
> first, and this info would be available to the others trying to read (and
> they would then be held by open_read_retry)?
>
>
>
>
> On Fri, Jul 10, 2015 at 5:12 PM, Sudheer Vinukonda
> <sudheerv@yahoo-inc.com> wrote:
>
> You may want to read through the below:
>
>
> https://docs.trafficserver.apache.org/en/latest/admin/http-proxy-caching.en.html#read-while-writer
>
> "*While some other HTTP proxies permit clients to begin reading the
> response immediately upon the proxy receiving data from the origin server,
> ATS does not begin allowing clients to read until after the complete HTTP
> response headers have been read and processed. This is a side-effect of ATS
> making no distinction between a cache refresh and a cold cache, which
> prevents knowing whether a response is going to be cacheable.*
>
> *As non-cacheable responses from an origin server are generally due to
> that content being unique to different client requests, ATS will not enable
> read-while-writer functionality until it has determined that it will be
> able to cache the object.*"
>
> As explained in that doc, read-while-writer doesn't kick in until the
> response headers for an object have been received and validated. For a
> live streaming scenario, even that tiny window is large enough (due to the
> large number of concurrent requests) to leak more than a single request to
> the origin, despite read-while-writer being enabled.
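>
> For reference, a minimal records.config sketch for read-while-writer, per
> the doc above (the background fill settings let the writer finish filling
> the cache even if the client that started the download disconnects; the
> same values appear in Mateusz's config further down the thread):
>
> CONFIG proxy.config.cache.enable_read_while_writer INT 1
> CONFIG proxy.config.http.background_fill_active_timeout INT 0
> CONFIG proxy.config.http.background_fill_completed_threshold FLOAT 0.000000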
>
> The open read retry settings do help to reduce this problem to a large
> extent, by retrying the cache read. There's also a setting,
> proxy.config.http.cache.max_open_write_retries, that can be tuned to
> further improve this situation.
>
>
> https://docs.trafficserver.apache.org/en/latest/admin/http-proxy-caching.en.html#open-read-retry-timeout
>
>
> However, despite all the above tuning, we still noticed multiple requests
> leaking (although significantly fewer than without the tuning). Hence the
> need for the new feature Open Write Fail Action
> <https://docs.trafficserver.apache.org/en/latest/reference/configuration/records.config.en.html#proxy-config-http-cache-open-write-fail-action>.
> With this setting, you can configure ATS to return a 502 error on a cache
> miss when there's an ongoing concurrent request for the same object. This
> lets the client (player) reattempt the request, by which time the original
> concurrent request will have filled the cache. With this feature, we no
> longer see TCP_MISS more than once at any given instant for the same
> object.
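>
> A records.config sketch of that, plus the write-retry knob mentioned
> earlier. The value 1 is my assumption for "return a 502 on a cache miss"
> (check the value table in your version's records.config docs), and the
> retry count is only illustrative:
>
> CONFIG proxy.config.http.cache.open_write_fail_action INT 1
> CONFIG proxy.config.http.cache.max_open_write_retries INT 2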
>
> Let me know if you have more questions.
>
>
> Thanks,
>
> Sudheer
>
>   On Friday, July 10, 2015 12:19 AM, Mateusz Zajakala <zajakala@gmail.com>
> wrote:
>
>
> Thanks for the explanation. While SWR does seem like a very useful
> feature, I don't think it can help in my specific case.
>
> In HLS the only object that expires often is the playlist manifest, which
> is very small (hundreds of bytes). I don't think we're having a problem
> with revalidation of these files. However, sometimes we see the origin
> flooded with requests for video segments (1-2 MB). These are never
> revalidations; according to squid.blog they are all TCP_MISS.
>
> Take for example the following log:
>
> 1436442291.878 60 10.10.99.112 TCP_MISS/200 668669 GET
> http://origin-server.example.com/ehls/video/20150703T123156-01-143602692.ts
> - DIRECT/origin-server.example.com video/m2pt -
> 1436442292.095 12 10.10.99.112 TCP_MISS/200 668669 GET
> http://origin-server.example.com/ehls/video/20150703T123156-01-143602692.ts
> - DIRECT/origin-server.example.com video/m2pt -
> 1436442292.133 17 10.10.99.112 TCP_MISS/200 668669 GET
> http://origin-server.example.com/ehls/video/20150703T123156-01-143602692.ts
> - DIRECT/origin-server.example.com video/m2pt -
>
> As you can see, we have three consecutive requests for the same file. Each
> of them takes a short time to process and they are separated in time, yet
> all of them are TCP_MISS. With my settings I'd expect a TCP_MISS on the
> first retrieval, and then clean TCP_HITs. And this is how it usually works
> (even with high loads); only once in a while do we see more requests
> getting through to origin. When this happens the origin slows down,
> processing time gets longer, more requests become TCP_MISS, and very soon
> we're killing the origin with enormous traffic.
>
> Is there any way to avoid this? Shouldn't open_read_retry take care of
> this?
>
> I'm quite new to ATS and caching in general, so correct me if I've
> misunderstood something.
>
> Thanks
> Mat
>
> On Fri, Jul 10, 2015 at 4:36 AM, Sudheer Vinukonda
> <sudheerv@yahoo-inc.com> wrote:
>
> I've updated the settings and the feature descriptions in the relevant
> places. Also, it looks like these are available in 6.0.0 (and not in
> 5.3.x).
>
>
> https://docs.trafficserver.apache.org/en/latest/reference/configuration/records.config.en.html#proxy-config-http-cache-open-write-fail-action
>
>
> https://docs.trafficserver.apache.org/en/latest/reference/configuration/records.config.en.html#proxy-config-cache-read-while-writer-max-retries
>
>
> https://docs.trafficserver.apache.org/en/latest/admin/http-proxy-caching.en.html#open-write-fail-action
>
> Thanks,
>
> Sudheer
>
>   On Thursday, July 9, 2015 10:44 AM, Miles Libbey <mlibbey@apache.org>
> wrote:
>
>
> Thanks Sudheer-
> I read through the comments in TS-3549
> <https://issues.apache.org/jira/browse/TS-3549>, but I don't grok what we
> are supposed to do in ATS 5.3.x+ to get the almost-Stale-While-Revalidate
> behavior configured. Seems like this would be a great place to modify: HTTP
> Proxy Caching — Apache Traffic Server 6.0.0 documentation
> <https://docs.trafficserver.apache.org/en/latest/admin/http-proxy-caching.en.html#reducing-origin-server-requests-avoiding-the-thundering-herd>
> (and probably also any new options in records.config — Apache Traffic
> Server 6.0.0 documentation
> <https://docs.trafficserver.apache.org/en/latest/reference/configuration/records.config.en.html>).
>
> miles
>
>
>
>
>   On Thursday, July 9, 2015 7:57 AM, Sudheer Vinukonda
> <sudheerv@yahoo-inc.com> wrote:
>
>
> There's no way to completely avoid multiple concurrent requests to the
> origin without using something like the SWR (Stale-While-Revalidate)
> solution. You may want to take a look at Stale-While-Revalidate in the core
> <https://cwiki.apache.org/confluence/display/TS/Stale-While-Revalidate+in+the+core>.
>
> ATS 5.3.x+ supports an almost-SWR like solution with TS-3549
> <https://issues.apache.org/jira/browse/TS-3549>. A complete SWR solution
> (in the core ATS) is planned to be implemented with [TS-3587] Support
> stale-while-revalidate in the core - ASF JIRA
> <https://issues.apache.org/jira/browse/TS-3587>. There are a number of
> timers and other settings that are relevant to the issues you mentioned
> (e.g. TS-3622 <https://issues.apache.org/jira/browse/TS-3622>).
>
> If you absolutely do not care about latency, you may try the existing
> stale-while-revalidate plugin
> <https://github.com/apache/trafficserver/tree/master/plugins/experimental/stale_while_revalidate>.
> I've not used it myself (we have a more efficient internal version of the
> same plugin), but I've heard that the plugin doesn't work as desired.
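>
> For reference, a sketch of how that would be wired up, assuming the usual
> conventions for experimental plugins (built with the
> --enable-experimental-plugins configure flag, then loaded globally via a
> line in plugin.config):
>
>   stale_while_revalidate.so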
>
> (PS: you may need to be careful - even with read-while-writer, we've seen
> requests taking 60+ seconds without the above optimizations, which is
> absolutely ridiculous for any kind of request, let alone the HLS use
> case.)
>
> Thanks,
>
> Sudheer
>
>   On Thursday, July 9, 2015 4:17 AM, Mateusz Zajakala <zajakala@gmail.com>
> wrote:
>
>
> Hi everyone,
>
> I'd like to get some insight into how I can configure and fine-tune ATS to
> eliminate flooding the origin server with requests on TCP_MISS, and to
> make sure I understand what I'm doing.
>
> I hope this is the right place to ask :)
>
> Case: we have an origin server serving HLS video chunks + playlists. What
> this means for ATS is:
> - we know that every request is cacheable
> - the expiry time for playlists is very short (10 s); for video chunks a
> little longer (this is set by the origin)
> - we know the size of the objects (1-2 MB per video file)
> - we do all of our caching in RAM
>
> We use ATS as a reverse proxy with the following records.config:
> CONFIG proxy.config.http.cache.required_headers INT 0
> - does this make ATS cache everything?
> CONFIG proxy.config.cache.enable_read_while_writer INT 1
> - we don't want to wait until a chunk is served to one client; we want to
> serve them in parallel
> CONFIG proxy.config.http.background_fill_active_timeout INT 0
> CONFIG proxy.config.http.background_fill_completed_threshold FLOAT 0.000000
> - according to the docs these allow the download to cache to finish even
> if the client that initiated it disconnects
> CONFIG proxy.config.http.cache.max_open_read_retries INT 5
> CONFIG proxy.config.http.cache.open_read_retry_time INT 100
> - this is KEY - we need to have collapsed forwarding!
> CONFIG proxy.config.cache.ram_cache.size INT 20G
> - put everything in RAM
>
> All others are defaults. Now with these settings we are getting a
> respectable 99.1% hit ratio. However, there are cases when increasing the
> number of incoming requests to ATS causes it to flood the origin on
> TCP_MISS (the origin responds with 200, so if-modified-since is not part
> of the request).
>
> Now, I would imagine that setting max_open_read_retries +
> open_read_retry_time would make ALL clients requesting a file (except the
> first one) wait until the first one retrieves the headers, and because of
> enable_read_while_writer they would then be served the retrieved file.
> However, I'm seeing in squid.blog that sometimes during 100 ms or more
> there are multiple TCP_MISS and origin server requests for the same file.
> I tried tweaking the open_read timeout and retry values but without
> success.
>
> Request serving time on TCP_MISS is usually less than 10ms. We have a good
> link to origin.
>
> My goal would be to have "perfect" collapsed forwarding. I don't care
> about latency (I can make the client wait even 5 s if necessary), but I
> don't want to hit the origin. Is this possible? Do I need to adjust the
> settings? Or is there some reason this cannot be achieved at a high number
> of requests?
>
> I would greatly appreciate any suggestions!
>
> Thanks
> Mateusz
>
> Ps. We are using CentOS 6 + EPEL 6 official ATS 3.0.4 (ancient!) on a
> 40-core, 64-GB RAM machine with 2x10 Gbps eth. No observable load problems
> with >1K requests/s.
>
