trafficserver-users mailing list archives

From Jeremy Payne <jp557...@gmail.com>
Subject Re: Trying to understand no-activity timeouts
Date Thu, 25 Jun 2020 16:55:54 GMT
even when using parent_is_proxy=false, parent timeouts follow:

proxy.config.http.parent_proxy.connect_attempts_timeout

tested this in my lab.

you may want to enable debug, then set your debug tags to:

parent|http
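
A minimal records.config sketch of that debug setup, assuming the standard diagnostics settings (proxy.config.diags.debug.enabled and proxy.config.diags.debug.tags) and the same CONFIG line format used elsewhere in this thread:

```
CONFIG proxy.config.diags.debug.enabled INT 1
CONFIG proxy.config.diags.debug.tags STRING parent|http
```

The tags value is treated as a regular expression, so parent|http should match both the parent-selection and HTTP state machine debug tags.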


On Wed, Jun 24, 2020 at 8:15 PM Nick Dunkin <Nick.Dunkin@vecima.com> wrote:
>
> Hi Jeremy
>
> Thanks.
>
> For our “sanity check” test we just have a single Origin in parent.config.  On the production system I believe we have two Origins.  Also for the “sanity check” test we have both “attempts” params set to 1, expecting the shortest possible timeout with no retries.
>
> I realize from your test description that I may need to clarify something.  We are using parent config here to support Origin failover, so we have parent_is_proxy set to false in our parent rules.
>
> So we have a Traffic Server instance with parent.config being used to specify Origin failover rules.
>
> Sorry I didn’t mention that at the top.
>
> Thanks
>
> Nick
>
> Sent from my iPhone
>
> On Jun 24, 2020, at 8:59 PM, Jeremy Payne <jp557198@gmail.com> wrote:
>
> 
> yeah..  how many parents are listed in parent.config ?
>
> also, how many per parent retries do you have configured ?
> this is what i have in my configuration.
> proxy.config.http.parent_proxy.total_connect_attempts 4
> proxy.config.http.parent_proxy.per_parent_connect_attempts 2
> proxy.config.http.parent_proxy.connect_attempts_timeout 3
> my test is this.
>
> child - parent - origin
>
> connectivity to child and parent works.
> on the origin i have a file that sleeps 20s before returning headers or any data.
>
> my client sends a request to the child.
> the child establishes a connection to the parent.
> child sends a GET to the parent.
> parent establishes a connection to the origin.
> parent sends a GET to the origin.
> origin doesn't return data in 3s.
> the child tears down the parent connection (TCP FIN), re-establishes a connection to the same parent, then sends a GET to the parent.
> parent re-establishes connection to the origin and parent sends a GET.
> again, no first-byte in 3s.
> child tears down parent connection.
> child now establishes another connection to another parent.. repeat.
> given my configuration this can last up to 12s.
>
> you may see the same if you run a tcpdump between child and parents.
> so the combination of the above parameters may give the impression
> the timeout is not working.
> but it's a combination of the timeout plus retries.
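
The 12s figure above can be reproduced with a rough sketch like the following. It assumes every attempt waits out the full connect_attempts_timeout and ignores connection setup time; the function name and the parent_count argument are hypothetical, introduced only for illustration, and this is not how Traffic Server itself computes anything.

```python
def worst_case_wait_seconds(total_connect_attempts: int,
                            per_parent_connect_attempts: int,
                            connect_attempts_timeout: int,
                            parent_count: int) -> int:
    """Upper bound on time spent retrying parents before giving up.

    Assumption: each attempt blocks for the full timeout (no first byte),
    and retries stop at whichever limit is hit first: the total attempt
    budget, or per-parent attempts exhausted on every listed parent.
    """
    attempts = min(total_connect_attempts,
                   per_parent_connect_attempts * parent_count)
    return attempts * connect_attempts_timeout


# Values from the configuration quoted above, with two parents listed.
print(worst_case_wait_seconds(4, 2, 3, 2))  # 12
```

With both "attempts" settings at 1 and a single parent, the same arithmetic gives one timeout period with no retries.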
>
>
>
>
>
>
>
>
>
> On Wed, Jun 24, 2020 at 4:51 PM Nick Dunkin <Nick.Dunkin@vecima.com> wrote:
>>
>> Hi Jeremy,
>>
>>
>>
>> Thank you.  I’m sure the team has one from a previous test, and if not I’ll produce another and provide it here.
>>
>>
>>
>> Thanks
>>
>>
>>
>> Nick
>>
>>
>>
>>
>>
>> From: Jeremy Payne <jp557198@gmail.com>
>> Reply-To: "users@trafficserver.apache.org" <users@trafficserver.apache.org>
>> Date: Wednesday, June 24, 2020 at 5:47 PM
>> To: "users@trafficserver.apache.org" <users@trafficserver.apache.org>
>> Subject: Re: Trying to understand no-activity timeouts
>> Resent-From: <Nick.Dunkin@ccur.com>
>>
>>
>>
>> do you have a packet trace between child and parent during this 20s ttfb ?
>>
>>
>>
>> i'll retest, but in 7.x 'proxy.config.http.parent_proxy.connect_attempts_timeout' usually applies to both connection setup and ttfb.
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jun 24, 2020 at 3:54 PM Nick Dunkin <Nick.Dunkin@vecima.com> wrote:
>>
>> Hi Jeremy,
>>
>>
>>
>> We are currently on 7.1.4
>>
>>
>>
>> This issue seems to only occur when the Origin server does nothing other than accept the connection.  We mocked up an Origin server and experimented with “trickling the data” versus “waiting before sending the first byte” and the latter seems to be where we can’t get the timeout behavior we desire.
>>
>>
>>
>> Many thanks
>>
>>
>>
>> Nick
>>
>>
>>
>> From: Jeremy Payne <jp557198@gmail.com>
>> Reply-To: "users@trafficserver.apache.org" <users@trafficserver.apache.org>
>> Date: Wednesday, June 24, 2020 at 3:57 PM
>> To: "users@trafficserver.apache.org" <users@trafficserver.apache.org>
>> Subject: Re: Trying to understand no-activity timeouts
>> Resent-From: <Nick.Dunkin@ccur.com>
>>
>>
>>
>> yes.. what version of ATS are you using ?
>>
>>
>>
>>
>>
>> On Wed, Jun 24, 2020 at 1:32 PM Nick Dunkin <Nick.Dunkin@vecima.com> wrote:
>>
>> Hi Jeremy,
>>
>>
>>
>> Thanks for the reply.
>>
>>
>>
>> We did try that, but it did not behave as we expected; we still experienced a long response.  Have you used this parameter with parent routing with predictable results?
>>
>>
>>
>> Cheers
>>
>>
>>
>> Nick
>>
>>
>>
>> From: Jeremy Payne <jp557198@gmail.com>
>> Reply-To: "users@trafficserver.apache.org" <users@trafficserver.apache.org>
>> Date: Wednesday, June 24, 2020 at 8:16 AM
>> To: "users@trafficserver.apache.org" <users@trafficserver.apache.org>
>> Subject: Re: Trying to understand no-activity timeouts
>>
>>
>>
>>
>>
>> try setting this parameter.
>>
>>
>>
>> proxy.config.http.parent_proxy.connect_attempts_timeout
>>
>>
>>
>>
>>
>> On Tue, Jun 23, 2020 at 12:33 PM Nick Dunkin <Nick.Dunkin@vecima.com> wrote:
>>
>> Hi,
>>
>>
>>
>> We are still dealing with a particular kind of no-activity timeout issue.
>>
>>
>>
>> We are dealing with an Origin that will occasionally take 20 seconds to return an HTTP 500 (annoying, right?).  We took a tcpdump and captured this occurring.  In the trace we can see the GET and the ACK, and then a full 20 seconds (approx) before the HTTP 500 comes back.  Please see the below picture.
>>
>>
>>
>> <image001.jpg>
>>
>>
>>
>> To be clear, apart from accepting the connection, the Origin Server sends NOTHING over the connection during the 20 seconds.
>>
>>
>>
>> Without Parent Routing
>>
>>
>>
>> This looks very much like something the Origin side “no-activity” timeouts should cater for, so we set both of the following (for good measure) to 2 seconds, but we still see exactly the same thing occurring.
>>
>>
>>
>> CONFIG proxy.config.http.transaction_active_timeout_out INT 2
>> CONFIG proxy.config.http.transaction_no_activity_timeout_out INT 2
>>
>>
>>
>> We managed to resolve this particular issue by adding the following configuration, which is a “timeout to first byte”.  Is this the correct configuration solution for dealing with this issue?
>>
>>
>>
>> CONFIG proxy.config.http.connect_attempts_timeout INT 2
>>
>>
>>
>> This all seems to make sense based on the available documentation.  So far so good.
>>
>>
>>
>> With Parent Routing
>>
>>
>>
>> However, when we enable parent routing, and put the same single Origin Server in parent.config, we DO NOT see the “timeout to first byte” being applied.  What are we missing about these timeouts and how they interact with parent routing?
>>
>>
>>
>> This all seems to hinge on the fact that the Origin server does not send a single byte for multiple seconds.  We see more predictable behavior if the Origin Server serves any data before the 20-second hang.
>>
>>
>>
>> Very grateful for any insight.
>>
>>
>>
>> Regards,
>>
>>
>>
>> Nick Dunkin
>>
>>
