pagespeed-dev mailing list archives

From Joshua Marantz <jmara...@google.com.INVALID>
Subject Re: pagespeed css request times out with 404 while optimizing
Date Thu, 12 Oct 2017 15:03:28 GMT
This bug report was sent to mod-pagespeed-discuss.  It references code from
instaweb_handler.cc:

void InstawebHandler::HandleAsPagespeedResource() {
  ....
  if (ResourceFetch::BlockingFetch(stripped_gurl_, server_context_, driver,
                                   callback)) {
   ...
  } else {
    server_context_->ReportResourceNotFound(original_url_, request_);
  }
  ...
}

BlockingFetch has an undocumented timeout (default 5 seconds, settable in
pagespeed.conf).  There is a TODO to document it...
   ModPagespeedBlockingFetchTimeoutMs 10000
would set it to 10 seconds.

If a .pagespeed. resource can't be fetched and rewritten within the timeout, a
404 is returned and the web page breaks.  For non-combined resources, we
should instead either redirect to the origin resource or serve the origin
resource directly with a private, 300-second TTL.  I think a temp redirect
would be easier to implement.
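
As a rough sketch of the redirect idea (this is not mod_pagespeed code:
Response and ServeOrRedirect are made-up names, the std::future stands in for
the rewrite callback, and 302 is just one choice of temp redirect):

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>

// Hypothetical response type, for illustration only.
struct Response {
  int status;
  std::string location;  // Set only for redirects.
  std::string body;      // Set only for 200 responses.
};

// Waits up to `timeout` for the rewritten resource.  If it is not ready in
// time, answers with a temporary redirect to the origin URL instead of a 404.
// Assumes the future comes from a promise or a running task, not a deferred
// std::async call (which wait_for would report as deferred, never ready).
Response ServeOrRedirect(std::future<std::string>& rewritten,
                         const std::string& origin_url,
                         std::chrono::milliseconds timeout) {
  assert(rewritten.valid());
  if (rewritten.wait_for(timeout) == std::future_status::ready) {
    return {200, "", rewritten.get()};
  }
  return {302, origin_url, ""};
}
```

This only covers the single-input case; the rewrite keeps running in the
background after the redirect is sent.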

For combined resources, I think we should have a separate timeout, with a
much higher default value.  Another option for CSS is to respond with a
body containing CSS @import statements for the components, but I'm not sure
if that's technically correct 100% of the time.  And for combined JS and
sprites that would be a lot harder.  So I think it might be better to plumb
in a higher timeout for combined resources, but ultimately we'd have to
respond with an error if the timeout is exceeded.  And there's a question
of rewritten/combined css files.  The outer-most pagespeed URL encoding
will look like it has a single input, but transitively it's a combined
resource and could not be solved with a redirect.
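
To make the @import option concrete, here is a minimal sketch of building
such a fallback body.  BuildImportFallback is a hypothetical helper, not an
existing function, and it assumes the component URLs are already resolved and
contain no characters that need escaping inside a quoted url():

```cpp
#include <string>
#include <vector>

// Builds a CSS body that pulls in each component of a combined resource
// with its own @import rule, in the original combination order.
// Assumption: each URL is safe to place inside url("...") unescaped.
std::string BuildImportFallback(const std::vector<std::string>& component_urls) {
  std::string css;
  for (const std::string& url : component_urls) {
    css += "@import url(\"";
    css += url;
    css += "\");\n";
  }
  return css;
}
```

This does not answer whether the result is equivalent to the combined file in
every case, and it does nothing for the transitive rewritten/combined case.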

I thought in fact we *had* code somewhere for serving origin content for
single-input .pagespeed. resources.  But maybe that was for a different
server?

@aroman two more questions:
   1.   You know your way around our code, and you have a great test case.
With our guidance, do you want to take a shot at doing the fix yourself?
   2.   Now that PageSpeed is in Apache incubation, the right mailing list
to subscribe to is dev@pagespeed.incubator.apache.org, which you can
subscribe to by sending mail to
         dev-subscribe@pagespeed.incubator.apache.org

-Josh

On Wed, Oct 11, 2017 at 10:47 PM, Joshua Marantz <jmarantz@google.com>
wrote:

> Nicely diagnosed.  This sounds like something I'm going to have to dig
> into.  Three questions:
>   1. Do you know if this issue is new for 1.12?  Code related to this
> changed, I think, between 1.11 and 1.12.
>   2. Can you work around this short term with:
>         ModPagespeedDisallow *megahuge.css
>   3. Would you put this into a bug report at
> https://github.com/pagespeed/mod_pagespeed/issues?
>
> Thanks,
> -Josh
>
> On Wed, Oct 11, 2017 at 5:48 PM, <aroman@webscalenetworks.com> wrote:
>
>> I've been hunting down 404 problems with css files for a while now and I
>> finally have it nailed down.  I originally thought it was related to
>> sharding (and to be sure, we had problems with our configuration there),
>> but now I'm pretty sure it's a bug in pagespeed:
>>
>> Version: 1.12.34.2
>>
>> While optimizing a css file that references many other images, pagespeed
>> is 404'ing subsequent requests for that resource.
>>
>> Specifically:
>> * Starting pagespeed with a clean cache.
>> * Make a request for A.megahuge.css.pagespeed.cf.Ut0KGaDYUK.css
>> * Pagespeed fetches the original megahuge.css and starts downloading and
>> optimizing the dependent resources.
>> * The original request for A.megahuge.css.pagespeed.cf.Ut0KGaDYUK.css
>> times out after a few ms and returns the original resource with a short
>> cache expiration.
>> --- so far, so good ---
>> * A second request for A.megahuge.css.pagespeed.cf.Ut0KGaDYUK.css comes
>> in.  That gets stuck waiting in ResourceFetch::BlockingFetch for the
>> callback to complete.
>> * Pagespeed continues working on optimizing the dependent resources from
>> the first request.
>> * After 5 seconds, the BoundedWaitFor call in
>> ResourceFetch::BlockingFetch gives up, and pagespeed returns a 404 (!) for
>> the resource.
>>
>> In this case, the css file takes several minutes to optimize, so we have
>> tons of these 404's until somebody gets lucky and the CDN caches the
>> optimized result.
>>
>> It's even worse if the css file takes so long to optimize that pagespeed
>> decides to refresh the content before serving the optimized result.
>>
>>
>> So, I've traced the code through and through to nail it down to
>> this BoundedWaitFor call.  How can I fix the problem?  I want the same
>> behavior as the original request: If the callback hasn't completed within a
>> few ms, then return the un-optimized resource.
>>
>> - Augusto
>>
>
>
