www-infrastructure-issues mailing list archives

From "Nicholas Chammas (JIRA)" <j...@apache.org>
Subject [jira] [Created] (INFRA-10999) Usage guidelines + standards for mirrors
Date Thu, 24 Dec 2015 05:28:49 GMT
Nicholas Chammas created INFRA-10999:

             Summary: Usage guidelines + standards for mirrors
                 Key: INFRA-10999
                 URL: https://issues.apache.org/jira/browse/INFRA-10999
             Project: Infrastructure
          Issue Type: Improvement
          Components: Mirrors
            Reporter: Nicholas Chammas
            Priority: Minor

1. Are there any concrete guidelines on how best to access the Apache mirror network? If so,
where are they?

For example, I've gathered from scattered conversations here and there that:

  * The "correct" way to configure automated downloads from the Apache mirror network is to
use the `closer.lua` script to automatically select a close mirror, and download from that.
Don't hardcode a specific mirror into any scripts.
  * The `closer.cgi` script should not be used, as it has been superseded by the `closer.lua` script.
  * I can append `?asjson` (or `?as_json`) when querying `closer.lua` to get some detail about
the best mirror in JSON, which is useful if I want to parse and use that information in a

It was extremely time-consuming to piece together all this basic information about how to
be a good user of the Apache mirror network. It shouldn't be this hard to do the right thing.

Is there no central, maintained documentation I can reference to get this kind of information?

2. Does the Apache Software Foundation have any partnerships with a CDN like Fastly or with a large
company like Amazon to make it much easier and faster for users to download Apache software?

The Python community, for example, is offered hosting for Python packages by Fastly. I wonder
if Apache has ever considered seeking a similar kind of partnership (or sponsorship) to complement
its current mirror network.

3. I see that mirrors are checked for availability, but are there any rough requirements for
mirror performance? Something like, "You should not serve files slower than this"? I don't
mean anything crazy; just a lower limit on performance that should be easy to meet.

For example, the mirror located at works fine. It's up and it serves files.
However, it consistently takes 20-30 minutes to serve `hadoop-2.7.1.tar.gz`--a 200 MB file.
Is that OK? Other mirrors (like http://mirrors.gigenet.com/) serve this file in 5 minutes.
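
To put those timings in terms of throughput, here is a rough back-of-the-envelope calculation. It assumes the file is exactly 200 MB and that the download rate is constant; the function name is just for illustration.

```python
def throughput_mb_per_s(size_mb, minutes):
    """Average download rate in MB/s for a file of size_mb fetched in the given minutes."""
    return size_mb / (minutes * 60)

# Timings quoted above for a ~200 MB file.
for label, minutes in [("slow mirror, best case", 20),
                       ("slow mirror, worst case", 30),
                       ("faster mirror", 5)]:
    print(f"{label}: {throughput_mb_per_s(200, minutes):.2f} MB/s")
```

So the slow mirror works out to roughly 0.11-0.17 MB/s, against about 0.67 MB/s for the faster one. A lower limit on performance could be stated as a number of this kind.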

Because of the huge variance in mirror performance, I'm considering adding logic to my application
to query `closer.lua` but then check an "Apache mirror blacklist" to make sure I don't get
these extremely slow mirrors.
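
A minimal sketch of what that logic might look like. It assumes the JSON returned by `closer.lua` has a `preferred` URL and an `http` list of alternative mirrors; those field names, the hostnames, and the `pick_mirror` helper are all assumptions for illustration, not a documented interface. The sample response is hardcoded so the sketch runs without touching the network.

```python
import json
from urllib.parse import urlparse

# Hypothetical list of mirror hostnames I have found to be extremely slow.
SLOW_MIRRORS = {"slow.example.org"}

def pick_mirror(closer_response, blocked_hosts):
    """Return the first mirror URL whose host is not blocked, or None if all are."""
    # Assumed response shape: "preferred" is the suggested mirror,
    # "http" is a list of alternative mirrors.
    candidates = [closer_response.get("preferred")]
    candidates += list(closer_response.get("http", []))
    for url in candidates:
        if url and urlparse(url).hostname not in blocked_hosts:
            return url
    return None

# Stand-in for the parsed JSON response from closer.lua.
sample = json.loads("""
{
  "preferred": "http://slow.example.org/apache/",
  "http": [
    "http://slow.example.org/apache/",
    "http://fast.example.net/apache/"
  ]
}
""")

print(pick_mirror(sample, SLOW_MIRRORS))
```

With the sample data this skips the preferred mirror and falls back to `http://fast.example.net/apache/`. Having to maintain such a list in every client is exactly the kind of thing I'd rather not do, which is why I'm asking about performance requirements.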

