www-infrastructure-issues mailing list archives

From "Nicholas Chammas (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (INFRA-10999) Usage guidelines + standards for mirrors
Date Sat, 26 Dec 2015 19:45:49 GMT

     [ https://issues.apache.org/jira/browse/INFRA-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Chammas updated INFRA-10999:
-------------------------------------
    Description: 
1. Are there any concrete guidelines on how best to access the Apache mirror network? If so,
where are they?

For example, I've gathered from scattered conversations here and there that:

  * The "correct" way to configure automated downloads from the Apache mirror network is to
use the `closer.lua` script to automatically select a close mirror, and download from that.
Don't hardcode a specific mirror into any scripts.
  * The `closer.cgi` script should not be used as it is superseded by the `closer.lua` script.
  * I can append `?asjson` (or `?as_json`) when querying `closer.lua` to get some detail about
the best mirror in JSON, which is useful if I want to parse and use that information in a
script.
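
As a rough illustration of the points above, here is a minimal Python sketch of building a `closer.lua` query and turning the JSON reply into a download URL. Several things are assumptions rather than documented facts: the endpoint URL `https://www.apache.org/dyn/closer.lua`, the `preferred` and `path_info` keys in the reply, and the Hadoop path used in the example. The sample reply is hand-written, not captured from a real mirror query.

```python
import json
from urllib.parse import quote

# Assumed endpoint; the text above names only the script, not its full URL.
CLOSER_URL = "https://www.apache.org/dyn/closer.lua"


def build_query_url(path: str) -> str:
    """Return the closer.lua URL that asks for mirror details as JSON."""
    return f"{CLOSER_URL}/{quote(path.lstrip('/'))}?as_json"


def download_url_from_response(body: str) -> str:
    """Join the suggested mirror with the requested path.

    The "preferred" and "path_info" keys are assumptions about the JSON layout.
    """
    info = json.loads(body)
    return f"{info['preferred'].rstrip('/')}/{info['path_info'].lstrip('/')}"


# Hypothetical path and hand-written reply, for illustration only.
path = "hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz"
sample_body = json.dumps(
    {"preferred": "http://mirrors.gigenet.com/apache/", "path_info": path}
)

print(build_query_url(path))
print(download_url_from_response(sample_body))
```

In a real script the reply body would come from an HTTP GET on the URL returned by `build_query_url`; the parsing step is kept separate so it can be checked without network access.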

It was extremely time-consuming to piece together all this basic information about how to
be a good user of the Apache mirror network. It shouldn't be this hard to do the right thing.

Is there no central, maintained documentation I can reference to get this kind of information?

2. Does the Apache Software Foundation have any partnerships with a CDN like Fastly or with a large
company like Amazon to make it much easier and faster for users to download Apache software?

The Python community, for example, is offered hosting for Python packages by Fastly. I wonder
if Apache has ever considered seeking a similar kind of partnership (or sponsorship) to complement
its current mirror network.

3. I see that mirrors are checked for availability, but are there any rough requirements for
mirror performance? Something like, "You should not serve files slower than this"? I don't
mean anything crazy; just a lower limit on performance that should be easy to meet.

For example, the mirror located at 104.45.233.178 works fine. It's up and it serves files,
and `closer.lua` tells me it's close by. However, it consistently takes 20-30 minutes to serve
`hadoop-2.7.1.tar.gz`, a 200 MB file. Is that OK? Other mirrors (like http://mirrors.gigenet.com/)
serve this file in 5 minutes.
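
To put those timings in comparable terms, the short Python sketch below converts them into average throughput. It uses only the figures quoted above (a 200 MB file, 20-30 minutes versus 5 minutes); the function name is made up for illustration.

```python
def throughput_mb_per_s(size_mb: float, minutes: float) -> float:
    """Average transfer rate in MB/s for a download of size_mb taking `minutes`."""
    return size_mb / (minutes * 60)


SIZE_MB = 200  # approximate size of hadoop-2.7.1.tar.gz as quoted above

for label, minutes in [("slow mirror, best case", 20),
                       ("slow mirror, worst case", 30),
                       ("faster mirror", 5)]:
    print(f"{label}: {throughput_mb_per_s(SIZE_MB, minutes):.2f} MB/s")
# slow mirror, best case: 0.17 MB/s
# slow mirror, worst case: 0.11 MB/s
# faster mirror: 0.67 MB/s
```

So the slow mirror delivers roughly a quarter to a sixth of the faster mirror's rate, which is the kind of gap a published lower limit would catch.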

Because of the huge variance in mirror performance, I'm considering adding logic to my application
to query `closer.lua` but then check an "Apache mirror blacklist" to make sure I don't get
these extremely slow mirrors.
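
A minimal Python sketch of that idea follows. It assumes the application already has a list of candidate mirror base URLs (for example, taken from the `closer.lua` JSON reply) and keeps its own set of hosts to avoid. The function name, the `/apache/` path suffixes, and the contents of the set are hypothetical.

```python
from typing import Iterable, Optional, Set
from urllib.parse import urlparse


def pick_mirror(candidates: Iterable[str], blocked_hosts: Set[str]) -> Optional[str]:
    """Return the first candidate whose host is not blocked, or None if all are."""
    for url in candidates:
        host = urlparse(url).hostname
        if host is not None and host not in blocked_hosts:
            return url
    return None


# Hypothetical local list of hosts found to be too slow.
blocked = {"104.45.233.178"}

# Candidates in the order the mirror script suggested them (illustrative values).
candidates = [
    "http://104.45.233.178/apache/",
    "http://mirrors.gigenet.com/apache/",
]

print(pick_mirror(candidates, blocked))
```

Returning `None` when every candidate is blocked leaves it to the caller to decide whether to fall back to a blocked mirror or fail outright.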



> Usage guidelines + standards for mirrors
> ----------------------------------------
>
>                 Key: INFRA-10999
>                 URL: https://issues.apache.org/jira/browse/INFRA-10999
>             Project: Infrastructure
>          Issue Type: Improvement
>          Components: Mirrors
>            Reporter: Nicholas Chammas
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
