maven-repo-maintainers mailing list archives

From "Brian E. Fox" <bri...@reply.infinity.nu>
Subject Artifactory 1.3 has been DOS'ing the Central repository
Date Fri, 28 Nov 2008 12:50:17 GMT
Since approximately mid-August, the load on Central has been growing at
an exponential rate. You may have noticed slowdowns or dropped
connections recently as a side effect. We first had issues with Apache
HTTPD load increasing beyond the capacity of the machine. We switched
over to Nginx
(http://blogs.sonatype.com/people/brian/2008/10/29/nginx-is-centrals-new-friend/)
and this resolved the load, but then the 100 Mbps connection
was regularly becoming saturated. Every hour, on the hour, for about 20
minutes, around the clock, the connection would max out and then return
to about 50% utilization. We spent many days working with Contegix to
diagnose the problem, but no single source stood out immediately.

 

Yesterday we finally discovered that nearly all the traffic, both the
hourly spikes and the 50% background traffic, is being caused by
downloads of nexus-index.zip. After investigating the various tools
that use this data, we have concluded that Artifactory has a critical
bug (apparently since June:
http://issues.jfrog.org/jira/browse/RTFACT-390) that is causing every
1.3 instance to repeatedly download the 27 MB zip file. We found many
cases of a single IP downloading the index more than 1000 times a day!
In the Artifactory config, the indexer is set up as follows:

 

    <!-- The cron definition to control the activation of the m2eclipse indexer. -->
    <indexer>
        <!-- By Default index every 5 hours -->
        <cronExp>0 0 /5 * * ?</cronExp>
    </indexer>

 

(This is Quartz cron syntax, in which the fields begin "s m h ...", i.e.
seconds, minutes, hours.)

 

This by itself wouldn't be a huge issue except for the fact that
Artifactory ignores the index.properties file, which contains the last
update timestamp, AND doesn't first issue a HEAD request to check the
timestamp. This means that every Artifactory 1.3 instance is grabbing
this 27 MB file at least every 5 hours (we can't explain why certain IPs
are doing it 1000+ times a day; perhaps the config was modified there or
some other scheduling issue is present).
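
To illustrate the kind of check described above, here is a minimal
sketch of asking the server for the index's Last-Modified time with a
HEAD request and downloading only when the remote copy is newer. This is
not Artifactory's or the Nexus Indexer's actual code; the class and
method names are hypothetical, and it assumes the server sends a
Last-Modified header. Only standard JDK classes are used.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Sketch only: decide whether a remote index needs to be re-downloaded. */
class IndexUpdateCheck {

    /**
     * Returns true when the remote copy is strictly newer than the local one.
     * A remote value of 0 means "header absent"; this sketch then falls back
     * to downloading, which is an assumption and not a rule from the post.
     */
    static boolean shouldDownload(long remoteLastModifiedMillis, long localTimestampMillis) {
        if (remoteLastModifiedMillis == 0L) {
            return true;
        }
        return remoteLastModifiedMillis > localTimestampMillis;
    }

    /** Issues a HEAD request and returns Last-Modified in epoch millis (0 if unknown). */
    static long fetchRemoteLastModified(String indexUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(indexUrl).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD"); // headers only, the zip body is not transferred
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(10_000);
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return 0L;
            }
            return conn.getLastModified(); // 0 when the header is missing
        } finally {
            conn.disconnect();
        }
    }
}
```

A caller would compare the result of fetchRemoteLastModified(...) with
the timestamp it stored after its last successful download and skip the
transfer whenever shouldDownload(...) returns false.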

 

For reference, the index on Central is only updated once a week, on
Sundays.

 

To protect the Maven community from ongoing troubles, we have had to
take the extraordinary step of blocking all downloads of the index file
by Artifactory until this is resolved. Upon doing this, the traffic has
fallen to 10% of what it has been in the recent past. If you are using
Artifactory, please adjust this cron definition to run only weekly and
save yourself and us tons of wasted bandwidth and money.
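
As a rough back-of-the-envelope illustration of the savings, the sketch
below compares the per-instance download volume of the default schedule
with a weekly one. It assumes that the hour field "/5" fires at hours 0,
5, 10, 15 and 20 each day and that the index stays at 27 MB; the class
and method names are hypothetical. A weekly Quartz expression could look
like "0 0 3 ? * MON" (an example value, not one taken from the
Artifactory documentation).

```java
/** Sketch only: per-instance bandwidth for different index download schedules. */
class IndexBandwidth {
    static final int INDEX_SIZE_MB = 27; // size of the index zip quoted in the post

    /** Number of hours in one day (0..23) matched by a step of stepHours starting at 0. */
    static int firesPerDay(int stepHours) {
        int count = 0;
        for (int hour = 0; hour < 24; hour += stepHours) {
            count++;
        }
        return count;
    }

    /** Total megabytes transferred per week for a given number of weekly downloads. */
    static long weeklyMegabytes(int downloadsPerWeek) {
        return (long) downloadsPerWeek * INDEX_SIZE_MB;
    }

    public static void main(String[] args) {
        int defaultPerWeek = firesPerDay(5) * 7; // every-5-hours default schedule
        System.out.println("default schedule:   " + weeklyMegabytes(defaultPerWeek) + " MB/week");
        System.out.println("weekly schedule:    " + weeklyMegabytes(1) + " MB/week");
        System.out.println("1000 downloads/day: " + weeklyMegabytes(1000 * 7) + " MB/week");
    }
}
```

Under these assumptions a single instance on the default schedule pulls
35 copies a week where one would do, and a client fetching the file 1000
times a day pulls 7000.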

 

Note that other tools like the Nexus Maven Repository Manager
(http://nexus.sonatype.org), M2e and Q4e use the Nexus Indexer API,
are immune to this problem, and are not blocked from downloading the
index. A new version of Nexus and the Nexus Indexer API will be
published soon, along with M2e, that will leverage incremental indexes
to significantly reduce the download requirements and allow
near-real-time index updates.

 

Brian Fox

Apache Maven PMC

http://blogs.sonatype.org/people/brian

