From Mike McNally
Subject Ant 1.7 build, high CPU use, possible solution
Date Mon, 23 Apr 2007 22:16:30 GMT
I've been using Ant to build a fairly large web application (about
twelve thousand .class files and various other things in a multi-phase
build) for several years.  We've generally been pretty happy with Ant
performance.  Recently, however, we noticed that running Ant 1.7 on
the same build configuration was noticeably slower and involved long
periods (many seconds each) of high CPU load.  After some poking
around, the <copy> task appeared to be incurring the load, which
seemed odd.

So today I ran a build with JMP (Java Memory Profiler) just out of
curiosity, after having downloaded and built the source.  At the point
of the large copy, I noticed that the "contains" method on
java.util.ArrayList was being called a very large number of times,
resulting in a tremendous number of calls to compare Resource instances.

It seems to me that the code in Union.getCollection() is the culprit.
For reasons unknown to me, it builds up its result with an obviously
O(n^2) loop to avoid duplicates in the result collection.  When I
experimentally replaced the "ArrayList" with a "LinkedHashSet", the
excessive CPU load disappeared completely.
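
A minimal sketch of the pattern described above, for illustration only.
It is not the actual Ant source: the class and method names are
hypothetical, and any element type stands in for Ant's Resource.  The
first method shows how a contains() check on an ArrayList makes the
loop quadratic; the second shows the LinkedHashSet variant, which keeps
insertion order and drops duplicates in roughly linear time.  The hash
based variant assumes the element type has a hashCode() consistent
with equals().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not taken from the Ant code base.
final class DedupSketch {

    // Quadratic: ArrayList.contains() scans the list linearly, so each
    // candidate is compared against every element collected so far.
    static <T> List<T> dedupWithList(Collection<? extends T> input) {
        List<T> result = new ArrayList<T>();
        for (T item : input) {
            if (!result.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    // Roughly linear: LinkedHashSet rejects duplicates via hashing and
    // remembers the order in which elements were first inserted.
    static <T> List<T> dedupWithLinkedHashSet(Collection<? extends T> input) {
        Set<T> unique = new LinkedHashSet<T>(input);
        return new ArrayList<T>(unique);
    }
}
```

Both methods return the same elements in the same order; only the cost
of the duplicate check differs.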

I realize that there might be some issue using LinkedHashSet, because
it's from a fairly recent Java version, but the code could also work by
just keeping an explicit java.util.HashSet around in parallel with the
ArrayList being assembled.
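
The parallel-HashSet alternative could look something like the sketch
below.  Again this is an assumption about how such a fix might be
written, not code from Ant; the names are hypothetical.  Written with
raw (non-generic) collections, the same approach would avoid the
dependency on LinkedHashSet, which was added in Java 1.4.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not taken from the Ant code base.
final class ParallelSetSketch {

    // The list preserves encounter order; the set exists only to answer
    // "have we seen this element already?" in roughly constant time.
    static <T> List<T> dedupWithParallelSet(Collection<? extends T> input) {
        List<T> result = new ArrayList<T>();
        Set<T> seen = new HashSet<T>();
        for (T item : input) {
            // Set.add() returns false if the element was already present.
            if (seen.add(item)) {
                result.add(item);
            }
        }
        return result;
    }
}
```

The extra set costs some memory, but each duplicate check no longer
scans the whole list.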

For comparison, a build on an already-built source tree (i.e., something
that just runs through the <copy> task without ending up doing anything)
takes 15 seconds with my "fixed" Ant 1.7, 45 seconds without the fix,
and 15 seconds with Ant 1.6.5.  (My machine is a fairly modern
dual-processor P4 with lots of memory, running Linux.)

(If this is a known and solved problem, my apologies; I can't find any
mention of it on the web.)

