ant-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 20103] - FileSet horrible performance when dir has huge number of subdirs
Date Wed, 04 Jun 2003 15:02:51 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=20103>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=20103

FileSet horrible performance when dir has huge number of subdirs





------- Additional Comments From ddevienne@lgc.com  2003-06-04 15:02 -------
Here's my take on FileSet optimization, which matches the problem we are 
personally seeing here.

Problem: We have a large source base (5000+ files) which is scanned repeatedly 
to extract just a subset of these files, usually one or more subtrees of the 
fileset:

in somepath/src, we select:
  com/acme/foo/**
  com/acme/bar/**

The fileset must be declared with dir="somepath/src" to keep the proper 
relative filename (corresponding to the package name for a javac task), but 
obviously only somepath/src/com/acme/foo and somepath/src/com/acme/bar need to 
be scanned, and not the full somepath/src.

These repeated full scans of somepath/src to extract a subset of the sources 
are adding up to minutes...

Solution: I can think of two: a simple one, and a more difficult one.

Simple Solution: add a <searchroot name="com/acme/foo" /> subelement (which can 
be specified more than once) which explicitly tells the FileSet which directory 
it should scan, instead of the whole directory specified in the fileset's dir 
attribute.
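
To make the idea concrete, here is a minimal sketch of what scanning with 
explicit searchroots could look like. It is not Ant's DirectoryScanner and 
uses java.nio.file rather than any Ant API. The class name SearchRootScan, the 
method scan, and the assumption that each searchroot is given relative to the 
fileset's dir are all hypothetical. Only the listed subtrees are walked, but 
every returned name stays relative to the base dir, as a javac-style task 
would need.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration of the proposed <searchroot> behaviour.
final class SearchRootScan {

    /**
     * Walks only the given search roots (assumed relative to baseDir) and
     * returns the regular files found, as '/'-separated names relative to
     * baseDir, sorted alphabetically.
     */
    static List<String> scan(Path baseDir, List<String> searchRoots) throws IOException {
        List<String> result = new ArrayList<>();
        for (String root : searchRoots) {
            Path start = baseDir.resolve(root);
            if (!Files.isDirectory(start)) {
                continue; // a missing search root simply contributes nothing
            }
            try (Stream<Path> walk = Files.walk(start)) {
                walk.filter(Files::isRegularFile)
                    // relativize against baseDir, not against the search root
                    .map(p -> baseDir.relativize(p).toString().replace(File.separatorChar, '/'))
                    .forEach(result::add);
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

With dir="somepath/src" and the two searchroots com/acme/foo and com/acme/bar, 
the rest of somepath/src is never visited, yet a file is still reported under 
a name such as com/acme/foo/Some.java (a made-up file name).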

Harder Solution: Infer the searchroots from the patterns themselves... Not 
impossible, but difficult.
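
As a rough sketch of the inference, one could take the leading wildcard-free 
directory segments of each include pattern as its searchroot, and drop any 
root already covered by another. The class and method names below are 
hypothetical, and the sketch deliberately ignores excludes, selectors and case 
sensitivity, which is where the real difficulty lies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: infer search roots from include patterns only.
final class SearchRootInference {

    /** Returns the leading directory segments of a pattern that contain no wildcard. */
    static String literalPrefix(String pattern) {
        String normalized = pattern.replace('\\', '/');
        if (normalized.endsWith("/")) {
            normalized += "**"; // Ant treats a trailing separator as "/**"
        }
        String[] segments = normalized.split("/");
        List<String> prefix = new ArrayList<>();
        // The last segment is a file name or a wildcard, never a directory to scan.
        for (int i = 0; i < segments.length - 1; i++) {
            if (segments[i].contains("*") || segments[i].contains("?")) {
                break;
            }
            prefix.add(segments[i]);
        }
        return String.join("/", prefix); // "" means: scan the whole base dir
    }

    /** Infers search roots for all patterns, dropping roots nested inside another root. */
    static List<String> inferRoots(List<String> patterns) {
        Set<String> candidates = new LinkedHashSet<>();
        for (String p : patterns) {
            candidates.add(literalPrefix(p));
        }
        List<String> roots = new ArrayList<>();
        for (String c : candidates) {
            boolean covered = false;
            for (String other : candidates) {
                if (!other.equals(c) && (other.isEmpty() || c.startsWith(other + "/"))) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                roots.add(c);
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        System.out.println(inferRoots(Arrays.asList("com/acme/foo/**", "com/acme/bar/**")));
    }
}
```

For the two patterns above this yields com/acme/foo and com/acme/bar, while a 
pattern such as **/*.java yields the empty root, i.e. a full scan of dir.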

The advantage of the simple solution is that it works with the use of 
selectors: since it's explicit, the build file writer knows that the selectors 
s/he uses do not affect the searchroots.

This of course doesn't solve the other performance problem of FileSet, which 
occurs when it is used with a long list of explicit filenames without patterns.

Thanks for reading, --DD
