ant-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 20103] - FileSet horrible performance when dir has huge number of subdirs
Date Wed, 04 Jun 2003 15:26:31 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=20103>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=20103

FileSet horrible performance when dir has huge number of subdirs

------- Additional Comments From e17x-o9j5@spamex.com  2003-06-04 15:26 -------
> 1) add support for FileList in tasks accepting FileSet
> this is a lot of work. In principle it is not a risky approach, but all tasks 
> having to do with <fileset> would be impacted.

If the performance problem were fixed, you could always use FileSets instead of
FileLists. IMO it is very confusing to have both of those concepts.

> This one can be changed, so that if no wildcards are present in the include 
> patterns, and the FileSet does not contain selectors, instead of recursively 
> listing the root directory to match each file or directory against includes, 
> excludes and selectors, the included files would be simply checked against 
> exclude patterns if any.

You should be able to scan the FileSet and produce the set of directories
that need to be scanned.

I thought that you could just take a lazy-load approach where you load
the directories on demand.  You would look at the base directories of each
selector and then only load the directories necessary.

**         => scan all subdirectories of the fileset dir
subdir/**  => scan all subdirectories of subdir
subdir     => scan only subdir
filename   => no directory scanning
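
As a rough illustration of the mapping above, here is a minimal Java sketch.
It is not Ant's DirectoryScanner code; the class PatternScope and its methods
are hypothetical names made up for this example, and it assumes patterns use
"/" as the separator with "*" and "?" as the only wildcard characters. Telling
"subdir" apart from "filename" would still need a check against the file
system, which the sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ant: works out how much of the tree
// a single include pattern forces us to scan.
public class PatternScope {

    /** Leading path segments that contain no wildcard ("" if there are none). */
    public static String literalPrefix(String pattern) {
        List<String> kept = new ArrayList<>();
        for (String segment : pattern.split("/")) {
            if (segment.contains("*") || segment.contains("?")) {
                break; // everything from here on needs a directory listing
            }
            kept.add(segment);
        }
        return String.join("/", kept);
    }

    /** True when the pattern has no wildcard, so it names one path directly. */
    public static boolean isLiteral(String pattern) {
        return !pattern.contains("*") && !pattern.contains("?");
    }

    /** True when the pattern needs a recursive scan below its literal prefix. */
    public static boolean needsRecursiveScan(String pattern) {
        return pattern.contains("**");
    }
}
```

With this, "subdir/**" gives the prefix "subdir" plus a recursive scan, while a
plain "filename" is literal and needs no listing at all.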

As the FileSet selectors are processed, the directories are scanned as
necessary. The results would be cached so that if other selectors require
those directories, they don't need to be rescanned.
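
The caching part could look something like the sketch below. Again this is
only an assumption about how it might be done, not existing Ant code: the
class LazyDirCache and the diskReads counter are hypothetical and exist only
to show that a second request for the same directory does not touch the disk.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Ant: lists a directory the first time
// it is asked for and reuses that listing afterwards.
public class LazyDirCache {
    private final Map<File, String[]> listings = new HashMap<>();
    private int diskReads = 0;

    /** Returns the entry names in dir, reading from disk only on first use. */
    public String[] list(File dir) {
        String[] cached = listings.get(dir);
        if (cached == null) {
            diskReads++;
            String[] names = dir.list(); // null if dir is not a readable directory
            cached = (names == null) ? new String[0] : names;
            listings.put(dir, cached);
        }
        return cached;
    }

    /** Number of times a directory was actually read from disk. */
    public int diskReads() {
        return diskReads;
    }
}
```

Each pattern would call list() only for the directories it needs, so a FileSet
made up of plain file names would never trigger a listing.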

I'm not familiar with the exclude selectors. Is it possible to exclude using
a wildcard pattern? That might require directories to be scanned also.
