ant-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nicolas Lalevée <nicolas.lale...@hibnet.org>
Subject Re: A performance issue in 'restrict'
Date Tue, 15 Jun 2010 12:52:32 GMT
On Monday 14 June 2010 20:01:08 Matt Benson wrote:
> Hi, Nicolas:
>
> [SNIP]
>
> > I looked into the code, it seems
> > org.apache.tools.ant.types.resources.Restrict is the culprit. It uses
> > BaseResourceCollectionWrapper which loads the entire underlying resource
> > collection.
> > I searched for the use of BaseResourceCollectionWrapper in Ant and I
> > think that this performance issue may also affect
> > org.apache.tools.ant.types.resources.SizeLimitCollection and
> > org.apache.tools.ant.types.resources.Tokens.
> >
> > I started to think about a fix but it seems non trivial. If I'm not
> > mistaken we would have to implement an iterator which should deal with an
> > optionnal cache and concurrency.
>
> While it would be possible to reimplement Restrict's internal wrapper
> such that it would dynamically calculate the included resources while
> iterating, I don't see any way to do the same when the *size* of the
> collection is requested...

Yep if size is called then the whole collection has to be "fetched".
I just hope that in my use case where I just want the first element of the 
restrict, the whole resource collection won't be fetched. And as I keep 
looking into the code, it seems that the implementation of 
org.apache.tools.ant.types.resources.First does that. So size() shouldn't be 
an issue here.

> > I am also worried about the isFilesystemOnly implementation in Restrict.
> > Should it return true if actually selected resources are files, or return
> > true if the entire set of candidates are files ? In the first case it
> > would make the iterator useless.
>
> Typically the implementation of #isFilesystemOnly() in
> BaseResourceCollectionWrapper and BaseResourceCollectionContainer is a
> quick check whether the underlying collection/s is/are known to be
> entirely filesystem-based.  If not, each resource is checked in turn
> until one is found that is *not* filesystem-based.  The reason for the
> distinction is to provide a means of recognizing whether it should be
> possible to handle a given resourcecollection using a task that can
> only work with files.

So I am suggesting a little change on the implementation: removing the check 
isFilesystemOnly on the underlying resource collection; just keep the 
iteration on the actual resources of the underlying collection.
I don't think it will change the semantic of the function, right?
About performance, in my use case it should behave better as the <first> will 
only load one resource in the collection.
I think there could be a performance impact where a resource collection is 
known to be file system only, and it is used with a task that cannot support 
that. The resources will be loaded but not actually used as the build will 
fail.
So it seems a choice between:
 * useless iteration on some resources which can have high latency
 * useless iteration on some resources which should be on a filsytem and 
trigger a failed build.

I prefer the second choice. I might not be impartial though ;)

Nicolas


>
> -Matt
>
> > Side note: I didn't checked that piece of script in as our Hudson
> > instance doesn't have Ant 1.8 installed. Not yet. I'll ask for it.
> >
> > Nicolas
> >
> > [1] http://ant.apache.org/manual/Types/resources.html#resourcelist
> >
> > PS: I started recently to intensively use loadresource and all the
> > resource related stuff in my build scripts, it is amazingly porwerful !
> > It reminds me when I was 'pipelining' in cocoon ;)
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
> > For additional commands, e-mail: dev-help@ant.apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
> For additional commands, e-mail: dev-help@ant.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Mime
View raw message