From: Jeffrey E Care
To: "Ant Developers List"
Subject: Re: Performance of fileset related operations with a large number of files
Date: Thu, 19 Jan 2006 21:33:29 -0500
In-Reply-To: <8E08C7734517F147B0E60F065B5D7155012E054F@vcebe101.NOE.Nokia.com>
Mailing-List: contact dev-help@ant.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="=_alternative 000DCEB4852570FC_="

--=_alternative 000DCEB4852570FC_=
Content-Type: text/plain; charset="US-ASCII"

I don't profess to know the code as well as the committers, but I don't
see how this would be feasible.

First off, how would an iterative approach actually be faster?
DirectoryScanner (or its iterative equivalent) would still have to do the
same amount of work to select the files, so even though you're spreading
the time out over the entire operation, it should still take approximately
the same amount of time overall to perform the file selection. The only
way I can think of that an iterative approach would be faster is if you
could write a fileset iterator that had a background thread making the
selections, leaving the "main" thread to process the files.

Of course, BC is always a concern as well. We certainly could not get rid
of DirectoryScanner, or any of its public methods. Another BC concern is
that, IIRC, <copy> tells you how many files it's going to copy: an
iterative implementation could not do that; we'd have to add an extra
attribute to the copy task to use the iterative behavior, which would also
then likely mean entirely different blocks of logic...
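
To make the background-thread idea concrete, here is a minimal sketch of
what such an iterator might look like. Everything in it is an assumption
made for illustration: the class name BackgroundFileIterator, the bounded
queue and its capacity, and the use of java.nio.file.Files.walk as a
stand-in for DirectoryScanner's selection logic. None of it is existing
Ant API, and it does nothing to reduce the total selection work; it only
lets selection overlap with processing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Predicate;
import java.util.stream.Stream;

/** Hypothetical sketch, not Ant API: selects files on a background thread. */
final class BackgroundFileIterator implements Iterator<Path> {
    // End-of-scan marker; only ever compared by identity.
    private static final Path END = Paths.get("");

    private final BlockingQueue<Path> queue;
    private volatile Throwable failure; // set by the scanner thread on error
    private Path next;                  // element fetched but not yet returned

    BackgroundFileIterator(Path root, Predicate<Path> selector, int capacity) {
        // A bounded queue keeps memory use flat however many files match.
        final BlockingQueue<Path> q = new ArrayBlockingQueue<>(capacity);
        this.queue = q;
        Thread scanner = new Thread(() -> {
            try {
                try (Stream<Path> walk = Files.walk(root)) {
                    Iterator<Path> it = walk.filter(Files::isRegularFile)
                                            .filter(selector)
                                            .iterator();
                    while (it.hasNext()) {
                        q.put(it.next()); // blocks while the consumer is behind
                    }
                } catch (IOException | RuntimeException e) {
                    failure = e;          // reported to the consumer in hasNext()
                }
                q.put(END);
            } catch (InterruptedException e) {
                // Nothing in this sketch interrupts the scanner; just stop.
                Thread.currentThread().interrupt();
            }
        }, "fileset-scanner");
        scanner.setDaemon(true);
        scanner.start();
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = queue.take(); // waits for the scanner if necessary
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        if (next == END) {
            if (failure != null) {
                throw new IllegalStateException("scan failed", failure);
            }
            return false;
        }
        return true;
    }

    @Override
    public Path next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        Path result = next;
        next = null;
        return result;
    }
}
```

The "main" thread would simply loop over hasNext()/next() and copy each
file as it arrives. Note that a consumer written this way cannot know the
total number of files up front, which is exactly the BC problem with
<copy> mentioned above.
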
Maybe instead of such a massive refactoring (which, even if accepted,
would not be delivered in an official Ant driver for a long time) it
would be better to <exec> some native processes?

JEC

-- 
Jeffrey E. Care (carej@us.ibm.com)
WebSphere v7 Release Engineer
WebSphere Build Tooling Lead (Project Mantis)

wrote on 01/19/2006 07:18:02 PM:

> Hello,
>
> I have been trialling Ant as a driver for a large scale build execution.
> The preparation before the build involves copying and unzipping >100,000
> files spread across >20,000 directories. When using Ant's built in copy
> task with filesets selecting large parts of these files, a long time is
> spent building the list of files to copy, which also takes a lot of
> memory. This is my understanding of how Ant works with filesets after
> browsing the source.
>
> Is there any way to avoid this high memory usage and time spent building
> a list?
>
> Has there ever been any consideration of refactoring the way Ant
> processes filesets and similar constructs such that each selected file
> is processed once read in an iterative fashion, rather than building a
> complete list and then processing?
>
> thanks
>
> paul

--=_alternative 000DCEB4852570FC_=--