Return-Path:
Delivered-To: apmail-xml-cocoon-dev-archive@xml.apache.org
Received: (qmail 16800 invoked by uid 500); 14 Jun 2002 18:08:27 -0000
Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Precedence: bulk
list-help:
list-unsubscribe:
list-post:
Reply-To: cocoon-dev@xml.apache.org
Delivered-To: mailing list cocoon-dev@xml.apache.org
Received: (qmail 16788 invoked from network); 14 Jun 2002 18:08:27 -0000
From: "Per Kreipke"
To:
Subject: RE: Speedup *DirectoryGenerator (e.g. ImageDirectoryGenerator et al)...
Date: Fri, 14 Jun 2002 14:10:43 -0400
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
In-Reply-To: <018b01c213bf$002521f0$0a00a8c0@vgritsenkopc>
Importance: Normal
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N

> *DirectoryGenerators should be refactored so we have the only
> DirectoryGenerator with pluggable 'processors' of different file types.
> This way, you will be able to generate listings of different files of
> type in one directory.

That's a great idea but more grandiose. It certainly would be neat if
you could (use POI to) extract metadata from MS Office files, etc. I
imagine there are actually code libraries out there for all kinds of
'file introspection' or generating metadata from files.

> > - having getSize() call getFileType() and then getJpegSize() or
> > getGifSize(), introduces nice modularity but sacrifices speed. Each
> > function in that sequence calls (that's two calls total):
> >
> >     new BufferedInputStream(new FileInputStream(file));
> >
> > Instead, instantiate the BufferedInputStream in getSize() and pass it
> > to the other functions. Or move the work from getFileType() and
> > get*Size() back in to getSize().
>
> Instantiate one instance of RandomAccessFile and pass it to 'processor'.

Ok. Is this re: the pluggable framework you mentioned above, or does
it apply to the current code too?

> > - more importantly, caching the information from getSize() plus
> > 'lastModified' in an internal hash table with the file's URL as key
> > would remove the need to do the expensive work each time. If the file
> > hasn't changed, then its size (or MP3 info) hasn't either.
>
> Cache key should be directory name plus settings, such as depth and
> masks.
>
> Cache validity should be TimestampCacheValidity (FileTimeStampValidity
> in Cocoon 2.1) of all files selected by given depth/masks in this
> directory.

I think you missed my point: those suggestions apply to caching the
entire result, no?

I'm not trying to cache the entire result, for reasons listed in the
thread: "Cachability (was RE: XInclude Transformer vs CInlude
Transformer". I'm just trying to cache each file's metadata
individually. E.g.:

key                      (lastModified, width, height)
d:\files\per\foo.jpeg:   (123456789, 100, 50)
d:\files\per\bar.gif:    (987654321, 200, 100)

Since the lastModified date is already computed by DirectoryGenerator,
it knows whether or not to dive into the file to re-get the metadata.

This is a precursor to your plug-in architecture too: there's no reason
to re-get the info if the file hasn't been modified.

> > Unfortunately, I don't know Cocoon well enough to understand if
> > Generators are global instances (so that all requests will share the
> > hash table) or whether it exists per pipeline, per sitemap, etc. My
> > point: I'm not sure how to implement the cached info correctly.
>
> Implement generateKey and generateValidity methods.

Right, but that's only for caching the entire results.

Per

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org
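
P.S. To make the per-file idea above concrete, here is a rough sketch in
Java. It is only an illustration under assumptions: ImageInfo,
FileMetadataCache and SizeReader are hypothetical names, not existing
Cocoon classes, and SizeReader merely stands in for the work getSize()
does today. The lookup is synchronized on the assumption that one
generator instance might be shared by several requests; whether that is
actually the case in Cocoon is the open question raised above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache entry: (lastModified, width, height) for one file. */
final class ImageInfo {
    final long lastModified;
    final int width;
    final int height;

    ImageInfo(long lastModified, int width, int height) {
        this.lastModified = lastModified;
        this.width = width;
        this.height = height;
    }
}

/** Hypothetical per-file metadata cache keyed by file path or URL. */
final class FileMetadataCache {

    /** Stand-in for the expensive read that getSize() performs. */
    interface SizeReader {
        /** Returns {width, height} for the file at the given path. */
        int[] readSize(String path);
    }

    private final Map<String, ImageInfo> entries = new HashMap<>();

    /**
     * Returns the cached entry if the file's lastModified value is
     * unchanged; otherwise calls the reader and replaces the entry.
     */
    synchronized ImageInfo get(String path, long lastModified, SizeReader reader) {
        ImageInfo cached = entries.get(path);
        if (cached != null && cached.lastModified == lastModified) {
            return cached; // file not modified: skip the expensive read
        }
        int[] size = reader.readSize(path);
        ImageInfo fresh = new ImageInfo(lastModified, size[0], size[1]);
        entries.put(path, fresh);
        return fresh;
    }
}
```

The caller would pass in the lastModified value that DirectoryGenerator
has already computed, so a cache hit costs one map lookup and no file
I/O. The sketch never evicts entries; a real version would probably need
to drop entries for files that have been deleted.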