cocoon-dev mailing list archives

From "Vadim Gritsenko" <vadim.gritse...@verizon.net>
Subject RE: Speedup *DirectoryGenerator (e.g. ImageDirectoryGenerator et al)...
Date Fri, 14 Jun 2002 18:22:08 GMT
> From: Per Kreipke [mailto:per@onclave.com]
> 
> > *DirectoryGenerators should be refactored so we have only one
> > DirectoryGenerator with pluggable 'processors' for different file types.
> > This way, you will be able to generate listings of files of different
> > types in one directory.
> 
> That's a great idea but more grandiose. It certainly would be neat if you
> could (use POI to) extract metadata from MS Office files, etc. I imagine
> there are actually code libraries out there for all kinds of 'file
> introspection' or generating metadata from files.
> 
> > > - having getSize() call getFileType() and then getJpegSize() or
> > > getGifSize() introduces nice modularity but sacrifices speed. Each
> > > function in that sequence calls (that's two calls total):
> > >
> > >   new BufferedInputStream(new FileInputStream(file));
> > >
> > > Instead, instantiate the BufferedInputStream in getSize() and pass it
> > > to the other functions. Or move the work from getFileType() and
> > > get*Size() back into getSize().
> >
> > Instantiate one instance of RandomAccessFile and pass it to 'processor'.
> 
> Ok. This is re: the pluggable framework you mentioned above or does this
> apply to the current code too?

MP3 needs it: the TAG is in the tail... You don't want to read the *whole*
file, right? :)
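
To illustrate the point, here is a minimal sketch of 'processors' that share
one RandomAccessFile, seeking to wherever their data lives instead of
streaming the file. The class and method names (FileProbes, readId3v1Title,
readGifSize) are hypothetical and not part of Cocoon. It assumes the ID3v1
layout (the last 128 bytes, starting with "TAG", with the title in the next
30 bytes) and the GIF header layout (a 6-byte signature followed by
little-endian width and height).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Cocoon class: each probe seeks on a shared handle.
class FileProbes {
    private static final int ID3V1_LENGTH = 128;

    /** Returns the ID3v1 title, or null if the file carries no ID3v1 tag. */
    static String readId3v1Title(RandomAccessFile raf) throws IOException {
        long length = raf.length();
        if (length < ID3V1_LENGTH) {
            return null;
        }
        // Jump straight to the tail; nothing before it is read.
        raf.seek(length - ID3V1_LENGTH);
        byte[] tag = new byte[ID3V1_LENGTH];
        raf.readFully(tag);
        if (tag[0] != 'T' || tag[1] != 'A' || tag[2] != 'G') {
            return null;
        }
        // The title occupies bytes 3..32 and is padded with NULs or spaces.
        int end = 3;
        while (end < 33 && tag[end] != 0) {
            end++;
        }
        return new String(tag, 3, end - 3, StandardCharsets.ISO_8859_1).trim();
    }

    /** Returns {width, height} for a GIF, or null if the signature is absent. */
    static int[] readGifSize(RandomAccessFile raf) throws IOException {
        if (raf.length() < 10) {
            return null;
        }
        raf.seek(0);
        byte[] header = new byte[10];
        raf.readFully(header);
        String signature = new String(header, 0, 6, StandardCharsets.US_ASCII);
        if (!signature.equals("GIF87a") && !signature.equals("GIF89a")) {
            return null;
        }
        // Width and height are 16-bit little-endian values at offsets 6 and 8.
        int width = (header[6] & 0xFF) | ((header[7] & 0xFF) << 8);
        int height = (header[8] & 0xFF) | ((header[9] & 0xFF) << 8);
        return new int[] { width, height };
    }
}
```

The caller opens the file once, e.g. new RandomAccessFile(file, "r"), hands
the same handle to each probe in turn, and closes it afterwards.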


> > > - more importantly, caching the information from getSize() plus
> > > 'lastModified' in an internal hash table with the file's URL as key
> > > would remove the need to do the expensive work each time. If the file
> > > hasn't changed, then its size (or MP3 info) hasn't either.
> >
> > Cache key should be directory name plus settings, such as depth and
> > masks.
> >
> > Cache validity should be TimestampCacheValidity (FileTimeStampValidity
> > in Cocoon 2.1) of all files selected by given depth/masks in this
> > directory.
> 
> I think you missed my point, those suggestions apply to caching the entire
> result, no?

Yes, that's for caching the whole response.
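
As a rough sketch of that idea in plain Java (deliberately not using the
Cocoon caching classes, whose exact signatures are not shown here): the key
is built from the directory plus the settings, and validity is a snapshot of
the timestamps of every selected file. DirectoryListingValidity, cacheKey,
snapshot and isStillValid are hypothetical names, and treating the mask as a
regular expression on the file name is an assumption made for the example.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a key/validity pair covering a whole listing.
class DirectoryListingValidity {

    /** Key: directory name plus the settings that shape the listing. */
    static String cacheKey(File dir, int depth, String includeRegex) {
        return dir.getAbsolutePath() + "|depth=" + depth + "|include=" + includeRegex;
    }

    /** Validity: path -> lastModified for every file selected by depth/mask. */
    static Map<String, Long> snapshot(File dir, int depth, String includeRegex) {
        Map<String, Long> result = new TreeMap<>();
        collect(dir, depth, includeRegex, result);
        return result;
    }

    private static void collect(File dir, int depth, String includeRegex,
                                Map<String, Long> out) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (child.isDirectory()) {
                if (depth > 1) {
                    collect(child, depth - 1, includeRegex, out);
                }
            } else if (child.getName().matches(includeRegex)) {
                out.put(child.getAbsolutePath(), child.lastModified());
            }
        }
    }

    /** The cached response is valid only if no selected file was added, removed or touched. */
    static boolean isStillValid(Map<String, Long> cached, File dir, int depth,
                                String includeRegex) {
        return cached.equals(snapshot(dir, depth, includeRegex));
    }
}
```

Note that a check like this still walks the directory on every request; it
only saves regenerating the response.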


> I'm not trying to cache the entire result for reasons listed in the thread:
> "Cachability (was RE: XInclude Transformer vs CInlude Transformer". I'm just
> trying to cache each file's metadata individually.
> 
> E.g.:
> 
> key (lastModified, width, height)
> 
> d:\files\per\foo.jpeg: (123456789, 100, 50)
> d:\files\per\bar.gif: (987654321, 200, 100)
> 
> Since the lastModified date is already computed by DirectoryGenerator, it
> knows whether or not to dive into the file to re-get the metadata. This is a
> precursor to your plug-in architecture too: there's no reason to re-get the
> info if the file hasn't been modified.

I see. Then use Store. See XSLProcessor for an example of a component which
uses Store for its own purposes.
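
The per-file scheme described in the quote above could look roughly like
this. It is only a sketch: a plain ConcurrentHashMap stands in for Store
(the Store API itself is not shown), and FileMetadataCache and its extractor
function are hypothetical names. An entry is reused only while the file's
lastModified matches the value recorded with it.

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical per-file metadata cache; the map is a stand-in for Store.
class FileMetadataCache<M> {

    /** Cached metadata together with the timestamp it was computed for. */
    private static final class Entry<M> {
        final long lastModified;
        final M metadata;

        Entry(long lastModified, M metadata) {
            this.lastModified = lastModified;
            this.metadata = metadata;
        }
    }

    private final Map<String, Entry<M>> entries = new ConcurrentHashMap<>();
    private final Function<File, M> extractor;

    /** The extractor does the expensive work, e.g. reading image dimensions. */
    FileMetadataCache(Function<File, M> extractor) {
        this.extractor = extractor;
    }

    M get(File file) {
        String key = file.getAbsolutePath();
        long lastModified = file.lastModified();
        Entry<M> cached = entries.get(key);
        if (cached != null && cached.lastModified == lastModified) {
            return cached.metadata; // file unchanged: skip the expensive read
        }
        M fresh = extractor.apply(file);
        entries.put(key, new Entry<>(lastModified, fresh));
        return fresh;
    }
}
```

Two threads may occasionally extract the same file at once; for read-only
metadata that costs a little duplicated work but does no harm.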


> > > Unfortunately, I don't know Cocoon well enough to understand if
> > > Generators are global instances (so that all requests will share the
> > > hash table) or whether it exists per pipeline, per sitemap, etc. My
> > > point: I'm not sure how to implement the cached info correctly.
> >
> > Implement generateKey and generateValidity methods.
> 
> Right, but that's only for caching the entire results.

Yup.

Vadim


> Per

