cocoon-dev mailing list archives

From "Per Kreipke" <>
Subject RE: Speedup *DirectoryGenerator (e.g. ImageDirectoryGenerator et al)...
Date Fri, 14 Jun 2002 18:10:43 GMT
> *DirectoryGenerators should be refactored so that we have only one
> DirectoryGenerator with pluggable 'processors' for different file types.
> This way, you will be able to generate listings of files of different
> types in one directory.

That's a great idea, but a more ambitious one. It certainly would be neat if
you could (use POI to) extract metadata from MS Office files, etc. I imagine
there are code libraries out there for all kinds of 'file introspection', or
for generating metadata from files.

> > - having getSize() call getFileType() and then getJpegSize() or
> > getGifSize() introduces nice modularity but sacrifices speed. Each
> > function in that sequence calls (that's two calls total):
> >
> >   new BufferedInputStream(new FileInputStream(file));
> >
> > Instead, instantiate the BufferedInputStream in getSize() and pass it
> > to the other functions. Or move the work from getFileType() and
> > get*Size() back into getSize().
> Instantiate one instance of RandomAccessFile and pass it to 'processor'.

OK. Is this about the pluggable framework you mentioned above, or does it
apply to the current code too?
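
For the current code, the single-stream idea quoted above could look roughly
like the sketch below. It is only an illustration, not the actual
ImageDirectoryGenerator source: isGif() is a hypothetical stand-in for
getFileType(), JPEG handling is left out, and the use of mark()/reset() to
rewind after type detection is my assumption about how the helpers would share
one stream.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch only: open the file once and hand the same stream to every helper.
class SingleStreamImageSize {

    /** Returns {width, height}, or null if the file is not a recognized GIF. */
    static int[] getSize(File file) throws IOException {
        // The only place a stream is created.
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            return getSize(in);
        }
    }

    /** Same as above, for any stream that supports mark()/reset(). */
    static int[] getSize(InputStream in) throws IOException {
        in.mark(16);               // remember the start of the stream
        boolean gif = isGif(in);   // hypothetical stand-in for getFileType()
        in.reset();                // rewind so the size reader sees byte 0 again
        return gif ? getGifSize(in) : null;
    }

    /** Checks the first three bytes for the "GIF" signature. */
    static boolean isGif(InputStream in) throws IOException {
        return in.read() == 'G' && in.read() == 'I' && in.read() == 'F';
    }

    /** Reads the logical screen size from a GIF header (little-endian). */
    static int[] getGifSize(InputStream in) throws IOException {
        byte[] header = new byte[10];  // 6-byte signature, 2-byte width, 2-byte height
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                return null;           // file too short to hold a GIF header
            }
            read += n;
        }
        int width = (header[6] & 0xFF) | ((header[7] & 0xFF) << 8);
        int height = (header[8] & 0xFF) | ((header[9] & 0xFF) << 8);
        return new int[] { width, height };
    }
}
```

The helpers stay separate, so the modularity is kept, but the file is opened
only once per getSize() call.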

> > - more importantly, caching the information from getSize() plus
> > 'lastModified' in an internal hash table with the file's URL as key
> > would remove the need to do the expensive work each time. If the file
> > hasn't changed, then its size (or MP3 info) hasn't either.
> Cache key should be directory name plus settings, such as depth and
> masks.
> Cache validity should be TimestampCacheValidity (FileTimeStampValidity
> in Cocoon 2.1) of all files selected by given depth/masks in this
> directory.

I think you missed my point: those suggestions apply to caching the entire
result, no?

I'm not trying to cache the entire result for reasons listed in the thread:
"Cachability (was RE: XInclude Transformer vs CInlude Transformer". I'm just
trying to cache each file's metadata individually.


key: (lastModified, width, height)

d:\files\per\foo.jpeg: (123456789, 100, 50)
d:\files\per\bar.gif: (987654321, 200, 100)

Since the lastModified date is already computed by DirectoryGenerator, it
knows whether or not to dive into the file to re-read the metadata. This is a
precursor to your plug-in architecture too: there's no reason to re-read the
info if the file hasn't been modified.
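
A minimal sketch of that per-file cache, to make the idea concrete. The class
name FileMetadataCache, the Entry type and the sizeReader callback are all
made up for illustration; none of them exist in Cocoon. Where the cache
instance should live (per generator, per sitemap, global) is exactly the part
I don't know, so the sketch leaves that open.

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch only: per-file metadata cache, invalidated by lastModified.
class FileMetadataCache {

    /** One cached row: (lastModified, width, height). */
    static final class Entry {
        final long lastModified;
        final int width;
        final int height;

        Entry(long lastModified, int width, int height) {
            this.lastModified = lastModified;
            this.width = width;
            this.height = height;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    // Hypothetical callback doing the expensive work, e.g. a getSize(File)
    // that returns {width, height}.
    private final Function<File, int[]> sizeReader;

    FileMetadataCache(Function<File, int[]> sizeReader) {
        this.sizeReader = sizeReader;
    }

    /**
     * Returns the metadata for a file. The caller passes in the lastModified
     * value it has already computed, so nothing is read from disk on a hit.
     */
    Entry get(File file, long lastModified) {
        String key = file.getAbsolutePath();
        Entry cached = entries.get(key);
        if (cached != null && cached.lastModified == lastModified) {
            return cached;                      // unchanged file: reuse the entry
        }
        int[] size = sizeReader.apply(file);    // changed or new: look inside the file
        Entry fresh = new Entry(lastModified, size[0], size[1]);
        entries.put(key, fresh);
        return fresh;
    }
}
```

With the example rows above, a second request for foo.jpeg with the same
lastModified never touches the file. Entries for deleted files are never
evicted in this sketch, so a real version would need some cleanup.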

> > Unfortunately, I don't know Cocoon well enough to understand if
> > Generators are global instances (so that all requests will share the
> > hash table) or whether it exists per pipeline, per sitemap, etc. My
> > point: I'm not sure how to implement the cached info correctly.
> Implement generateKey and generateValidity methods.

Right, but that's only for caching the entire results.

