lucene-dev mailing list archives

From Christoph Goller <gol...@detego-software.de>
Subject CompoundFileReader
Date Thu, 16 Oct 2003 19:02:19 GMT
Dear Dmitry,

I finally found time to look into your compound file implementation and
to try it out. It works and I like it. However, I wondered why you are
using clones of the base input stream in CompoundFileReader.CSInputStream.
The "problem" (actually it is not a real problem) I see is that you are
double buffering your input. Note that the clone has a buffer of 1024 bytes and
that CompoundFileReader.CSInputStream, which extends InputStream, also has
such a buffer. I think that, the way you implemented it, your input gets copied
into the base buffer and from there into the CompoundFileReader.CSInputStream
buffer before it is actually used. Please have a look at my patch. At first
one might think that my implementation is simpler but less efficient. This
is not the case. Actually it is even a little bit more efficient. Points to
think about:

*) stream is private to CompoundFileReader, so nobody else can use it unless
explicitly granted access. CompoundFileReader uses stream only in its
constructor and when calling openFile.

*) Therefore, all CompoundFileReader.CSInputStream instances can share the same
instance of stream if they take care of synchronization and seek.

*) An InputStream does not necessarily have to implement seekInternal if
readInternal takes care of seeking on the real stream (see FSInputStream).

*) Synchronization of CompoundFileReader.CSInputStream.readInternal on base
does no harm since it is called only when the real file has to be accessed
and this is synchronized anyway. However, this synchronization is necessary!
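
The points above can be illustrated with a minimal sketch. This is not the
patch itself and does not use the Lucene classes: SharedSliceDemo and Slice
are hypothetical names, and java.io.RandomAccessFile stands in for the shared
base stream. Each Slice keeps only its own logical position and does
seek-then-read on the shared file inside one synchronized block, so no
per-slice clone of the base stream is needed.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class SharedSliceDemo {

    /** Hypothetical stand-in for CSInputStream: a view onto one region of a shared file. */
    static final class Slice {
        private final RandomAccessFile base; // shared by all slices, never cloned
        private final long offset;           // start of this slice in the shared file
        private final long length;           // size of this slice in bytes
        private long position;               // logical position, private to this slice

        Slice(RandomAccessFile base, long offset, long length) {
            this.base = base;
            this.offset = offset;
            this.length = length;
        }

        /** Only records the position; the real seek happens inside read. */
        void seek(long pos) {
            position = pos;
        }

        void read(byte[] dest, int destOffset, int len) throws IOException {
            if (len < 0 || position + len > length) {
                throw new IOException("read past end of slice");
            }
            // Seek and read must be one atomic step, otherwise another slice
            // could move the shared file pointer between the two calls.
            synchronized (base) {
                base.seek(offset + position);
                base.readFully(dest, destOffset, len);
            }
            position += len;
        }
    }

    /** Interleaves reads from two slices that share one underlying file. */
    static String demo() throws IOException {
        File file = File.createTempFile("compound", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.write("AAAABBBB".getBytes(StandardCharsets.US_ASCII));
            Slice a = new Slice(raf, 0, 4);
            Slice b = new Slice(raf, 4, 4);
            byte[] buf = new byte[8];
            a.read(buf, 0, 2);
            b.read(buf, 2, 2);
            a.read(buf, 4, 2);
            b.read(buf, 6, 2);
            return new String(buf, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints AABBAABB
    }
}
```

Because every read re-seeks the shared file under the lock, the interleaved
reads in demo() each continue from their own slice's position even though
both slices use the same file handle.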

I tested my patch with your TestCompoundFile and everything seems fine.
I also tried it on one of my indices and searched on this index. It also
works. My impression from my tests is that my patch is a little bit faster
than your original version, but there does not seem to be much difference in
efficiency.

Christoph
