lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Lucene Index Encryption
Date Mon, 11 May 2009 23:17:11 GMT
On Mon, May 11, 2009 at 2:06 PM, Babak Farhang <farhang@gmail.com> wrote:

> I am not familiar with the details of CFS, but I didn't interpret
> Michael's comment to mean that there is actually any rewriting going
> on here. The problem here appears to be one of translating the
> encrypted/compressed file position to the uncompressed file position.
> Am I reading this right?

Actually, CompoundFileWriter does seek back and overwrite bytes it had
previously written.

TermInfosWriter also does the same thing.

And ChecksumIndexOutput, currently used only when writing the
segments_N file, does as well.

If we could fix all these places, e.g. by storing this metadata
separately (say, in the segments file), then we could deprecate and
remove IndexOutput.seek entirely, which would be a nice simplification.
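
To illustrate the idea of storing the metadata separately instead of
seeking back to patch it in, here is a rough sketch. Every name in it
(AppendOnlyCompoundSketch, addFile, offsetOf, read) is hypothetical and
is not Lucene's CompoundFileWriter; it assumes the directory of offsets
can live outside the data stream, so the writer only ever appends.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Lucene code: sub-files are appended to one data
// stream, and their offsets/lengths are kept in a separate directory
// structure rather than written into a header that is patched later.
final class AppendOnlyCompoundSketch {
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    // name -> {offset, length}; stands in for metadata stored elsewhere
    private final Map<String, long[]> directory = new LinkedHashMap<>();

    void addFile(String name, byte[] contents) {
        long offset = data.size();                 // known before we write
        data.write(contents, 0, contents.length);  // append only, no seek
        directory.put(name, new long[] {offset, contents.length});
    }

    long offsetOf(String name) {
        return directory.get(name)[0];
    }

    byte[] read(String name) {
        long[] entry = directory.get(name);
        byte[] all = data.toByteArray();
        int start = (int) entry[0];
        return Arrays.copyOfRange(all, start, start + (int) entry[1]);
    }
}
```

Because the offset of each sub-file is known at the moment it is
appended, nothing has to be overwritten after the fact; the cost is
that the directory must be persisted somewhere else.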

> If in fact so, then a simple solution would be to push down all the
> encoding logic into the RAF implementation itself.  The "append-only"
> RAF implementation would maintain a decoded view of the file.  This
> decoded view would include the (virtual) decoded file position.  In
> that case, CFS could be oblivious to the actual RAF implementation.

Right. As long as IndexOutput.getFilePointer() returns a value that I
can later pass to IndexInput.seek to get back to the same spot, that's
all Lucene would need.  I.e., Lucene should never assume the long
returned by getFilePointer is the actual byte offset in the file.
Instead, it's a value private to the IndexOutput/IndexInput impl.
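
A minimal sketch of that contract, with made-up classes (EncodedOutput
and EncodedInput are hypothetical, not Lucene's IndexOutput/IndexInput):
the toy encoding stores each logical byte as two physical bytes, so the
value returned by getFilePointer differs from the physical offset, yet
seek still lands on the same spot.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Lucene code.
final class OpaquePointerDemo {

    /** Append-only writer; each logical byte becomes two physical bytes. */
    static final class EncodedOutput {
        private final ByteArrayOutputStream physical = new ByteArrayOutputStream();
        private long logicalPos = 0;

        void writeByte(byte b) {
            int v = b & 0xFF;
            physical.write(v >>> 4);   // high nibble
            physical.write(v & 0x0F);  // low nibble
            logicalPos++;
        }

        /** Opaque token: only EncodedInput.seek knows how to interpret it. */
        long getFilePointer() {
            return logicalPos;
        }

        byte[] toByteArray() {
            return physical.toByteArray();
        }
    }

    /** Reader that maps the opaque token back to a physical offset. */
    static final class EncodedInput {
        private final byte[] physical;
        private int physicalPos = 0;

        EncodedInput(byte[] physical) {
            this.physical = physical;
        }

        void seek(long pointer) {
            physicalPos = Math.toIntExact(pointer * 2);
        }

        byte readByte() {
            int hi = physical[physicalPos++];
            int lo = physical[physicalPos++];
            return (byte) ((hi << 4) | lo);
        }
    }

    /** Writes "abc", remembers the pointer, writes "xyz", then seeks back. */
    static String roundTrip() {
        EncodedOutput out = new EncodedOutput();
        for (byte b : "abc".getBytes(StandardCharsets.US_ASCII)) out.writeByte(b);
        long mark = out.getFilePointer();  // 3, although 6 physical bytes exist
        for (byte b : "xyz".getBytes(StandardCharsets.US_ASCII)) out.writeByte(b);

        EncodedInput in = new EncodedInput(out.toByteArray());
        in.seek(mark);
        byte[] buf = new byte[3];
        for (int i = 0; i < buf.length; i++) buf[i] = in.readByte();
        return new String(buf, StandardCharsets.US_ASCII);
    }
}
```

The caller never does arithmetic on the pointer; it only hands it back
to seek, which is what lets an encrypting or compressing impl choose
its own mapping.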

Mike
