lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Lucene Index Encryption
Date Tue, 05 May 2009 09:07:56 GMT
> Lucene needs to be able to ask a RAF opened for writing what its
> current "position" is during indexing, which it then stores away, and
> later during searching it needs to ask a RAF opened for reading to
> seek back to that position so it can read bytes from there.  Would the
> encryption APIs allow this?

And this would be the problem: depending on the encryption you use, most
ciphers need a continuous "stream" and cannot restart at an arbitrary
position. That is why, for example, filesystem encryption works block-wise.
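
To make the requirement from the quoted question concrete, here is a minimal
sketch of the "remember the position while writing, seek back while reading"
pattern using plain java.io.RandomAccessFile. The class and method names
(SeekSketch, writeRecords, readAt) are made up for illustration and are not
Lucene code; any encrypted file would have to support the same two calls,
getFilePointer() and seek().

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Illustration only: hypothetical names, not part of Lucene. */
public class SeekSketch {

    /** Writes two records and returns the position where the second one starts. */
    public static long writeRecords(File f) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(f, "rw")) {
            out.writeUTF("first");
            long pos = out.getFilePointer(); // position stored away during "indexing"
            out.writeUTF("second");
            return pos;
        }
    }

    /** Seeks back to a stored position and reads the record found there. */
    public static String readAt(File f, long pos) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(f, "r")) {
            in.seek(pos);                    // jump straight to the stored position
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("seek-sketch", ".bin");
        f.deleteOnExit();
        long pos = writeRecords(f);
        System.out.println(readAt(f, pos));
    }
}
```

A cipher that can only be run from the start of the stream cannot serve
readAt() without decrypting everything before pos.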

What could be done in the Directory implementation is to encrypt in chunks
as well and to handle each read/write access at the chunk level:

  write: re-read the chunk, decrypt it, change the requested bytes,
         encrypt it, store the chunk
  read:  read the chunk, decrypt it, return the requested bytes

As you can see, writing is very costly, so the Directory impl would need a
cache, and calling flush should write the encrypted chunk. Another
possibility is to copy the decrypted files into a RAMDirectory, maybe using
an algorithm like the one in the recently added Split-Directory.
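
A rough sketch of the chunk idea, only to show the read and write paths. All
of it is assumption on my side: the class name ChunkedCryptoSketch is
hypothetical, the "file" is just an in-memory array of encrypted chunks, and
I picked AES/CBC from javax.crypto with a fresh random IV stored in front of
each chunk. There is no cache, no integrity check and no key management, and
it is not wired into a Lucene Directory.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

/** Illustration only: hypothetical class, not a Lucene Directory. */
public class ChunkedCryptoSketch {
    private static final int IV_LEN = 16;      // AES block size
    private final SecretKeySpec key;
    private final int chunkSize;
    private final byte[][] stored;             // stands in for the file on disk
    private final SecureRandom random = new SecureRandom();

    public ChunkedCryptoSketch(byte[] rawKey, int chunkSize, int chunkCount)
            throws GeneralSecurityException {
        this.key = new SecretKeySpec(rawKey, "AES");
        this.chunkSize = chunkSize;
        this.stored = new byte[chunkCount][];
        for (int i = 0; i < chunkCount; i++) {
            stored[i] = encrypt(new byte[chunkSize]); // start with zero-filled chunks
        }
    }

    /** Encrypts one plaintext chunk; the result is IV followed by ciphertext. */
    private byte[] encrypt(byte[] plain) throws GeneralSecurityException {
        byte[] iv = new byte[IV_LEN];
        random.nextBytes(iv);                  // new IV on every (re)write of a chunk
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] body = c.doFinal(plain);
        byte[] out = new byte[IV_LEN + body.length];
        System.arraycopy(iv, 0, out, 0, IV_LEN);
        System.arraycopy(body, 0, out, IV_LEN, body.length);
        return out;
    }

    /** Decrypts one stored chunk back to its chunkSize plaintext bytes. */
    private byte[] decrypt(byte[] enc) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(enc, 0, IV_LEN));
        return c.doFinal(enc, IV_LEN, enc.length - IV_LEN);
    }

    /** write: re-read the chunk, decrypt, change requested bytes, encrypt, store. */
    public void write(long pos, byte[] data) throws GeneralSecurityException {
        int done = 0;
        while (done < data.length) {
            int chunk = (int) ((pos + done) / chunkSize);
            int offset = (int) ((pos + done) % chunkSize);
            int n = Math.min(chunkSize - offset, data.length - done);
            byte[] plain = decrypt(stored[chunk]);
            System.arraycopy(data, done, plain, offset, n);
            stored[chunk] = encrypt(plain);
            done += n;
        }
    }

    /** read: read the chunk, decrypt, return the requested bytes. */
    public byte[] read(long pos, int len) throws GeneralSecurityException {
        byte[] out = new byte[len];
        int done = 0;
        while (done < len) {
            int chunk = (int) ((pos + done) / chunkSize);
            int offset = (int) ((pos + done) % chunkSize);
            int n = Math.min(chunkSize - offset, len - done);
            byte[] plain = decrypt(stored[chunk]);
            System.arraycopy(plain, offset, out, done, n);
            done += n;
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        byte[] demoKey = new byte[16];         // all-zero key, for the demo only
        ChunkedCryptoSketch store = new ChunkedCryptoSketch(demoKey, 32, 4);
        byte[] msg = "hello lucene".getBytes("UTF-8");
        store.write(28, msg);                  // spans the end of chunk 0 and start of chunk 1
        System.out.println(new String(store.read(28, msg.length), "UTF-8"));
    }
}
```

Every write, however small, decrypts and re-encrypts a whole chunk, and every
read decrypts one, which is why a cache in front of this would be needed.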

> If this is possible then couldn't one make a Directory impl that hides
> all encryption/decryption "under the hood"?
> 
> (Presumably performance will suffer perhaps substantially since every
> search will need to decrypt on the fly...).
> 
> Mike
> 
> On Mon, May 4, 2009 at 6:29 PM,  <Peter_Lenahan@ibi.com> wrote:
> >
> > I hope to make this a discussion rather than a request for a feature.
> >
> > In the database world, secure data is always encrypted in the database.
> > Since I am interested in storing data from a database in the index, at
> > times I want to encrypt the index when the file is on disk.
> >
> > Currently data stored in the Lucene Index is easily accessible to any
> > program that wants to access it. You cannot store sensitive data in the
> > index without the fear that it will be readable by all the people that
> > have access to the system.
> >
> > There are two other posts in the mailing list that ask a question about
> > Lucene Index Encryption. In both cases, I think that the conversation
> > was dropped or the feature put off.
> >
> > Basically, I am asking for comments on the topic. I might consider
> > coding the feature, but I would only do it if I am sure that the feature
> > would be useful and accepted back into the core codebase of Lucene.
> >
> > The Sun javax.crypto package is available in the JDK 1.4 so using that
> > package could be a possible way of providing an encrypted file.
> >
> > The other option is Bouncy Castle, which is now being used in the PDFBox
> > and Tika projects.
> >
> > In any case, because the normal Lucene Index implementation would not
> > use an encrypted index, all references to Security classes should load
> > dynamically with the "Class.forName()" method if they were not part of a
> > standard JRE, to guarantee no additional requirements are placed on
> > people currently using the Lucene libraries.
> >
> > Then there is the issue of what to use as the Encryption Key, and how to
> > allow access to the Index files from the various programs that may need
> > to get to the data. The Encryption Key needs to be external from any
> > program that accesses the Index, because with Java, if the key were
> > stored in the code, it would be easily found with a simple decompile of
> > the Java class.
> >
> > I don't have answers to the questions, but basically I am requesting
> > comments on the topic.
> >
> > I imagine that if I put Encryption and Decryption at the I/O level,
> > immediately before a segment was written or immediately after a segment
> > was read, that I would minimize the overall impact of the Lucene
> > Library.
> >
> > Another area to address is Remote Searching. The Remote Interface would
> > need extensions that allow for Encrypted Remote files as well as
> > Encrypted communication between the machines.
> >
> > However, I am not sure of these assumptions. I don't know how many
> > places the segments are read and written. I really do not know how to do
> > this currently, but would be willing to give it a try if there was
> > enough interest shown in the topic.
> >
> >
> > Peter
> >
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 




