lucene-dev mailing list archives

From Robert Engels <reng...@ix.netcom.com>
Subject Re: Attached proposed modifications to Lucene 2.0 to support Field.Store.Encrypted
Date Fri, 01 Dec 2006 17:34:12 GMT
I agree with Nicolas.

I think the overhead of decrypting such small payloads will have a serious impact on
performance. I also think encrypting such small blocks is subject to an easy attack, and
preventing that attack would increase the index size dramatically.
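
As a minimal sketch of one such attack, assume each field value is encrypted independently with RC4 under the same key (an assumption; this is not verified against the patch). A stream cipher keeps the ciphertext the same length as the plaintext, but anyone who knows one plaintext/ciphertext pair can recover the keystream and decrypt other fields of the same or shorter length. The JCE "ARCFOUR" cipher stands in for the BouncyCastle RC4 here, and the class name, key and field values are made up for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Lucene or of the proposed patch.
class StreamCipherFieldDemo {

    // Encrypts one field value with RC4 (JCE name "ARCFOUR") under the given key.
    // Each call restarts the keystream, so the same key always yields the same keystream.
    static byte[] encryptField(byte[] key, String value) throws Exception {
        Cipher rc4 = Cipher.getInstance("ARCFOUR");
        rc4.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "ARCFOUR"));
        return rc4.doFinal(value.getBytes(StandardCharsets.UTF_8));
    }

    // Byte-wise XOR over the common prefix of the two arrays.
    static byte[] xor(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    // Recovers an unknown field from its ciphertext, given one known
    // plaintext/ciphertext pair encrypted under the same key.
    static String recover(byte[] knownCipher, String knownPlain, byte[] targetCipher) {
        byte[] keystream = xor(knownCipher, knownPlain.getBytes(StandardCharsets.UTF_8));
        return new String(xor(targetCipher, keystream), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // made-up key
        String known = "public note";   // a value the attacker already knows
        String secret = "secret code";  // a value the attacker wants to read

        byte[] c1 = encryptField(key, known);
        byte[] c2 = encryptField(key, secret);

        // The ciphertext has the same length as the plaintext.
        System.out.println(c1.length == known.length());
        // The attacker never touches the key, only c1, c2 and the known plaintext.
        System.out.println(recover(c1, known, c2));
    }
}
```

Adding a per-field random IV or nonce would close this particular hole, but it makes each stored value longer, and identical terms would no longer encrypt to identical bytes.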

We use Lucene for indexing only and store the actual payloads elsewhere, so your solution
is not optimal for our case.
-----Original Message-----
>From: Nicolas Lalevée <nicolas.lalevee@anyware-tech.com>
>Sent: Dec 1, 2006 2:20 AM
>To: java-dev@lucene.apache.org
>Subject: Re: Attached proposed modifications to Lucene 2.0 to support Field.Store.Encrypted
>
>On Friday 1 December 2006 01:33, negrinv wrote:
>> Thank you Robert for your comments. I am inclined to agree with you, but I
>> would like to establish first of all if simplicity of implementation is the
>> overriding consideration. But before I dwell on that let me say that I have
>> discovered that I am not a master of DIFF file creation with Eclipse. The
>> diff file attachment to my original posting is absurdly large and not
>> correct. I have therefore attached a zip file containing the complete
>> source code of the classes I modified. I leave it to others to extract the
>> diffs properly.
>> Back to the issue. So far the implementation has not been difficult
>> considering that I knew nothing about Lucene internals before I started.
>> The reason is that Lucene is very well structured and the changes just
>> fitted nicely by adding some code in the right place with minimal changes
>> to the existing code. But I admit that the proposed implementation so far
>> is not complete and more work is required to overcome some of its
>> restrictions. While I like your idea I believe that it imposes too large a
>> granularity on the encrypted data: all fields with all kinds of data will
>> be encrypted, including images and others which normally would be left
>> alone, thus adding to the performance penalty due to encryption.
>
>I don't agree with you here. In Lucene, you will encrypt the field data, the
>field names, and the tokens: I would say that represents at least 2/3 of
>the index size. Then, with the implementation you suggest, I think (sorry, I
>didn't take the time to look at your patch) that every time Lucene data needs
>to be read, it is decrypted again. With an encrypted FS, your kernel will
>maintain a cache in RAM for you, so it won't hurt so much.
>It needs some benchmarking to see which is actually the best, but I doubt that
>your solution will be faster.
>
>Nicolas.
>
>> Many 
>> hardware devices and most operating systems already provide directory or
>> file system encryption therefore that level of encryption appears to me an
>> unnecessary addition to Lucene. Encryption at field level however is not
>> provided by anything I know. The key in my opinion is to decide what is
>> best from the end user point of view, but perhaps we need more discussion
>> on this.
>> Victor
>>
>> http://www.nabble.com/file/4390/LuceneEncryptionMods.zip
>> LuceneEncryptionMods.zip
>>
>> Robert Engels wrote:
>> > I think a simpler solution would be to create an EncryptedDirectory
>> > implementation of Directory, which requires a password to open/modify the
>> > directory.
>> >
>> > Far simpler, and if you are using encryption to begin with, you are
>> > probably encrypting most of the data anyway.
>> >
>> > -----Original Message-----
>> >
>> >>From: negrinv <victornegrin@gmail.com>
>> >>Sent: Nov 29, 2006 9:45 PM
>> >>To: java-dev@lucene.apache.org
>> >>Subject: Re: Attached proposed modifications to Lucene 2.0 to support Field.Store.Encrypted
>> >>
>> >>Thank you Luke for your comments and the references you supplied. I read
>> >>through them and reached the following conclusions. There seems to be a
>> >>philosophical issue about the boundary between a user application and the
>> >>Lucene API, where should one start and the other stop.
>> >>The other issue is the significant difference between compression and
>> >>encryption.
>> >>As far as the first issue is concerned it is really a matter of personal
>> >>choice and preference. My feeling is that as long as adding functionality
>> >>does not impair the performance of the API as a whole, it makes sense to
>> >>add it to Lucene and thus simplify the task of the application developer.
>> >>After all, application developers do not have to use all the features of
>> >>the API and always have the option of subclassing, writing a better
>> >>version of it if they can, or writing the functionality as part of the
>> >>application, even if the API provides that functionality already. The API
>> >>is there to make life easier for those developers who want to use it,
>> >>nobody "has" to use it.
>> >>The second issue is more technical. Compression simply compresses the
>> >>stored data to save storage. The index itself is not compressed therefore
>> >>searching proceeds as normal. With encryption however you must encrypt the
>> >>index as well as the stored data otherwise one could reconstruct the
>> >>source document from the index and thus defeat the purpose of encryption.
>> >>Correct me if I am wrong, but I think that encrypting the Lucene index is
>> >>not easy to achieve from outside of Lucene, it implies re-writing as part
>> >>of the application much code now part of Lucene (see issue number one
>> >>above), hence my preference for including it as part of the Lucene API
>> >>rather than as part of the application.
>> >>Victor
>> >>
>> >>Luke Nezda wrote:
>> >>> I think that adding encryption support to Lucene fields is a bad idea
>> >>> for the same reasons adding compression was a bad idea (conclusive
>> >>> comments on the tail of this issue
>> >>> http://issues.apache.org/jira/browse/LUCENE-648?page=all). Binary
>> >>> fields can be used by users to achieve this end. Maybe a contrib with
>> >>> utility methods would be a compromise to preserve this work and make it
>> >>> accessible to others, or alternatively just a FAQ entry with the sample
>> >>> code or references to it.
>> >>> Luke
>> >>>
>> >>> On 11/29/06, negrinv <victornegrin@gmail.com> wrote:
>> >>>> Attached are proposed modifications to Lucene 2.0 to support
>> >>>> Field.Store.Encrypted.
>> >>>> The rationale behind this proposal is simple. Since Lucene can store
>> >>>> data in the index, it effectively makes the data portable. It is
>> >>>> conceivable that some of the data may be sensitive in nature, hence
>> >>>> the option to encrypt it.
>> >>>> Both the data and its index are encrypted in this implementation.
>> >>>> This is only an initial implementation. It has the following
>> >>>> restrictions, all of which can be resolved if required, albeit with
>> >>>> some effort and more changes to Lucene:
>> >>>> 1) Binary and compressed fields cannot be encrypted as well (a
>> >>>> plaintext once encrypted becomes binary).
>> >>>> 2) Field.Store.Encrypted implies Field.Store.Yes.
>> >>>> This makes sense but it forces one to store the data in the same index
>> >>>> where the tokens are stored. It may be preferable at times to have two
>> >>>> indices, one for tokens, the other for the data.
>> >>>> 3) As implemented, it uses RC4 encryption from BouncyCastle. This is an
>> >>>> open source package, very simple to use, which has the advantage of
>> >>>> guaranteeing that the length of the encrypted field is the same as the
>> >>>> original plaintext. As of Java 1.5 (5.0) Sun provides an RC4
>> >>>> equivalent in its Java Cryptography Extension, but unfortunately not
>> >>>> in Java 1.4.
>> >>>> The BouncyCastle RC4 is not the only algorithm available, others not
>> >>>> depending on third party code can be used, but it was just the
>> >>>> simplest to implement for this first attempt.
>> >>>> 4) The attachments are modifications in diff form based on an early
>> >>>> (I think August or September '06) repository snapshot of Lucene 2.0
>> >>>> subsequently updated from the Lucene repository on 29/11/06. They may
>> >>>> need some additional work to merge with the latest version in the
>> >>>> Lucene repository. They also include a couple of JUnit test programs
>> >>>> which explain, as well as test, the usage. You will need the
>> >>>> BouncyCastle .jar (bcprov-jdk14-134.jar) to run them. I did not attach
>> >>>> it to minimize the size of the attachments, but it can be downloaded
>> >>>> free from:
>> >>>> http://www.bouncycastle.org/latest_releases.html
>> >>>>
>> >>>> 5) Searching an encrypted field is restricted to single terms, no
>> >>>> phrase or boolean searches allowed yet, and the term has to be
>> >>>> encrypted by the application before searching it. (ref. attached JUnit
>> >>>> test programs)
>> >>>>
>> >>>> To the extent that I have tested it, the code works as intended and
>> >>>> does not appear to introduce any regression problems, but more testing
>> >>>> by others would be desirable.
>> >>>> I don't propose at this stage to do any further work with these API
>> >>>> extensions unless there is some expression of interest and direction
>> >>>> from the Lucene Developers team. I have an application ready to roll
>> >>>> which uses the proposed Lucene encryption API additions (please see
>> >>>> http://www.kbforge.com/index.html). The application is not yet
>> >>>> available for downloading simply because I am not sure if the Lucene
>> >>>> licence allows me to do so. I would appreciate your advice in this
>> >>>> regard. My application is free but its source code is not available
>> >>>> (yet). I should add that encryption does not have to be an integral
>> >>>> part of Lucene, it can be just part of the end application, but
>> >>>> somehow it seems to me that Field.Store.Encrypted belongs in the same
>> >>>> category as compression and binary values.
>> >>>> I would be happy to receive your feedback.
>> >>>>
>> >>>> victor negrin
>> >>>>
>> >>>> http://www.nabble.com/file/4376/luceneDiff2.txt luceneDiff2.txt
>> >>>> http://www.nabble.com/file/4377/TestEncryptedDocument.java
>> >>>> TestEncryptedDocument.java
>> >>>> http://www.nabble.com/file/4378/TestDocument.java TestDocument.java
>> >>>> --
>> >>>> View this message in context:
>> >>>> http://www.nabble.com/Attached-proposed-modifications-to-Lucene-2.0-to-support-Field.Store.Encrypted-tf2727614.html#a7607415
>> >>>> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>> >>>>
>> >>>>
>> >>>> ---------------------------------------------------------------------
>> >>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >>
>> >>--
>> >>View this message in context:
>> >>http://www.nabble.com/Attached-proposed-modifications-to-Lucene-2.0-to-support-Field.Store.Encrypted-tf2727614.html#a7613046
>> >>Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>> >
>
>-- 
>Nicolas LALEVÉE
>Solutions & Technologies
>ANYWARE TECHNOLOGIES
>Tel : +33 (0)5 61 00 52 90
>Fax : +33 (0)5 61 00 51 46
>http://www.anyware-tech.com
>