lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: Attached proposed modifications to Lucene 2.0 to support Field.Store.Encrypted
Date Sat, 02 Dec 2006 22:50:42 GMT
I think the point of the discussion is really to determine the answer  
to #1.

I would counter that it is not a compelling feature for MOST users of
Lucene, but it can still be implemented externally using binary
fields for those that require it, and/or even more easily (and maybe
even faster) using an encrypted filesystem with proper security.
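
As a rough illustration of the "implemented externally using binary fields" option, here is a minimal sketch, assuming the application encrypts each value itself and hands Lucene only an opaque byte array to store. The class name FieldCrypto, the choice of AES-GCM and the IV-prefix layout are all assumptions made for this example; none of it comes from the proposed patch or from Lucene, and it needs a JDK that ships AES/GCM.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: encrypts a field value in
// application code so the result can be stored as an opaque binary field.
final class FieldCrypto {
    private static final int IV_LEN = 12;    // GCM nonce size in bytes
    private static final int TAG_BITS = 128; // GCM authentication tag size in bits

    // Returns IV || ciphertext || tag for the UTF-8 bytes of the plaintext.
    static byte[] encrypt(String plain, byte[] key) {
        try {
            byte[] iv = new byte[IV_LEN];
            new SecureRandom().nextBytes(iv); // fresh nonce for every value
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = c.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = Arrays.copyOf(iv, IV_LEN + ct.length);
            System.arraycopy(ct, 0, out, IV_LEN, ct.length);
            return out; // this byte[] is what would go into the binary field
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    // Reverses encrypt(): reads the IV prefix, then decrypts and verifies the rest.
    static String decrypt(byte[] blob, byte[] key) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new GCMParameterSpec(TAG_BITS, blob, 0, IV_LEN));
            byte[] pt = c.doFinal(blob, IV_LEN, blob.length - IV_LEN);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

A value protected this way is stored but not searchable, which is the trade-off of keeping encryption outside the index: only fields that are left in the clear can be indexed as terms.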

Adding it to the core Lucene complicates the code base, and I do not  
believe it is warranted.

This is only my opinion.

On Dec 2, 2006, at 2:38 PM, negrinv wrote:

>
> On the contrary, Mike, I am beginning to think that there have been a
> number of misunderstandings, of my original posting to start with.
> When I submitted my proposal I was prepared for some discussion on the
> merits or otherwise of my proposed solution. I had no idea that the
> discussion would drift towards security and performance in absolute
> terms. I would now like to steer the debate in its intended direction.
>
> I have no difficulty agreeing with you on both counts. A non-encrypted
> swap file is a security risk, and encryption imposes a performance
> penalty. Both of which, I submit, are not relevant to my posting, for
> the following reasons.
> Security is all about knowing where you stand so you can take
> counter-measures; it is not about a "false sense of security" provided
> by knowing you have an encrypted swap file or a 3000 byte encryption key.
> Lucene cannot provide security. It would be a legal nightmare and an
> absurd expectation. The underlying operating system within which Lucene
> runs does not guarantee security, the encryption software provider does
> not guarantee security, password protection and physical security are
> also outside of Lucene's control. What Lucene can do is to provide
> encryption services, while the application has to provide a given level
> of security. For instance, if you run under an operating system which
> does not provide swap file encryption, then you must disable the swap
> file. Does that impose a performance penalty? Probably, if your memory
> is limited, but now you know where you stand so you make a decision:
> performance or encryption or more memory. But one cannot, in my view,
> shift the responsibility for that decision to Lucene.
> I'll give you another example: you mentioned padding of 128 bits. True,
> there are encryption routines which impose that penalty. For my (initial)
> implementation I had the choice between an algorithm with padding, or
> RC4, which does not pad. A 10 character term remains a 10 character term
> after encryption. No padding and no index size implications. I said so in
> my posting and as an application developer you then have a choice to
> make. Use Lucene RC4 encryption as proposed (for the time being) or use
> another product, or write your own. Without knowing the application, any
> decision would be totally out of context, and no one piece of software
> can satisfy all applications. A possible solution would be for Lucene to
> offer a choice of algorithms.
>
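
The size argument above (a padded block cipher rounds a short term up to 128 bits, a stream cipher keeps its length) can be checked with a small sketch. It assumes AES in CTR mode as a stand-in for a stream cipher such as RC4, since RC4 itself may not be enabled in every JDK; PaddingDemo and the fixed all-zero key and IV are hypothetical and for demonstration only.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

// Hypothetical demo class: compares ciphertext sizes, nothing more.
final class PaddingDemo {
    // Encrypts data under the given JCE transformation and returns the
    // length of the resulting ciphertext in bytes.
    static int cipherTextLength(String transformation, byte[] data) {
        try {
            byte[] key = new byte[16]; // all-zero demo key, never use in real code
            byte[] iv = new byte[16];  // all-zero demo IV, never reuse in real code
            Cipher c = Cipher.getInstance(transformation);
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new IvParameterSpec(iv));
            return c.doFinal(data).length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cipher setup failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] term = "0123456789".getBytes(StandardCharsets.UTF_8); // 10-byte term
        // Stream-style mode: ciphertext length equals plaintext length.
        System.out.println(cipherTextLength("AES/CTR/NoPadding", term));
        // Padded block mode: length is rounded up to the 16-byte block size.
        System.out.println(cipherTextLength("AES/CBC/PKCS5Padding", term));
    }
}
```

For the 10-byte term the unpadded mode yields 10 bytes and the padded mode 16, so both positions in the thread are consistent: the overhead depends on which kind of cipher is chosen. The sketch says nothing about the relative strength of the two choices.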
> The army, I am sure, would like to run its tanks at the speed of a
> Ferrari, but it cannot; it hits a wall known as cost-benefit ratio. It
> must choose between security and speed and budget, keeping in mind the
> application. The modern tank is the answer. A compromise.
> My original posting avoided the notion of security and performance in
> absolute terms precisely because of all the above considerations; it
> simply addressed a couple of points which need to be resolved before the
> specifics of the implementation can be discussed.
>
> 1) Is it a good idea to have encryption added to Lucene? I think so,
> obviously, but not everyone agrees. As was pointed out in this
> discussion, some relational database software provides encryption at the
> column level, a functionality equivalent to the one I proposed. Lucene in
> some ways competes with relational databases.
>
> 2) Assuming the answer to 1) above is yes, how should one go about
> including encryption in Lucene? My solution is just that, one approach.
> Others have proposed directory or file system encryption. My view on this
> is that this level of encryption is already provided by all major
> operating systems, as well as by some hardware devices. I would not see a
> justifiable benefit in adding it to Lucene. But that is only my personal
> opinion, although I am aware that directory encryption is in the hands of
> the system administrator, not the application end user. Perhaps there are
> other options which have not been raised yet.
>
> 3) Assuming my proposal is acceptable, can it be implemented better? I am
> not a Lucene expert; I learned Lucene on the go. I would be delighted to
> see a better solution presented; it would be a learning experience for me.
>
> I hope I have not added to the confusion.
>
> Season's greetings to you and to all who took time to participate in this
> discussion.
> Victor
>
> Robert Engels wrote:
>>
>> I think you misunderstood. If you do not have encrypted swap (like
>> OSX provides for) then your encryption is pointless as anyone can
>> inspect the data as it is loaded into the heap by Lucene - bypassing
>> the encryption.
>>
>> I also think you underestimated the impact on the size of the
>> indexes, as most secure encryption schemes are going to pad the
>> payloads to a minimum of 128 bits, and usually much more.
>>
>> This is going to make a HUGE difference in the size of the index.
>>
>> On Dec 1, 2006, at 2:00 PM, negrinv wrote:
>>
>>>
>>> Good news for OSX users! But what about all the others, should I
>>> say the majority?
>>> One more reason for encrypting at field level.
>>> Victor
>>>
>>>
>>> Robert Engels wrote:
>>>>
>>>> Not if running under OSX with encrypted swap turned on ! :)
>>>>
>>>> -----Original Message-----
>>>>> From: Nicolas Lalevée <nicolas.lalevee@anyware-tech.com>
>>>>> Sent: Dec 1, 2006 4:49 AM
>>>>> To: java-dev@lucene.apache.org
>>>>> Subject: Re: Attached proposed modifications to Lucene 2.0 to
>>>>> support Field.Store.Encrypted
>>>>>
>>>>> On Friday 1 December 2006 11:10, negrinv wrote:
>>>>>> Nicolas Lalevée-2 wrote:
>>>>>>> On Friday 1 December 2006 01:33, negrinv wrote:
>>>>>>>> Thank you Robert for your comments. I am inclined to agree with
>>>>>>>> you, but I would like to establish first of all if simplicity of
>>>>>>>> implementation is the overriding consideration. But before I
>>>>>>>> dwell on that let me say that I have discovered that I am not a
>>>>>>>> master of DIFF file creation with Eclipse. The diff file
>>>>>>>> attachment to my original posting is absurdly large and not
>>>>>>>> correct. I have therefore attached a zip file containing the
>>>>>>>> complete source code of the classes I modified. I leave it to
>>>>>>>> others to extract the diffs properly.
>>>>>>>> Back to the issue. So far the implementation has not been
>>>>>>>> difficult considering that I knew nothing about Lucene internals
>>>>>>>> before I started. The reason is that Lucene is very well
>>>>>>>> structured and the changes just fitted nicely by adding some code
>>>>>>>> in the right place with minimal changes to the existing code. But
>>>>>>>> I admit that the proposed implementation so far is not complete
>>>>>>>> and more work is required to overcome some of its restrictions.
>>>>>>>> While I like your idea, I believe that it imposes too large a
>>>>>>>> granularity on the encrypted data: all fields with all kinds of
>>>>>>>> data will be encrypted, including images and others which
>>>>>>>> normally would be left alone, thus adding to the performance
>>>>>>>> penalty due to encryption.
>>>>>>>
>>>>>>> I don't agree with you here. In Lucene, you will encrypt the field
>>>>>>> data, the field names, and the tokens: I would say that it
>>>>>>> represents at least 2/3 of the index size. Then, with the
>>>>>>> implementation you suggest, I think (sorry, I didn't take time to
>>>>>>> look at your patch) that every time Lucene data needs to be read,
>>>>>>> it is decrypted each time. With an encrypted FS, your kernel will
>>>>>>> maintain a cache in RAM for you, so it won't hurt so much.
>>>>>>> It needs some benchmarking to see what is effectively the best,
>>>>>>> but I have doubts that your solution will be faster.
>>>>>>>
>>>>>>> Nicolas.
>>>>>>
>>>>>> Nicolas, I am all in favour of some tests to establish which
>>>>>> solution is best, but I have to say that I don't believe file
>>>>>> system or directory encryption in Lucene is really justified. Most
>>>>>> operating systems already provide this feature, although they are
>>>>>> system-wide or policy-based solutions, hence not always within
>>>>>> individual user control.
>>>>>> But if the issue is user control, then I believe Lucene should
>>>>>> provide maximum granularity when it comes to choice of data to
>>>>>> encrypt.
>>>>>> The issue, I believe, is whether some form of encryption should be
>>>>>> provided within Lucene to enable application developers to create
>>>>>> applications which offer some data protection under user control,
>>>>>> with a minimum of impact, where by impact I mean both on
>>>>>> performance and workload, either in Lucene code or user code.
>>>>>
>>>>> In fact you mean a user that has no control of his machine, and
>>>>> that cannot encrypt his partition. Here you will have the issue
>>>>> with the swap: Lucene will decrypt the data in RAM, which can
>>>>> possibly be pushed to the swap... I know this is extreme, but it's
>>>>> a security hole.
>>>>>
>>>>> -- 
>>>>> Nicolas LALEVÉE
>>>>> Solutions & Technologies
>>>>> ANYWARE TECHNOLOGIES
>>>>> Tel : +33 (0)5 61 00 52 90
>>>>> Fax : +33 (0)5 61 00 51 46
>>>>> http://www.anyware-tech.com
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>>
>>>>
>>>
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/Attached-proposed-modifications-to-Lucene-2.0-to-support-Field.Store.Encrypted-tf2727614.html#a7645198
>>> Sent from the Lucene - Java Developer mailing list archive at
>>> Nabble.com.
>>>
>>
>
> -- 
> View this message in context:
> http://www.nabble.com/Attached-proposed-modifications-to-Lucene-2.0-to-support-Field.Store.Encrypted-tf2727614.html#a7657011
> Sent from the Lucene - Java Developer mailing list archive at
> Nabble.com.
>



