lucene-java-user mailing list archives

From Erick Erickson <>
Subject Re: Encryption at lucene index
Date Mon, 07 Aug 2017 15:47:07 GMT
Encryption in Solr has a bunch of ramifications. Do you care about

- encryption at rest or in memory?
- encrypting the _searchable_ tokens?
- encrypting the searchable tokens per-user?
- encrypting the stored data (which a filter won't do, BTW)?

It's actually a fairly complex topic; the discussion at LUCENE-6966
outlines much of it. Please ask specific questions as you research the
topic. One per-user encryption package that I know of is by Hitachi
Solutions (commercial) and it explicitly does _not_ support, for
instance, wildcards (there are other limitations too). See:

Most of the time when people ask for encryption they soon discover
it's much more difficult than they imagine and settle for just putting
the indexes on an encrypting file system. When they move beyond that
it gets complex and you'd be well advised to consult with Solr
security experts.
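
To make the wildcard limitation concrete, here is a minimal sketch in plain
Java (no Lucene classes involved). It assumes a hypothetical analyzer filter
that replaces each token with a deterministic AES ciphertext; the class name,
the hard-coded demo key, and the use of AES/ECB are illustrative assumptions
only, not how any particular package does it (ECB in particular is not
something to use for real data). Exact matches still work because the same
token always encrypts to the same value, but the ciphertext of "run" is not a
prefix of the ciphertext of "runs", "running" or "runner", so a prefix query
has nothing to match against.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

public class EncryptedTokenDemo {
    // Hypothetical fixed 16-byte demo key; never hard-code a real key.
    private static final byte[] DEMO_KEY =
            "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);

    // Deterministically encrypts one token and returns it as a hex string,
    // standing in for what an encrypting token filter might put in the index.
    static String encryptToken(String token) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(DEMO_KEY, "AES"));
            byte[] out = cipher.doFinal(token.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : out) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encryptedPrefix = encryptToken("run");
        String[] terms = {"runs", "running", "runner"};
        for (String term : terms) {
            String encrypted = encryptToken(term);
            // In plaintext, term.startsWith("run") is true for all three.
            System.out.println(term + " -> " + encrypted
                    + " | shares encrypted prefix: "
                    + encrypted.startsWith(encryptedPrefix));
        }
        // Exact match survives: same token, same ciphertext.
        System.out.println("exact match on 'running': "
                + encryptToken("running").equals(encryptToken("running")));
    }
}
```

Every "shares encrypted prefix" line prints false, while the exact-match line
prints true. Sorting and range queries break for the same reason: the order of
the ciphertexts has nothing to do with the order of the original terms.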


On Sun, Aug 6, 2017 at 11:30 PM, Kumaran Ramasubramanian
<> wrote:
> Hi All,
> After looking at all the discussions below, I have one doubt, which may be silly
> or novice, but I want to put it to the lucene user list.
> If we include an encryption layer in our analyzer's chain of filters, such as an
> EncryptionFilter, to control field-level encryption, what are the
> consequences? Am I missing anything basic?
> Thanks in advance..
> Related links:
> : AES Encrypted Directory
> - in lucene 3.x
> :  Codec for index-level
> encryption - at codec level, to control which columns / fields hold
> personally identifiable information
>> A decent encrypting algorithm will not produce, say, the same first portion
>> for two tokens that start with the same letters. So wildcard searches won't
>> work. Consider "runs", "running", "runner". A search on "run*" would be
>> expected to match all three, but wouldn't unless the encryption were so
>> trivial as to be useless. Similar issues arise with sorting. "More Like
>> This" would be unreliable. There are many other features of a robust search
>> engine that would be impacted, and an index with encrypted terms would be
>> useful for only exact matches, which usually results in a poor search
>> experience.
> --
> Kumaran R
