lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Re: Email id tokenizer (actual email id & multiple terms)
Date Wed, 21 Dec 2016 06:23:46 GMT
On Wed, Dec 21, 2016 at 1:21 AM, Ahmet Arslan <iorixxx@yahoo.com.invalid> wrote:
> Hi,
>
> You can index whole address in a separate field.
> Otherwise, how would you handle positions of the split tokens?
>
> By the way, speed of phrase search may be just fine, so consider trying first.

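Speed aside, phrase search is difficult because you'll accidentally
match too much: once an address is split into tokens, a phrase query
for it also matches any longer address that contains the same tokens
in sequence.
(user@company.com will match user@company.com.au, john@gmail.com will
match little.john@gmail.com, etc.)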

Using a separate field for non-tokenised addresses would be my
recommendation too.
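
To make the over-matching concrete, here is a rough plain-Java sketch. It
does not use the Lucene API at all: tokenize() is a hypothetical stand-in
for an analyzer that splits on punctuation, phraseMatches() imitates a
phrase query over the split tokens, and exactMatches() imitates a term
lookup against a field that holds the whole address as a single term. All
of these names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class EmailMatchSketch {

    // Hypothetical stand-in for an analyzer: lowercases the input and
    // splits it on anything that is not a letter or digit.
    static List<String> tokenize(String address) {
        List<String> tokens = new ArrayList<>();
        for (String t : address.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Phrase-style matching: the query tokens only need to appear
    // consecutively somewhere inside the indexed tokens.
    static boolean phraseMatches(String indexed, String query) {
        List<String> doc = tokenize(indexed);
        List<String> phrase = tokenize(query);
        return !phrase.isEmpty() && Collections.indexOfSubList(doc, phrase) >= 0;
    }

    // Non-tokenised matching: the whole address is one term, so only an
    // identical value matches.
    static boolean exactMatches(String indexed, String query) {
        return indexed.equals(query);
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"user@company.com.au", "user@company.com"},
            {"little.john@gmail.com", "john@gmail.com"},
            {"john@gmail.com", "john@gmail.com"},
        };
        for (String[] c : cases) {
            System.out.println("indexed=" + c[0] + " query=" + c[1]
                    + " phrase=" + phraseMatches(c[0], c[1])
                    + " exact=" + exactMatches(c[0], c[1]));
        }
    }
}
```

The first two cases match under the phrase-style check but not under the
exact check, which is the argument for keeping a second, non-tokenised
field for whole-address lookups.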

TX


