lucene-java-user mailing list archives

From "Michael J. Prichard" <>
Subject Re: Document design and analyzer questions?
Date Tue, 13 Jun 2006 18:56:39 GMT
Hey Chris,

Thanks for the response.

Chris Hostetter wrote:

>: Question is twofold.  One, here is the layout I was thinking:
>my rule of thumb: if a field is going to contain fewer than a few dozen
>bytes (ie: a date, an email address, etc) you might as well store it ...
>it will make your life easier when looking at your results.
I will have millions of entries in my index.  Would storing these fields
cause any performance issues?

>another important thing you should consider is field norms: they don't
>make sense for most date fields or numeric fields, or fields where the
>length is fairly irrelevant (ie: email addresses, guids, document types)
What do you mean by field norms?

>: Also, any recommendations on what analyzer to use?  I was thinking the
>: synonym analyzer based on the one in the Lucene in Action book.
>you are probably going to want to use PerFieldAnalyzerWrapper so you can use a
>different analyzer for the fields that store email addresses than the
>analyzer you use for the body text ... if i'm searching for
>emails from "" i don't want it matching on emails from
Sounds good.  I guess I could just use a StandardAnalyzer on these fields.
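
To make the per-field idea concrete, here is a rough sketch in plain Java.  It
does not use the Lucene API at all; it only mimics the effect of routing each
field to its own analysis step.  The class name, the field names ("from", "to",
"body"), the example.com address and both tokenizing helpers are assumptions
made up for illustration.  One helper splits on non-letter characters, the
other keeps the whole value as a single token, which is roughly the behaviour
you would want for address fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: plain Java, not Lucene classes.
public class PerFieldAnalysisSketch {

    // Splits on any run of non-letter characters and lowercases each piece.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Keeps the whole field value as one lowercased token.
    static List<String> wholeValueToken(String text) {
        return List.of(text.trim().toLowerCase(Locale.ROOT));
    }

    // Field name -> analysis function; unlisted fields use letterTokens.
    private static final Map<String, Function<String, List<String>>> PER_FIELD =
            new HashMap<>();
    static {
        PER_FIELD.put("from", PerFieldAnalysisSketch::wholeValueToken);
        PER_FIELD.put("to", PerFieldAnalysisSketch::wholeValueToken);
    }

    // Looks up the analysis function for the field and applies it.
    static List<String> analyze(String field, String text) {
        return PER_FIELD
                .getOrDefault(field, PerFieldAnalysisSketch::letterTokens)
                .apply(text);
    }

    public static void main(String[] args) {
        // prints [alice@example.com]
        System.out.println(analyze("from", "alice@example.com"));
        // prints [mail, alice, example, com, today]
        System.out.println(analyze("body", "Mail alice@example.com today"));
    }
}
```

With the address split into pieces, a query for one address can match any
other address that shares a piece such as the domain; keeping the value whole
in the address fields avoids that, while the body text is still tokenized.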
