lucene-java-user mailing list archives

From "Michael J. Prichard" <michael_prich...@mac.com>
Subject Re: Document design and analyzer questions?
Date Tue, 13 Jun 2006 18:56:39 GMT
Hey Chris,

Thanks for the response.

Chris Hostetter wrote:

>: Question is two fold.  One, here is the layout I was thinking:
>
>my rule of thumb: if a field is going to contain less than a few dozen
>bytes (ie: a date, an email address, etc) you might as well store it ...
>it will make your life easier when looking at your results.
>
>  
>
I will have millions of entries in my index.  Would storing these small
fields cause any performance issues?

>another important thing you should consider is field norms: they don't
>make sense for most date fields or numeric fields, or fields where the
>length is fairly irrelevant (ie: email addresses, guids, document types)
>
>  
>
What do you mean by field norms?
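
For context on the norms remark quoted above: in the classic default scoring, each indexed field carries a per-document length normalization factor of roughly 1/sqrt(number of terms), kept as one byte per document for every field that has norms. The plain-Java sketch below only illustrates that arithmetic. It does not call Lucene, and the class and method names are hypothetical.

```java
public class LengthNormSketch {

    // Classic default length normalization: a match in a short field
    // counts for more than the same match in a long field.
    public static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Assumption: one norm byte per document for each field that keeps norms.
    public static long normBytes(long numDocs, int fieldsWithNorms) {
        return numDocs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        // A 4-term field gets a factor of 0.5; a 1-term field gets 1.0.
        System.out.println(lengthNorm(4));
        System.out.println(lengthNorm(1));
        // Rough norm cost for 5 million documents with 3 normed fields.
        System.out.println(normBytes(5_000_000L, 3) + " bytes");
    }
}
```

For a date, a GUID, or an email address, the field length says nothing about relevance, so the factor adds cost without improving ranking. That appears to be the point of the quoted advice.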

>: Also, any recommendations on what analyzer to use?  I was thinking the
>: synonym analyzer based on the one in the Lucene in Action book.
>
>you are probably going to want to use PerFieldAnalyzerWrapper so you can
>use a different analyzer for the fields that store email addresses than
>the analyzer you use for the body text ... if i'm searching for
>emails from "bob@car.com" i don't want it matching on emails from
>"bob@automobile.com"
>
>  
>
Sounds good.  I guess I could just use a StandardAnalyzer on these fields.
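
The per-field idea can be sketched in plain Java without any Lucene classes. This is only an illustration of the dispatch that a wrapper such as Lucene's PerFieldAnalyzerWrapper performs. The field names ("from", "to", "body"), the one-entry synonym table, and all class and method names here are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class PerFieldAnalysisSketch {

    // Hypothetical synonym table, meant for body text only.
    static final Map<String, List<String>> SYNONYMS =
            Collections.singletonMap("car", Arrays.asList("automobile"));

    // Keyword-style analysis: the whole value becomes one lowercased token.
    public static List<String> keywordAnalyze(String value) {
        return Collections.singletonList(value.trim().toLowerCase(Locale.ROOT));
    }

    // Body-style analysis: split on non-alphanumerics, lowercase, add synonyms.
    public static List<String> synonymAnalyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String raw : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (raw.isEmpty()) continue;
            tokens.add(raw);
            tokens.addAll(SYNONYMS.getOrDefault(raw, Collections.emptyList()));
        }
        return tokens;
    }

    // Address fields get the keyword analyzer; every other field uses the default.
    static final Map<String, Function<String, List<String>>> PER_FIELD = new HashMap<>();
    static {
        PER_FIELD.put("from", PerFieldAnalysisSketch::keywordAnalyze);
        PER_FIELD.put("to", PerFieldAnalysisSketch::keywordAnalyze);
    }

    public static List<String> analyze(String field, String value) {
        return PER_FIELD.getOrDefault(field, PerFieldAnalysisSketch::synonymAnalyze).apply(value);
    }

    public static void main(String[] args) {
        // The address stays a single token, so "car" is never expanded.
        System.out.println(analyze("from", "bob@car.com"));
        // Body text is split and expanded with synonyms.
        System.out.println(analyze("body", "My car broke down"));
    }
}
```

Run through the synonym analyzer instead, "bob@car.com" would break into bob, car, automobile, com, which is exactly the cross-match the quoted advice warns about.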


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

