lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Document design and analyzer questions?
Date Wed, 31 May 2006 17:49:38 GMT

: Question is two fold.  One, here is the layout I was thinking:

my rule of thumb: if a field is going to contain less then a few dozen
bytes (ie: a date, an email address, etc) you might as well store it ...
it will make your life easier when looking at your results.

another important thing you should consider is field norms: they don't
make sense for most date fields or numeric fields, or fields where the
length is fairly irrelevant (ie: email addresses, guids, document types)

: Also, any recommendations on what analyzer to use?  I was thinking the
: synonym analyzer based on the one in the Lucene in Action book.

you are probably going to want to use PerFieldAnalyzer so you can use a
differnet analyzer for the fields that store email addresses then the
analyzer you use for the body text ... if i'm searching for
emails from "bob@car.com" i dont want it matching on emails from
"bob@automobile.com"



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message