lucene-java-user mailing list archives

From Mark Woon <>
Subject Re: Newbie Questions
Date Tue, 26 Aug 2003 18:51:30 GMT
Gregor Heinrich wrote:

> ad 1: MultiFieldQueryParser is what you might want: you can specify the
> fields to run the query on. Alternatively, the practice of duplicating
> the contents of all separate fields in question into one additional
> merged field has been suggested, which enables you to use QueryParser
> itself.

Ah, I've been testing out something similar to the latter.  I've been 
adding multiple values on the same key.  Won't this have the same 
effect?  I've been assuming that if I do

doc.add(Field.Keyword("content", "value1"));
doc.add(Field.Keyword("content", "value2"));

and then search the "content" field for either value, I'd get a hit, 
and it seems to work.  This way, I figure I'd be able to differentiate 
between values that I want tokenized and values that I don't.

Is there a difference between this and building a StringBuffer 
containing all the values and storing that as a single field-value?
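Concretely, here's a sketch of the merged-field alternative I'm describing 
(the helper and the space separator are my own invention, not Lucene API):

```java
public class MergeFields {
    // Merge several values into one string before indexing, instead of
    // calling doc.add(...) once per value. The space separator is an
    // arbitrary choice; it only matters for untokenized (Keyword) fields,
    // where the merged string would become a single term.
    static String merge(String[] values) {
        StringBuffer merged = new StringBuffer();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                merged.append(" "); // assumed separator
            }
            merged.append(values[i]);
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        // prints "value1 value2"
        System.out.println(merge(new String[] { "value1", "value2" }));
    }
}
```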

> ad 2: Depending on the Analyzer you use, the query is normalised, i.e.,
> stemmed (suffixes removed from words) and stopword-filtered (highly
> frequent words removed). Have a look at StandardAnalyzer.tokenStream(...)
> to see how the different filters work. In the analysis package, the
> 1.3rc2 Lucene distribution has a Porter stemming algorithm:
> PorterStemmer.
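To make sure I follow what the analyzer pipeline is doing, here's a toy 
sketch of that normalisation. This is my own simplification for intuition 
only; the real StandardAnalyzer filters and PorterStemmer are far more 
careful than this:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToyAnalyzer {
    // Crude stand-in for an analyzer pipeline: lowercase, drop stopwords,
    // strip a couple of common suffixes. NOT the actual Porter algorithm.
    static final List<String> STOPWORDS = Arrays.asList("the", "a", "of");

    static List<String> normalize(String query) {
        List<String> out = new ArrayList<String>();
        String[] tokens = query.toLowerCase().split("\\s+");
        for (String t : tokens) {
            if (STOPWORDS.contains(t)) {
                continue;                              // stopword filter
            }
            if (t.endsWith("ing") && t.length() > 5) { // toy "stemming"
                t = t.substring(0, t.length() - 3);
            } else if (t.endsWith("s") && t.length() > 3) {
                t = t.substring(0, t.length() - 1);
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        // "the" and "of" are dropped, "stemming" -> "stemm", "words" -> "word"
        System.out.println(normalize("The stemming of words"));
    }
}
```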

There's an rc2 out?  Where??  I just checked the Lucene website and only 
see rc1.

Thanks everyone for all the quick responses!

