lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gregor Heinrich" <Gregor.Heinr...@igd.fhg.de>
Subject RE: Newbie Questions
Date Tue, 26 Aug 2003 19:18:18 GMT
Hi Mark.

Sorry, it's rc1 really which is out. But if you go to the cvs server, then
you'll find the rc2-dev version.

Multiple calls to Document.add with the same field results in that "their
text is treated as though appended for the purposes of search." (API doc).

Can you try out if there's a differece between the cases you mention? I don'
t know but I'd be interested as well;-).

Gregor




-----Original Message-----
From: Mark Woon [mailto:morpheus@helix.stanford.edu]
Sent: Tuesday, August 26, 2003 8:52 PM
To: Lucene Users List
Subject: Re: Newbie Questions


Gregor Heinrich wrote:

> ad 1: MultiFieldQueryParser is what you might want: you can specify the
> fields to run the query on. Alternatively, the practice of duplicating
> the
> contents of all separate fields in question into one additional merged
> field
> has been suggested, which enables you to use QueryParser itself.
>

Ah, I've been testing out something similar to the latter.  I've been
adding multiple values on the same key.  Won't this have the same
effect?  I've been assuming that if I do

doc.add(Field.Keyword("content", "value1");
doc.add(Field.Keyword("content", "value2");

And did a search on the "content" field for either value, I'd get a hit,
and it seems to work.  This way, I figure I'd be able to differentiate
between values that I want tokenized and values that I don't.

Is there a difference between this and building a StringBuffer
containing all the values and storing that as a single field-value?


> ad 2: Depending on the Analyzer you use, the query is normalised, i.e.,
> stemmed (remove suffices from words) and stopword-filtered (remove highly
> frequent words). Have a look at StandardAnalyzer.tokenStream(...) to
> see how
> the different filters work. In the analysis package the 1.3rc2 Lucene
> distribution has a Porter stemming algorithm: PorterStemmer.
>

There's an rc2 out?  Where??  I just checked the Lucene website and only
see rc1.


Thanks everyone for all the quick responses!

-Mark



Mime
View raw message