lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Porter stemming problem
Date Fri, 22 Jun 2007 16:12:19 GMT
Yes, you should also stem the query terms. Otherwise, you'll have
indexed "working" as "work", but your search for "working" will look
for "working" and won't match. Which is not what you want, I'm sure.

Query.toString() will tell you a lot about how queries are
processed, BTW....

In general, unless you're very sure what the effects are, you should
use the same analyzer for indexing as you use for searching.

Best
Erick

On 6/22/07, Robert Walpole <robert.walpole@devon.gov.uk> wrote:
>
> Hi,
>
> I am using the PorterStemAnalyzer class (attached) to provide stemming
> for a Lucene index.
>
> To stem the terms in the index we use the following...
>
> //open an index writer in append mode
> IndexWriter idxWriter = new IndexWriter(LUCENE_INDEX_PATH, new
> PorterStemAnalyzer(), false);
>
> //add the lucene document to the index
> idxWriter.addDocument(idxDoc);
>
> Having inspected the index using Luke, I can confirm that the terms are
> being stemmed as expected. However, in order for this to work properly I
> am not clear whether I should also be stemming the search terms that are
> entered?
>
> For example there is a term "relax" in the index which I guess is
> stemmed from "relaxation". If the user searches on "relaxing" do I need
> to stem the search term in order for it to return the result?
>
> At the moment I am attempting to do this as follows...
>
> analyzer = new PorterStemAnalyzer();
> parser = new QueryParser("content", analyzer);
> Query query = parser.parse("keywords: relaxing");
> Hits hits = idxSearcher.search(query);
>
> ...but this is not returning any matches.
>
> Thanks
> Rob Walpole
> Devon Portal Developer
> Email robert.walpole@devon.gov.uk
> Web http://www.devonline.gov.uk
>
>
>
> <<PorterStemAnalyzer.java>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message