lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Lucene shows parts of search query as a HIT
Date Thu, 19 Jul 2007 13:07:50 GMT
You say there's only one document in the index even though you added
many. The line

 IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), true);

blows away any existing index data and starts over. If you're calling
this fragment for each document, you'll always have only one doc. Try
changing the 'true' to 'false'. Or better yet, open the writer once,
outside the document add routines, and pass it in...

I still have no clue why you're getting the results you are.
StandardAnalyzer doesn't stem, so that's probably not it. Try printing
query.toString() to see the actual query you're submitting. Then I'd
play with that query in Luke to see what's happening.
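
To make the point about whole-token matching and the OR default concrete, here is a toy sketch in plain Java. It is not Lucene code and does not reproduce StandardAnalyzer (which, among other things, also drops stop words); the class and method names are made up for illustration. It assumes an analyzer that only lowercases and splits on non-alphanumeric characters, with no stemming.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model only: hypothetical names, not part of Lucene.
class TokenMatchSketch {

    // Assumed analyzer behavior: lowercase, split on non-alphanumerics, no stemming.
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // OR semantics: a hit if ANY query token equals a whole document token.
    static boolean matchesAny(String query, String doc) {
        Set<String> docTokens = tokens(doc);
        for (String q : tokens(query)) {
            if (docTokens.contains(q)) {
                return true;
            }
        }
        return false;
    }

    // AND semantics: a hit only if EVERY query token is a whole document token.
    static boolean matchesAll(String query, String doc) {
        return tokens(doc).containsAll(tokens(query));
    }

    public static void main(String[] args) {
        String doc = "Kim opened a new channel";
        // "chan" is not a whole token of the document, so no hit.
        System.out.println(matchesAny("Chan", doc));        // false
        // With OR, the hit comes from "kim", not from "chan" inside "channel".
        System.out.println(matchesAny("W. Chan Kim", doc)); // true
        // Requiring all terms removes the hit.
        System.out.println(matchesAll("W. Chan Kim", doc)); // false
    }
}
```

Under these assumptions, a document containing "channel" only shows up for "W. Chan Kim" because some other term matched, which is why looking at query.toString() is the first thing to check.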

Best
Erick

On 7/19/07, Ard Schrijvers <a.schrijvers@hippo.nl> wrote:
>
> Hello Askar,
>
> Which analyzer are you using for indexing and searching? If you use an
> analyzer that uses stemming, you might see that "change", "changing",
> "changed", "chan", etc. all get reduced to the same word "chan".
>
> In Luke you can test with plugins that show you what tokens are created
> by your analyzer.
>
> Regards Ard
>
>
> >
> >
> > Hey Guys,
> >
> > I just checked my Lucene results. It shows a document containing the
> > word "change" when I am searching for "Chan", and it considers that
> > a hit. Is there a way to stop this and show just the exact word match?
> >
> > I started using Lucene yesterday, so I am fairly new !
> >
> > thanks
> > AZ
> >
> > On 7/18/07, Erick Erickson <erickerickson@gmail.com> wrote:
> > >
> > > Are you sure that the hit wasn't on "w" or "kim"? The
> > > default for searching is OR...
> > >
> > > I recommend that you get a copy of Luke (google lucene luke)
> > > which allows you to examine your index as well as see how
> > > queries parse using various analyzers. It's an invaluable tool...
> > >
> > > Best
> > > Erick
> > >
> > > On 7/18/07, Askar Zaidi <askar.zaidi@gmail.com> wrote:
> > > >
> > > > Hey folks,
> > > >
> > > > I am a new Lucene user. I used the following after indexing:
> > > >
> > > > search(searcher, "W. Chan Kim");
> > > >
> > > > Lucene showed me hits for documents where the word "channel"
> > > > existed. Notice that "Chan" is a part of "Channel". How do I
> > > > stop this?
> > > >
> > > > I am keen to find the exact word.
> > > >
> > > > I used the following, before the search method:
> > > >
> > > > IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), true);
> > > > writer.addDocument(createDocument(item, words));
> > > > writer.optimize();
> > > > writer.close();
> > > > searcher = new IndexSearcher(indexPath);
> > > >
> > > > thanks !
> > > >
> > > > AZ
> > > >
> > >
> >
>
