lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From jm <jmugur...@gmail.com>
Subject Re: analizer not doing the same thing at index and query time?
Date Tue, 12 Jul 2011 11:31:00 GMT
*But I think I am! I build the QueryParser with the same analyzer I used at
index time...
*
*
*
*Anyway, I'll try to build a self contained example independent of my app
code.
*

On Tue, Jul 12, 2011 at 1:01 PM, Ian Lea <ian.lea@gmail.com> wrote:

> If I've read your example correctly it appears that at indexing your
> analyzer is converting "bloom's" to "bloom" but not at search time.
> Which implies that you aren't using the same analyzer in both cases.
>
>
> --
> Ian.
>
>
> On Mon, Jul 11, 2011 at 4:19 PM, jm <jmuguruza@gmail.com> wrote:
> > *Hi,*
> > *
> > *
> > *My env is jdk1.6 and lucene3.3.*
> > *
> > *
> > *At index time I have this:*
> > *
> > *
> > *
> >        Directory directory = FSDirectory.open(new
> > File("d:\\temp\\lucene.index"));
> >        IndexWriter writer = new IndexWriter(directory, myAnalizer,
> > IndexWriter.MaxFieldLength.UNLIMITED);
> >        Document doc = new Document();
> >        doc.add((Fieldable) new Field("bbbb", "bloom's bird",
> Field.Store.NO,
> > Field.Index.ANALYZED));
> >         writer.addDocument(doc);
> >        // add another doc
> > **
> >        writer.close(); // 3
> > *
> > *
> >
> > *
> > *
> > And I know only 'bloom' and 'bird' are indexed (I verify with luke). My
> > analyzer removes all non-alphanumeric chars.
> > *
> > *
> >
> > *
> > *
> > At query time I do this:
> > *
> > *
> >
> > *
> > *
> >        QueryParser
> > **
> > qp
> > **
> > = new QueryParser(LUCENEVERSIONCOMPAT, FBODY,
> > **
> > myAnalizer
> > **
> > );
> > *
> > *
> >        printHitCountQP(directory, qp, "bbbb:(*bloom's*)");
> >        printHitCountQP(directory, qp, "bbbb:(*bloom)");
> >        printHitCountQP(directory, qp, "bbbb:(*bloom AND b*)");
> >
> >    protected static void printHitCountQP(Directory directory, QueryParser
> > qp, String searchString) throws IOException, ParseException {
> >        IndexSearcher searcher = new IndexSearcher(directory, true); //5
> >        Query query = qp.parse(searchString);
> >        int hitCount = searcher.search(query, 1).totalHits;
> >        searcher.close();
> >        System.out.println(searchString + " got " + hitCount + " Query is:
> "
> > + query.toString());
> >    }
> >
> > And I get this:
> >
> > bbbb:(*bloom's*) got 0 Query is: bbbb:*bloom's*
> > bbbb:(*bloom) got 2 Query is: bbbb:*bloom
> > bbbb:(*bloom AND b*) got 1 Query is: +bbbb:*bloom +bbbb:b*
> >
> > Queries 2 and 3 are ok, but I don't understand the first case, shouldn't
> > have it removed the ' as I am using the same analyzer than I did at index
> > time??
> >
> > thanks
> > *
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message