lucene-java-user mailing list archives

From Felipe Carvalho <felipe.carva...@gmail.com>
Subject Re: Don't get results wheras Luke does...
Date Tue, 06 Dec 2011 12:19:36 GMT
I had a similar problem. In my case the culprit was the "-" character,
which is a special character in Lucene's query syntax (the prohibit
operator). You can try indexing the data in lowercase and using
WhitespaceAnalyzer for both indexing and searching on that field. Another
option is to replace "-" with "_" both when indexing and when searching.
This way, your data won't be indexed with any special characters. One
lesson I've learned is to reserve uppercase for operators only. Data that
will be searched should always be lowercase.
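
To make the "same normalization on both sides" advice concrete, here is a
minimal sketch in plain Java. FieldNormalizer and normalize() are
hypothetical names used only for illustration; they are not part of
Lucene. The assumption is that you call the helper on the field value
before adding it to the document and on the user's input before building
the query.

```java
import java.util.Locale;

/** Hypothetical helper, not a Lucene class: one normalization for index and query time. */
final class FieldNormalizer {
    private FieldNormalizer() {}

    /** Lowercases the value and replaces '-' with '_' so no query-syntax character remains. */
    static String normalize(String raw) {
        // Locale.ROOT keeps the lowercasing independent of the JVM's default locale.
        return raw.toLowerCase(Locale.ROOT).replace('-', '_');
    }

    public static void main(String[] args) {
        String indexed = normalize("NB-ARC"); // what would be stored in the index
        String prefix = normalize("NB-AR");   // user input with the trailing '*' removed
        System.out.println(indexed + " starts with " + prefix + ": "
                + indexed.startsWith(prefix));
    }
}
```

Because the indexed value and the search prefix pass through the same
function, they cannot drift apart in case or punctuation.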

On Tue, Dec 6, 2011 at 10:01 AM, Ian Lea <ian.lea@gmail.com> wrote:

> Try QueryParser.setLowercaseExpandedTerms(false).  QueryParser will
> lowercase terms in prefix etc queries by default.
>
> If that doesn't work, and it was my problem, I'd just lowercase
> everything, everywhere.  Life's too short to mess around with case
> issues.
>
>
> --
> Ian.
>
>
> On Tue, Dec 6, 2011 at 8:12 AM, ejblom <ejblom@gmail.com> wrote:
> > Dear Lucene-users,
> >
> > I am a bit puzzled over this. I have a query which should return some
> > documents. If I use Luke, I obtain hits using the
> > org.apache.lucene.analysis.KeywordAnalyzer.
> >
> > This is the query:
> >
> > domain:NB-AR*
> >
> > (I have data indexed using:
> >
> > doc.add(new Field("domain", "NB-ARC", Field.Store.YES,
> > Field.Index.NOT_ANALYZED));  )
> >
> > Explain structure reveals that Luke is employing a PrefixQuery. Ok, now I
> > want to obtain these results using my Java application:
> >
> > // Using the QueryParser, let it decide what to do with the query:
> >
> > Query q = new QueryParser(Version.LUCENE_35, "contents",
> > analyzer).parse("domain:NB-AR*");
> > System.out.println("Type of query: " + q.getClass().getSimpleName());
> >
> > // Type of query: PrefixQuery so that's ok
> >
> > int hitsPerPage = 1000;
> > TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage,
> > true);
> > searcher.search(q, collector);
> > ScoreDoc[] hits = collector.topDocs().scoreDocs;
> > System.out.println("Found " + hits.length + " hits.");
> >
> > // Unfortunately 0 hits.
> >
> > // Move on and specify a Term and PrefixQuery explicitly:
> >
> > Term term = new Term("domain", "NB-AR");
> > q = new PrefixQuery(term);
> > collector = TopScoreDocCollector.create(hitsPerPage, true);
> > searcher.search(q, collector);
> > hits = collector.topDocs().scoreDocs;
> >
> > // Found with prefix 441 hits.
> >
> >
> >
> > I tried lowercasing the search query, re-indexing, and making the field
> > Field.Index.ANALYZED, but nothing worked...
> >
> > I have a feeling it is something very trivial, but I just can't figure it
> > out...
> >
> > Anyone?
> >
> > EJ Blom
> >
> >
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/Don-t-get-results-wheras-Luke-does-tp3563736p3563736.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >

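The gap reported in the original question (0 hits from the parsed query,
441 from the hand-built PrefixQuery) is consistent with Ian's point that
QueryParser lowercases prefix terms by default while a NOT_ANALYZED field
keeps its original case. The sketch below is a plain-Java simulation, not
Lucene code: it assumes the term dictionary behaves like a sorted set of
exact strings, and PrefixLookupDemo, termsWithPrefix, and the sample terms
other than "NB-ARC" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

/** Simulation only: a sorted set of strings stands in for an index's term dictionary. */
final class PrefixLookupDemo {
    private PrefixLookupDemo() {}

    /** Returns every term that starts with the given prefix (case-sensitive). */
    static List<String> termsWithPrefix(TreeSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        // In sorted order, all terms sharing a prefix are contiguous and begin
        // at the first term >= prefix, so we can stop at the first mismatch.
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Terms stored verbatim, as with Field.Index.NOT_ANALYZED.
        TreeSet<String> terms = new TreeSet<>(Arrays.asList("LRR", "NB-ARC", "TIR"));

        String typed = "NB-AR";
        String lowercased = typed.toLowerCase(Locale.ROOT); // what a lowercasing parser would search for

        System.out.println("as typed:   " + termsWithPrefix(terms, typed));
        System.out.println("lowercased: " + termsWithPrefix(terms, lowercased));
    }
}
```

In this simulation the prefix as typed finds the stored term, while the
lowercased prefix finds nothing. That is why either turning off the
lowercasing or storing lowercase terms makes the two sides agree.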