lucene-java-user mailing list archives

From Bill Chesky <Bill.Che...@learninga-z.com>
Subject RE: Analyzer on query question
Date Fri, 03 Aug 2012 13:19:52 GMT
Thanks Simon,

Unfortunately, I'm using Lucene 3.0.1 and CharTermAttribute doesn't seem to have been introduced
until 3.1.0.  Similarly my version of Lucene does not have a BooleanQuery.addClause(BooleanClause)
method.  Maybe you meant BooleanQuery.add(BooleanClause).

In any case, I'm not familiar with most of what you're doing there; it seems very low
level.  I've never had to use a TokenStream to build a query before, and I'm not really sure
what is going on there.  Also, I don't know what PositionIncrementAttribute is or how it would
be used to create a PhraseQuery.  The way I currently create PhraseQuerys is straightforward
and intuitive.  E.g., to search for the phrase "foo bar" I build the query like this:

    PhraseQuery phraseQuery = new PhraseQuery();
    phraseQuery.add(new Term("title", "foo"));
    phraseQuery.add(new Term("title", "bar"));

Is there really no easier way to associate the correct analyzer with these types of queries?

Bill
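
On the question above of what a position increment is and how it could feed a phrase query, here is a minimal sketch. It is a toy model in plain Java, not Lucene code: the class PositionIncrementSketch, the Tok type, the stop-word list, and both methods are hypothetical names made up for illustration. The idea it tries to show is that each token carries the gap between itself and the previously emitted token, so a removed stop word still leaves a hole, and summing the gaps gives the absolute position of each term in the phrase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of position increments; NOT Lucene code. All names here are hypothetical. */
public class PositionIncrementSketch {

    /** A token plus the gap (in positions) between it and the previously emitted token. */
    static final class Tok {
        final String text;
        final int positionIncrement;

        Tok(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Hypothetical stop words, standing in for a real analyzer's stop filter. */
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("the", "of", "a"));

    /** Lowercases, splits on whitespace, drops stop words, and records the gap each one leaves. */
    static List<Tok> analyze(String text) {
        List<Tok> out = new ArrayList<Tok>();
        int increment = 1;
        for (String word : text.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            if (STOP_WORDS.contains(word)) {
                increment++;      // the removed word still occupies a position
                continue;
            }
            out.add(new Tok(word, increment));
            increment = 1;
        }
        return out;
    }

    /** Turns per-token increments into absolute, zero-based phrase positions. */
    static int[] phrasePositions(List<Tok> tokens) {
        int[] positions = new int[tokens.size()];
        int position = -1;        // so that a first increment of 1 lands on position 0
        for (int i = 0; i < tokens.size(); i++) {
            position += tokens.get(i).positionIncrement;
            positions[i] = position;
        }
        return positions;
    }

    public static void main(String[] args) {
        List<Tok> tokens = analyze("lord of the rings");
        int[] positions = phrasePositions(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            // prints "lord @ 0" and then "rings @ 3"
            System.out.println(tokens.get(i).text + " @ " + positions[i]);
        }
    }
}
```

In Lucene itself, PhraseQuery also has an add(Term, int) overload that takes an explicit position, which is presumably where positions computed this way would be passed; the sketch only models the arithmetic.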

-----Original Message-----
From: Simon Willnauer [mailto:simon.willnauer@gmail.com] 
Sent: Friday, August 03, 2012 3:43 AM
To: java-user@lucene.apache.org; Bill Chesky
Subject: Re: Analyzer on query question

On Thu, Aug 2, 2012 at 11:09 PM, Bill Chesky
<Bill.Chesky@learninga-z.com> wrote:
> Hi,
>
> I understand that generally speaking you should use the same analyzer on querying as
> was used on indexing.  In my code I am using the SnowballAnalyzer on index creation.  However,
> on the query side I am building up a complex BooleanQuery from other BooleanQuerys and/or
> PhraseQuerys on several fields.  None of these require specifying an analyzer anywhere.  This
> is causing some odd results, I think, because a different analyzer (or no analyzer?) is being
> used for the query.
>
> Question: how do I build my boolean and phrase queries using the SnowballAnalyzer?
>
> One thing I did that seemed to kind of work was to build my complex query normally then
> build a snowball-analyzed query using a QueryParser instantiated with a SnowballAnalyzer.
> To do this, I simply pass the string value of the complex query to the QueryParser.parse()
> method to get the new query.  Something like this:
>
>     // build a complex query from other BooleanQuerys and PhraseQuerys
>     BooleanQuery fullQuery = buildComplexQuery();
>     QueryParser parser = new QueryParser(Version.LUCENE_30, "title",
>         new SnowballAnalyzer(Version.LUCENE_30, "English"));
>     Query snowballAnalyzedQuery = parser.parse(fullQuery.toString());
>
>     TopScoreDocCollector collector = TopScoreDocCollector.create(10000, true);
>     indexSearcher.search(snowballAnalyzedQuery, collector);

you can just use the analyzer directly like this:
Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");

TokenStream stream = analyzer.tokenStream("title",
    new StringReader(fullQuery.toString()));
CharTermAttribute termAttr = stream.addAttribute(CharTermAttribute.class);
stream.reset();
BooleanQuery q = new BooleanQuery();
while(stream.incrementToken()) {
  q.addClause(new BooleanClause(
      new TermQuery(new Term("title", termAttr.toString())), Occur.MUST));
}

you also have access to the token positions if you want to create
phrase queries etc. just add a PositionIncrementAttribute like this:
PositionIncrementAttribute posAttr =
    stream.addAttribute(PositionIncrementAttribute.class);

Please double-check the code; it's straight off the top of my head.

simon
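
To make the underlying problem in this thread concrete, here is a small sketch of why a term built by hand can miss against an index that was written through a stemming analyzer. It is a toy model in plain Java, not Lucene code: the class AnalyzerMismatchSketch and its normalize and index methods are hypothetical, and normalize is only a crude stand-in for a real stemmer such as Snowball. The assumption is simply that the index stores the normalized form of each word, so a lookup only succeeds if the query term has been through the same normalization.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of index-time vs. query-time analysis; NOT Lucene code. All names are hypothetical. */
public class AnalyzerMismatchSketch {

    /** Crude stand-in for a stemmer: lowercases and strips a trailing "ing" or "s". */
    static String normalize(String word) {
        String w = word.toLowerCase();
        if (w.length() > 5 && w.endsWith("ing")) {
            return w.substring(0, w.length() - 3);
        }
        if (w.length() > 3 && w.endsWith("s")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    /** "Indexes" a title by storing only the normalized form of each word. */
    static Set<String> index(String title) {
        Set<String> terms = new HashSet<String>();
        for (String word : title.trim().split("\\s+")) {
            terms.add(normalize(word));
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> indexed = index("Reading Books");   // stores "read" and "book"

        // A term built straight from user input is compared verbatim and misses.
        System.out.println(indexed.contains("Books"));             // prints false
        // The same input run through the index-time normalization matches.
        System.out.println(indexed.contains(normalize("Books")));  // prints true
    }
}
```

The same reasoning would apply to each term placed into a hand-built BooleanQuery or PhraseQuery: the term text is matched as given, so it needs to be in the form the index-time analyzer produced.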

>
> Like I said, this seems to kind of work but it doesn't feel right.  Does this make sense?
> Is there a better way?
>
> thanks in advance,
>
> Bill
