lucene-java-user mailing list archives

From Bill Chesky <Bill.Che...@learninga-z.com>
Subject RE: Analyzer on query question
Date Fri, 03 Aug 2012 16:57:41 GMT
Ian,

I gave this method a try, at least the way I understood your suggestion. E.g. to search for
the phrase "cells combine" I built up a string like:

title:"cells combine" description:"cells combine" text:"cells combine"

then I passed that to the queryParser.parse() method (where queryParser is an instance of
QueryParser constructed using SnowballAnalyzer) and added the result as a MUST clause in my
final BooleanQuery.

When I print the resulting query out as a string I get:

+(title:"cell combin" description:"cell combin" keywords:"cell combin")

So it looks like the SnowballAnalyzer is doing some stemming for me.  But this is the exact
same result I'd get doing it the way I described in my original email.  I just built the unanalyzed
string on my own rather than using the various query classes like PhraseQuery, etc.  

So I don't see the advantage to doing it this way over the original method.  I just don't
know if the original way I described is wrong or will give me bad results.

thanks for the help,

Bill
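
[Editor's sketch, not part of the original message.] The stemmed output above ("cell combin") hints at why queries built by hand from raw Term objects can behave oddly: they bypass the analyzer, so an unstemmed term such as "cells" is looked up against an index that only holds "cell". The Lucene-free sketch below illustrates that mismatch under stated assumptions. The class name, `toyStem`, `analyze`, and `allTermsIndexed` are hypothetical helpers invented for illustration, and `toyStem` is a crude stand-in for stemming, not the Snowball algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalyzerMismatchSketch {

    // Hypothetical stand-in for a stemming analyzer: lowercases the token and
    // strips one trailing "s" or "e". This is NOT the real Snowball stemmer.
    static String toyStem(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        if (t.length() > 3 && (t.endsWith("s") || t.endsWith("e"))) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }

    // Splits on whitespace and stems each token, mimicking what an analyzer
    // does to text at index time and (ideally) at query time.
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<String>();
        for (String tok : text.trim().split("\\s+")) {
            out.add(toyStem(tok));
        }
        return out;
    }

    // True only if every query term exists in the set of indexed terms.
    static boolean allTermsIndexed(Set<String> indexTerms, List<String> queryTerms) {
        return indexTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        // Index side: the document text goes through the analyzer.
        Set<String> indexTerms = new HashSet<String>(analyze("Cells combine"));

        // Query built by hand from raw terms, with no analysis applied.
        List<String> rawTerms = Arrays.asList("cells", "combine");

        // Query terms passed through the same analyzer as the index.
        List<String> analyzedTerms = analyze("cells combine");

        System.out.println("raw terms match: " + allTermsIndexed(indexTerms, rawTerms));
        System.out.println("analyzed terms match: " + allTermsIndexed(indexTerms, analyzedTerms));
    }
}
```

In this toy setup the raw terms miss and the analyzed terms hit, which is the same effect the thread describes: whichever way the query is assembled, its terms need to pass through the index-time analyzer before they are compared against the index.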

-----Original Message-----
From: Ian Lea [mailto:ian.lea@gmail.com] 
Sent: Friday, August 03, 2012 9:32 AM
To: java-user@lucene.apache.org
Subject: Re: Analyzer on query question

You can add parsed queries to a BooleanQuery.  Would that help in this case?

SnowballAnalyzer sba = whatever();
QueryParser qp = new QueryParser(..., sba);
Query q1 = qp.parse("some snowball string");
Query q2 = qp.parse("some other snowball string");

BooleanQuery bq = new BooleanQuery();
bq.add(q1, ...);
bq.add(q2, ...);
bq.add(loads of other stuff);


--
ian.


On Fri, Aug 3, 2012 at 2:19 PM, Bill Chesky <Bill.Chesky@learninga-z.com> wrote:
> Thanks Simon,
>
> Unfortunately, I'm using Lucene 3.0.1 and CharTermAttribute doesn't seem to have been
> introduced until 3.1.0.  Similarly my version of Lucene does not have a BooleanQuery.addClause(BooleanClause)
> method.  Maybe you meant BooleanQuery.add(BooleanClause).
>
> In any case, most of what you're doing there, I'm just not familiar with.  Seems very
> low level.  I've never had to use TokenStreams to build a query before and I'm not really
> sure what is going on there.  Also, I don't know what PositionIncrementAttribute is or how
> it would be used to create a PhraseQuery.   The way I'm currently creating PhraseQuerys is
> very straightforward and intuitive.  E.g. to search for the term "foo bar" I'd build the query
> like this:
>
>     PhraseQuery phraseQuery = new PhraseQuery();
>     phraseQuery.add(new Term("title", "foo"));
>     phraseQuery.add(new Term("title", "bar"));
>
> Is there really no easier way to associate the correct analyzer with these types of queries?
>
> Bill
>
> -----Original Message-----
> From: Simon Willnauer [mailto:simon.willnauer@gmail.com]
> Sent: Friday, August 03, 2012 3:43 AM
> To: java-user@lucene.apache.org; Bill Chesky
> Subject: Re: Analyzer on query question
>
> On Thu, Aug 2, 2012 at 11:09 PM, Bill Chesky
> <Bill.Chesky@learninga-z.com> wrote:
>> Hi,
>>
>> I understand that generally speaking you should use the same analyzer on querying
>> as was used on indexing.  In my code I am using the SnowballAnalyzer on index creation.  However,
>> on the query side I am building up a complex BooleanQuery from other BooleanQuerys and/or
>> PhraseQuerys on several fields.  None of these require specifying an analyzer anywhere.  This
>> is causing some odd results, I think, because a different analyzer (or no analyzer?) is being
>> used for the query.
>>
>> Question: how do I build my boolean and phrase queries using the SnowballAnalyzer?
>>
>> One thing I did that seemed to kind of work was to build my complex query normally
>> then build a snowball-analyzed query using a QueryParser instantiated with a SnowballAnalyzer.
>> To do this, I simply pass the string value of the complex query to the QueryParser.parse()
>> method to get the new query.  Something like this:
>>
>>     // build a complex query from other BooleanQuerys and PhraseQuerys
>>     BooleanQuery fullQuery = buildComplexQuery();
>>     QueryParser parser = new QueryParser(Version.LUCENE_30, "title",
>>         new SnowballAnalyzer(Version.LUCENE_30, "English"));
>>     Query snowballAnalyzedQuery = parser.parse(fullQuery.toString());
>>
>>     TopScoreDocCollector collector = TopScoreDocCollector.create(10000, true);
>>     indexSearcher.search(snowballAnalyzedQuery, collector);
>
> you can just use the analyzer directly like this:
> Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
>
> TokenStream stream = analyzer.tokenStream("title", new
> StringReader(fullQuery.toString()));
> CharTermAttribute termAttr = stream.addAttribute(CharTermAttribute.class);
> stream.reset();
> BooleanQuery q = new BooleanQuery();
> while(stream.incrementToken()) {
>   q.addClause(new BooleanClause(Occur.MUST, new Term("title",
> termAttr.toString())));
> }
>
> you also have access to the token positions if you want to create
> phrase queries etc. just add a PositionIncrementAttribute like this:
> PositionIncrementAttribute posAttr =
> stream.addAttribute(PositionIncrementAttribute.class);
>
> pls. doublecheck the code it's straight from the top of my head.
>
> simon
>
>>
>> Like I said, this seems to kind of work but it doesn't feel right.  Does this make
>> sense?  Is there a better way?
>>
>> thanks in advance,
>>
>> Bill
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



