lucene-java-user mailing list archives

From Rakesh Shete <rakesh_sh...@hotmail.com>
Subject RE: Lucene multifield query problem
Date Wed, 19 Dec 2007 12:33:31 GMT

Hey Doron,

It worked!!!! It worked!!!!

Thanks a lot for the help along with some valuable information about the API for creating
queries. Will keep this in mind going ahead.

--Regards,
Rakesh Shete



> Date: Wed, 19 Dec 2007 08:19:53 +0200
> From: cdoronc@gmail.com
> To: java-user@lucene.apache.org
> Subject: Re: Lucene multifield query problem
> 
> Hi Rakesh,
> 
> It just occurred to me that your code has
>      String searchCriteria = "Indoor*";
> 
> Assuming StandardAnalyzer was used at indexing time, all text words were
> lowercased. Now, QueryParser by default does not lowercase wildcard
> queries. You can however instruct it to do so by calling:
>      myQueryParser.setLowercaseExpandedTerms(true)
> 
> (The only reason the first query returned anything was that MUST part,
> which, as already discussed, was also sufficient for qualifying a doc.)
> 
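The case mismatch described above can be sketched without Lucene at all. What follows is a toy model in plain Java, not Lucene code: it assumes the index holds only lowercased terms, as a lowercasing analyzer would produce, and expands a prefix against them. The class name PrefixExpansionDemo, the method expand, and the sample terms are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

class PrefixExpansionDemo {
    /** Returns every indexed term that starts with the given prefix, in sorted order. */
    static List<String> expand(SortedSet<String> indexedTerms, String prefix) {
        List<String> out = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, so start at the
        // first term >= prefix and stop at the first term that no longer matches.
        for (String term : indexedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical index contents: every term was lowercased at indexing time.
        SortedSet<String> terms =
                new TreeSet<>(Arrays.asList("indoor", "indoors", "outdoor", "pool"));

        // Capitalized prefix: no indexed term starts with "Indoor".
        System.out.println(expand(terms, "Indoor"));
        // Lowercased prefix: expands to the terms that actually exist.
        System.out.println(expand(terms, "Indoor".toLowerCase(Locale.ROOT)));
    }
}
```

Under these assumptions the capitalized prefix expands to an empty list while the lowercased one expands to the matching terms, which is the kind of difference a lowercasing step on the query side is meant to remove.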
> So, you should be able to get the "correct" results with something like
> this:
>      QueryParser qp = new QueryParser("i_title", new StandardAnalyzer());
>      qp.setLowercaseExpandedTerms(true);
>      Query q = qp.parse("+(i_title:Indoor* i_description:Indoor*) "
>          + "-i_published:false +i_topicsClasses.id:1_1_*_*");
> 
> Also, note that the printed query for the 2nd attempt, where you used
> TermQueries, is somewhat misleading. You may read it as if the query
> parser generated a wildcard query, but it is actually not a wildcard
> query, just a TermQuery, expecting to find this exact token
> 'i_title:Indoor*', asterisk character included, in the index.
> 
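The difference between an exact-term lookup and a wildcard match can be illustrated with another plain-Java toy model. This is not how Lucene implements either query type; it only assumes that an exact-term query must equal an indexed term character for character, while a trailing-asterisk pattern matches any term with that prefix. The class ExactTermVsWildcardDemo, both method names, and the sample terms are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ExactTermVsWildcardDemo {
    /** Exact-term lookup: the text must equal an indexed term character for character. */
    static boolean exactTermMatches(Set<String> indexedTerms, String text) {
        return indexedTerms.contains(text);
    }

    /** Trailing-asterisk pattern: true if any indexed term starts with the text before the '*'. */
    static boolean trailingWildcardMatches(Set<String> indexedTerms, String pattern) {
        if (!pattern.endsWith("*")) {
            return indexedTerms.contains(pattern);
        }
        String prefix = pattern.substring(0, pattern.length() - 1);
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical index contents; no term contains a literal asterisk.
        Set<String> terms = new HashSet<>(Arrays.asList("indoor", "indoors", "pool"));

        // As an exact term, "indoor*" is looked up with the asterisk as an ordinary character.
        System.out.println(exactTermMatches(terms, "indoor*"));
        // As a wildcard pattern, the asterisk stands for any suffix.
        System.out.println(trailingWildcardMatches(terms, "indoor*"));
    }
}
```

In this sketch the exact lookup finds nothing even with matching case, because no indexed term contains a literal asterisk, while the wildcard reading of the same string does match.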
> My additional recommendations to you are: (1) also print the rewritten form
> of the query (q.rewrite(is.getIndexReader())). There, for a wildcard query
> part, you will see how it was expanded with the terms that match the
> wildcard expression, so you will be able to see the lowercasing issue.
> (2) If this still doesn't work for you, please post here a tiny, simple
> standalone program that creates an index, searches it, and demonstrates
> the problem.
> 
> Regards,
> Doron
> 
> 
> On Dec 19, 2007 6:43 AM, Rakesh Shete <rakesh_shete@hotmail.com> wrote:
> 
> >
> > Hi Doron, Steve. Your suggestions make sense but don't give me the desired
> > results. Here is the code I use to generate the query:
> >
> > String searchCriteria = "Indoor*";
> > QueryParser queryparser = new QueryParser("i_title", new StandardAnalyzer());
> > Query q1 = queryparser.parse(searchCriteria);
> > queryparser = new QueryParser("i_description", new StandardAnalyzer());
> > Query q2 = queryparser.parse(searchCriteria);
> > queryparser = new QueryParser("i_published", new StandardAnalyzer());
> > Query q3 = queryparser.parse("false");
> > queryparser = new QueryParser("i_topicsClasses.id", new StandardAnalyzer());
> > Query topicQuery = queryparser.parse("1_1_*_*");
> > BooleanQuery booleanQuery = new BooleanQuery();
> > booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
> > booleanQuery.add(q2, BooleanClause.Occur.SHOULD);
> > booleanQuery.add(q3, BooleanClause.Occur.MUST_NOT);
> > booleanQuery.add(topicQuery, BooleanClause.Occur.MUST);
> >
> >
> > This produces the following query:
> > i_title:indoor* i_description:indoor* -i_published:false
> > +i_topicsClasses.id:1_1_*_*
> > This gives me extra unwanted results from "+i_topicsClasses.id:1_1_*_*".
> >
> > I modified the above code in the following manner:
> >
> > Term t = new Term("i_title", searchCriteria);
> > TermQuery termQuery = new TermQuery(t);
> > Term t1 = new Term("i_description", searchCriteria);
> > TermQuery termQuery1 = new TermQuery(t1);
> > BooleanQuery booleanQuery1 = new BooleanQuery();
> > booleanQuery1.add(termQuery, BooleanClause.Occur.SHOULD);
> > booleanQuery1.add(termQuery1, BooleanClause.Occur.SHOULD);
> > BooleanQuery booleanQuery = new BooleanQuery();
> > booleanQuery.add(booleanQuery1, BooleanClause.Occur.MUST);
> > booleanQuery.add(q3, BooleanClause.Occur.MUST_NOT);
> > booleanQuery.add(topicQuery, BooleanClause.Occur.MUST);
> >
> > This produces the following query:
> > +(i_title:Indoor* i_description:Indoor*) -i_published:false
> > +i_topicsClasses.id:1_1_*_*
> > But no results are fetched :(
> >
> > Any hints on whether I am doing anything wrong?
> >
> > --Regards,
> > Rakesh S
> >
> >
> >
> > > Date: Tue, 18 Dec 2007 23:09:09 +0200
> > > From: cdoronc@gmail.com
> > > To: java-user@lucene.apache.org
> > > Subject: Re: Lucene multifield query problem
> > >
> > > Hi Rakesh,
> > >
> > > Perhaps the confusion comes from the asymmetry
> > > between +X and -X.   I.e., for the query:
> > >       A B -C +D
> > > one might think that, similar to how -C only disqualifies docs
> > > containing C (but not qualifying docs not containing C), also
> > > +D only disqualifies docs not containing D. But this is
> > > inaccurate, because +D, in addition to disqualifying
> > > docs not containing D, also qualifies docs containing D.
> > >
> > > The modified query that Steven suggested:
> > >       +(A B) -C +D
> > > removes this asymmetry, because specifying +(A B) means
> > > that D alone is no longer sufficient to qualify a doc.
> > >
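The asymmetry between +X and -X described above can be checked with a small stand-alone model. This is a sketch in plain Java, not Lucene's implementation: it assumes that every MUST clause has to match, no MUST_NOT clause may match, and SHOULD clauses are only required when there is no MUST clause at all. The class BooleanClauseDemo, its nested types, and the helper methods flatQuery and groupedQuery are hypothetical names; documents are modelled as plain sets of terms.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BooleanClauseDemo {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** A clause wraps either a single term or a nested group of clauses. */
    static final class Clause {
        final Occur occur;
        final String term;        // set for a term clause, otherwise null
        final List<Clause> group; // set for a nested group, otherwise null

        Clause(Occur occur, String term) {
            this.occur = occur; this.term = term; this.group = null;
        }

        Clause(Occur occur, List<Clause> group) {
            this.occur = occur; this.term = null; this.group = group;
        }

        boolean hits(Set<String> doc) {
            return term != null ? doc.contains(term) : matches(group, doc);
        }
    }

    /** Assumed semantics: all MUST hit, no MUST_NOT hits, and without any MUST at least one SHOULD hits. */
    static boolean matches(List<Clause> clauses, Set<String> doc) {
        boolean sawMust = false;
        boolean shouldHit = false;
        for (Clause c : clauses) {
            boolean hit = c.hits(doc);
            if (c.occur == Occur.MUST) {
                if (!hit) return false;
                sawMust = true;
            } else if (c.occur == Occur.MUST_NOT) {
                if (hit) return false;
            } else {
                shouldHit |= hit;
            }
        }
        return sawMust || shouldHit;
    }

    /** Models the query: A B -C +D */
    static List<Clause> flatQuery() {
        return Arrays.asList(
                new Clause(Occur.SHOULD, "A"),
                new Clause(Occur.SHOULD, "B"),
                new Clause(Occur.MUST_NOT, "C"),
                new Clause(Occur.MUST, "D"));
    }

    /** Models the query: +(A B) -C +D */
    static List<Clause> groupedQuery() {
        return Arrays.asList(
                new Clause(Occur.MUST, Arrays.asList(
                        new Clause(Occur.SHOULD, "A"),
                        new Clause(Occur.SHOULD, "B"))),
                new Clause(Occur.MUST_NOT, "C"),
                new Clause(Occur.MUST, "D"));
    }

    public static void main(String[] args) {
        Set<String> onlyD = new HashSet<>(Arrays.asList("D"));
        Set<String> aAndD = new HashSet<>(Arrays.asList("A", "D"));

        System.out.println("A B -C +D    on {D}:   " + matches(flatQuery(), onlyD));
        System.out.println("+(A B) -C +D on {D}:   " + matches(groupedQuery(), onlyD));
        System.out.println("+(A B) -C +D on {A,D}: " + matches(groupedQuery(), aAndD));
    }
}
```

In this model a document containing only D satisfies the flat query but not the grouped one, which corresponds to the extra results reported earlier in the thread.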
> > > Hope this helps (otherwise let this reply be forever disqualified :-) )
> > > Doron
> > >
> > > On Dec 18, 2007 9:28 PM, Steven A Rowe <sarowe@syr.edu> wrote:
> > >
> > > > Hi Rakesh,
> > > >
> > > > This doesn't look like a user-generated query.  Have you considered
> > > > building the Query via the API instead of using QueryParser?
> > > >
> > > > With QueryParser, you should get the results you want with syntax like:
> > > >
> > > > +(i_title:indoor* OR i_description:indoor*) -i_published:false
> > > > +i_topicsClasses.id:1_1_*_*
> > > >
> > > > Have you tried this yet?
> > > >
> > > > Steve
> > > >
> > > > On 12/18/2007 at 1:58 PM, Rakesh Shete wrote:
> > > > >
> > > > > > Thanks for the suggestion, Steve. My problem is with getting
> > > > > > the correct results. Let me put the query into words:
> > > > >
> > > > > Fetch all documents such that the search string "indoor*" is
> > > > > either part of the 'i_title' field or 'i_description' field,
> > > > > eliminate if not published (-i_published:false) but should
> > > > > have topic id of the form "1_1_*_*" (i_topicsClasses.id:1_1_*_*)
> > > > >
> > > > > i_title:indoor* i_description:indoor* -i_published:false
> > > > > +i_topicsClasses.id:1_1_*_* returns me extra results which
> > > > > should not be fetched.
> > > > >
> > > > > -- Regards,
> > > > > Rakesh Shete
> > > > >
> > > > > > Subject: RE: Lucene multifield query problem
> > > > > > Date: Tue, 18 Dec 2007 13:26:24 -0500
> > > > > > From: sarowe@syr.edu
> > > > > > To: java-user@lucene.apache.org
> > > > > >
> > > > > > Hi Rakesh,
> > > > > >
> > > > > > > Set the default QueryParser operator to AND (the default default
> > > > > > > operator :) is OR):
> > > > > >
> > > > > >
> > > > > > > <http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/queryParser/QueryParser.html#setDefaultOperator(org.apache.lucene.queryParser.QueryParser.Operator)>
> > > > > >
> > > > > > Steve
> > > > > >
> > > > > > On 12/18/2007 at 1:22 PM, Rakesh Shete wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > > I am facing a problem with the following multifield query:
> > > > > > >
> > > > > > > i_title:indoor* i_description:indoor* -i_published:false
> > > > > > > +i_topicsClasses.id:1_1_*_*
> > > > > > >
> > > > > > > > The above query returns even results which should not be
> > > > > > > > there. Ideally I would like the query results as:
> > > > > > >
> > > > > > > (i_title:indoor* i_description:indoor* -i_published:false)
> > > > > > > AND (i_topicsClasses.id:1_1_*_*)
> > > > > > >
> > > > > > > i.e. The intersection of the first part and second part.
> > > > > > >
> > > > > > > > But what is happening currently is that I get a union of the
> > > > > > > > first part and second part, i.e., whatever results are
> > > > > > > returned by "i_title:indoor* i_description:indoor*
> > > > > > > -i_published:false" are combined (union) with results
> > > > > > > returned by "+i_topicsClasses.id:1_1_*_*".
> > > > > > >
> > > > > > > > How do I write a query that returns results which are an
> > > > > > > > intersection of the above 2 parts?
> > > > > > >
> > > > > > > --Regards,
> > > > > > > Rakesh S
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > ---------------------------------------------------------------------
> > > > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > >
> > > >
> >
