Date: Wed, 19 Dec 2007 08:19:53 +0200
From: "Doron Cohen"
To: java-user@lucene.apache.org
Subject: Re: Lucene multifield query problem

Hi Rakesh,

It just occurred to me that your code has

    String searchCriteria = "Indoor*";

Assuming StandardAnalyzer was used at indexing time, all text words were
lowercased. Now, QueryParser by default does not lowercase wildcard
queries. You can however instruct it to do so by calling:

    myQueryParser.setLowercaseExpandedTerms(true)

(The only reason the first query returned anything was that MUST part,
which, as already discussed, was also sufficient for qualifying a doc.)

So, you should be able to get the "correct" results with something like
this:

    QueryParser qp = new QueryParser("i_title", new StandardAnalyzer());
    qp.setLowercaseExpandedTerms(true);
    Query q = qp.parse("+(i_title:Indoor* i_description:Indoor*) -i_published:false +i_topicsClasses.id:1_1_*_*");

Also, note that the printed query for the 2nd attempt, where you used
TermQueries, is somewhat misleading. You may read it as if the query
parser generated a wildcard query, but actually it is not a wildcard
query, just a TermQuery, expecting to find the exact token 'Indoor*',
asterisk character included, in the i_title field of the index.

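To make the case mismatch concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class and method names (WildcardCaseSketch, indexTerms, expandPrefix, exactTerm) are made up for illustration, and the "analyzer" is only assumed to split on non-alphanumeric characters and lowercase each token. It compares a prefix expansion with and without lowercasing, and an exact-term lookup of the literal string "Indoor*".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Toy model only (hypothetical names, not the Lucene API).
class WildcardCaseSketch {

    // Stand-in for index-time analysis: split on non-alphanumerics and
    // lowercase every token, then keep the terms in sorted order.
    static TreeSet<String> indexTerms(String text) {
        TreeSet<String> terms = new TreeSet<>();
        for (String token : text.split("[^\\p{Alnum}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Expand a pattern of the form "prefix*" against the sorted terms.
    // Assumes the pattern ends with a single trailing '*'.
    static List<String> expandPrefix(TreeSet<String> terms, String pattern, boolean lowercase) {
        String prefix = pattern.substring(0, pattern.length() - 1);
        if (lowercase) {
            prefix = prefix.toLowerCase(Locale.ROOT);
        }
        List<String> matches = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    // Stand-in for a term query: the token must exist exactly as given.
    static boolean exactTerm(TreeSet<String> terms, String token) {
        return terms.contains(token);
    }

    public static void main(String[] args) {
        TreeSet<String> terms = indexTerms("Indoor plants and indoors lighting for Outdoor use");
        System.out.println(expandPrefix(terms, "Indoor*", true));  // [indoor, indoors]
        System.out.println(expandPrefix(terms, "Indoor*", false)); // []
        System.out.println(exactTerm(terms, "Indoor*"));           // false
    }
}
```

In this model the capitalized prefix finds nothing because every indexed term is lowercase, and the exact-term lookup finds nothing because no token containing an asterisk was ever indexed.
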
My additional recommendations to you are:

(1) Also print the rewritten form of the query
    (q.rewrite(is.getIndexReader())). There, for a wildcard query part,
    you will see how it was expanded with the terms that match the
    wildcard expression. You will be able to see the lowercasing issue
    this way.

(2) If this still doesn't work for you, please post here a tiny, simple
    standalone program that creates an index, searches it, and
    demonstrates the problem.

Regards,
Doron

On Dec 19, 2007 6:43 AM, Rakesh Shete wrote:

> Hi Doron, Steve. Your suggestions make sense but don't give me the
> desired results. Here is the code I use to generate the query:
>
>     String searchCriteria = "Indoor*";
>     QueryParser queryparser = new QueryParser("i_title", new StandardAnalyzer());
>     Query q1 = queryparser.parse(searchCriteria);
>     queryparser = new QueryParser("i_description", new StandardAnalyzer());
>     Query q2 = queryparser.parse(searchCriteria);
>     queryparser = new QueryParser("i_published", new StandardAnalyzer());
>     Query q3 = queryparser.parse("false");
>     queryparser = new QueryParser("i_topicsClasses.id", new StandardAnalyzer());
>     Query topicQuery = queryparser.parse("1_1_*_*");
>     BooleanQuery booleanQuery = new BooleanQuery();
>     booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>     booleanQuery.add(q2, BooleanClause.Occur.SHOULD);
>     booleanQuery.add(q3, BooleanClause.Occur.MUST_NOT);
>     booleanQuery.add(topicQuery, BooleanClause.Occur.MUST);
>
> This produces the following query:
>
>     i_title:indoor* i_description:indoor* -i_published:false +i_topicsClasses.id:1_1_*_*
>
> This gives me extra unwanted results from "+i_topicsClasses.id:1_1_*_*".
>
> I modified the above code in the following manner:
>
>     Term t = new Term("i_title", searchCriteria);
>     TermQuery termQuery = new TermQuery(t);
>     Term t1 = new Term("i_description", searchCriteria);
>     TermQuery termQuery1 = new TermQuery(t1);
>     BooleanQuery booleanQuery1 = new BooleanQuery();
>     booleanQuery1.add(termQuery, BooleanClause.Occur.SHOULD);
>     booleanQuery1.add(termQuery1, BooleanClause.Occur.SHOULD);
>     BooleanQuery booleanQuery = new BooleanQuery();
>     booleanQuery.add(booleanQuery1, BooleanClause.Occur.MUST);
>     booleanQuery.add(q3, BooleanClause.Occur.MUST_NOT);
>     booleanQuery.add(topicQuery, BooleanClause.Occur.MUST);
>
> This produces the following query:
>
>     +(i_title:Indoor* i_description:Indoor*) -i_published:false +i_topicsClasses.id:1_1_*_*
>
> But no results are fetched :(
>
> Any hints on whether I am doing anything wrong?
>
> --Regards,
> Rakesh S
>
> > Date: Tue, 18 Dec 2007 23:09:09 +0200
> > From: cdoronc@gmail.com
> > To: java-user@lucene.apache.org
> > Subject: Re: Lucene multifield query problem
> >
> > Hi Rakesh,
> >
> > Perhaps the confusion comes from the asymmetry
> > between +X and -X. I.e., for the query:
> >     A B -C +D
> > one might think that, similar to how -C only disqualifies docs
> > containing C (but does not qualify docs not containing C), +D
> > also only disqualifies docs not containing D. But this is
> > inaccurate, because +D, in addition to disqualifying
> > docs not containing D, also qualifies docs containing D.
> >
> > The modified query that Steven suggested:
> >     +(A B) -C +D
> > removes this asymmetry, because specifying +(A B) means
> > that D is no longer sufficient to qualify a doc.
> >
> > Hope this helps (otherwise let this reply be forever disqualified :-) )
> >
> > Doron
> >
> > On Dec 18, 2007 9:28 PM, Steven A Rowe wrote:
> >
> > > Hi Rakesh,
> > >
> > > This doesn't look like a user-generated query. Have you considered
> > > building the Query via the API instead of using QueryParser?
> > >
> > > With QueryParser, you should get the results you want with syntax like:
> > >
> > >     +(i_title:indoor* OR i_description:indoor*) -i_published:false +i_topicsClasses.id:1_1_*_*
> > >
> > > Have you tried this yet?
> > >
> > > Steve
> > >
> > > On 12/18/2007 at 1:58 PM, Rakesh Shete wrote:
> > > >
> > > > Thanks for the suggestion Steve. My problem is with getting
> > > > the correct results. Let me put the query in words:
> > > >
> > > > Fetch all documents such that the search string "indoor*" is
> > > > either part of the 'i_title' field or 'i_description' field,
> > > > eliminate if not published (-i_published:false) but should
> > > > have topic id of the form "1_1_*_*" (i_topicsClasses.id:1_1_*_*)
> > > >
> > > > i_title:indoor* i_description:indoor* -i_published:false
> > > > +i_topicsClasses.id:1_1_*_* returns me extra results which
> > > > should not be fetched.
> > > >
> > > > -- Regards,
> > > > Rakesh Shete
> > > >
> > > > > Subject: RE: Lucene multifield query problem
> > > > > Date: Tue, 18 Dec 2007 13:26:24 -0500
> > > > > From: sarowe@syr.edu
> > > > > To: java-user@lucene.apache.org
> > > > >
> > > > > Hi Rakesh,
> > > > >
> > > > > Set the default QueryParser operator to AND (default default
> > > > > operator :) is OR):
> > > > >
> > > > > ryParser/QueryParser.html#setDefaultOperator(org.apache.lucene.queryParser.QueryParser.Operator)>
> > > > >
> > > > > Steve
> > > > >
> > > > > On 12/18/2007 at 1:22 PM, Rakesh Shete wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I am facing a problem with the following multifield query:
> > > > > >
> > > > > > i_title:indoor* i_description:indoor* -i_published:false
> > > > > > +i_topicsClasses.id:1_1_*_*
> > > > > >
> > > > > > The above query returns me even results which should not be
> > > > > > there. Ideally I would like the query results as:
> > > > > >
> > > > > > (i_title:indoor* i_description:indoor* -i_published:false)
> > > > > > AND (i_topicsClasses.id:1_1_*_*)
> > > > > >
> > > > > > i.e. the intersection of the first part and second part.
> > > > > >
> > > > > > But what is happening currently is that I get a union of the
> > > > > > first part and second part, i.e., whatever results are
> > > > > > returned by "i_title:indoor* i_description:indoor*
> > > > > > -i_published:false" are combined (union) with results
> > > > > > returned by "+i_topicsClasses.id:1_1_*_*".
> > > > > >
> > > > > > How do I write a query that returns me results which are an
> > > > > > intersection of the above 2 parts?
> > > > > >
> > > > > > --Regards,
> > > > > > Rakesh S
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
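
The +X / -X asymmetry discussed in this thread (A B -C +D versus +(A B) -C +D) can be sketched with a toy matcher in plain Java. This is an illustration only, not Lucene code: the class BooleanClauseSketch and its methods are hypothetical, documents are modeled as bare sets of terms, and scoring is ignored. The assumed rule is that a prohibited term rejects the document, every required term must be present, and optional terms decide the match only when there are no required terms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Toy model only (hypothetical names, not the Lucene API).
class BooleanClauseSketch {

    // Convenience helper: a document is just the set of terms it contains.
    static Set<String> doc(String... terms) {
        return new HashSet<String>(Arrays.asList(terms));
    }

    // One flat boolean query with required, optional and prohibited terms.
    static boolean matches(Set<String> doc, Set<String> must,
                           Set<String> should, Set<String> mustNot) {
        for (String t : mustNot) {
            if (doc.contains(t)) {
                return false; // a prohibited term only disqualifies
            }
        }
        if (!doc.containsAll(must)) {
            return false; // a missing required term disqualifies
        }
        if (!must.isEmpty()) {
            return true; // a satisfied required clause also qualifies
        }
        for (String t : should) {
            if (doc.contains(t)) {
                return true; // with no required clause, one optional term is needed
            }
        }
        return false;
    }

    // A B -C +D : D alone is enough to qualify a document.
    static boolean flat(Set<String> d) {
        return matches(d, doc("D"), doc("A", "B"), doc("C"));
    }

    // +(A B) -C +D : the group (A or B) is itself required.
    static boolean grouped(Set<String> d) {
        Set<String> none = Collections.<String>emptySet();
        boolean group = matches(d, none, doc("A", "B"), none);
        return group && matches(d, doc("D"), none, doc("C"));
    }

    public static void main(String[] args) {
        System.out.println(flat(doc("D")));         // true
        System.out.println(grouped(doc("D")));      // false
        System.out.println(grouped(doc("A", "D"))); // true
    }
}
```

A document containing only D matches the flat form but not the grouped form, which corresponds to the extra results reported for the first query.
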