Date: Tue, 20 Dec 2005 12:13:53 -0800 (PST)
From: Chris Hostetter
To: java-user@lucene.apache.org
Subject: Re: all stop words in exact phrase get 0 hits

: Hoss - the main caveat with this approach is that a user could select
: a different field from any of the designated text boxes.  Putting
: "subject:foo" in the name text box for example.

sorry ... yea, i left out a step...

	String words = QueryParser.escape(rawUserInputString);

Generally speaking this approach has been working well for me in my
current development as a bare bones method for converting raw user word
lists into pieces of queries that i then compose into a larger, more
complex query structure.

the only hitch i've ever really run into is when people try to search for
things like:
	42" tv
...because query parser doesn't have any way to escape that quote, so it
barfs thinking it's an unterminated phrase query.  but even if there was a
way to escape it, i don't think i'd want to, because as is it lets the
(more common) case of quoted phrases do what (i) expect and become phrase
queries.

I still haven't completely decided what i want to do about the:
	42" tv
case ... but i'm very close to preprocessing the input and saying: "even
number of quotes, leave them alone; odd number of quotes, yank them all
out."

: I think this is one of my biggest issues with QueryParser these days,
: it exposes too much power.  It's like giving a SQL text box to your
: database application (read-only, of course).  There are generally
: fields not made for user querying.

agreed.
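For what it's worth, the even/odd quote rule for the 42" tv case could be sketched roughly like this. It is only a sketch under assumptions: QuoteBalancer and balanceQuotes are made-up names (nothing in Lucene), and it assumes the check runs on the raw input string before anything is handed to QueryParser.

```java
// Hypothetical helper, NOT part of Lucene: applies the "even number of
// quotes, leave them alone; odd number, yank them all out" rule to raw input.
final class QuoteBalancer {

    private QuoteBalancer() {}

    /**
     * Returns the input unchanged when it contains an even number of
     * double quotes; otherwise returns it with every double quote removed.
     */
    static String balanceQuotes(String raw) {
        int quotes = 0;
        for (int i = 0; i < raw.length(); i++) {
            if (raw.charAt(i) == '"') {
                quotes++;
            }
        }
        if (quotes % 2 == 0) {
            return raw; // balanced: quoted phrases stay phrase queries
        }
        return raw.replace("\"", ""); // unbalanced: drop all quotes
    }

    public static void main(String[] args) {
        System.out.println(balanceQuotes("42\" tv"));           // odd count
        System.out.println(balanceQuotes("\"flat panel\" tv")); // even count
    }
}
```

The obvious cost is that an input like `42" "flat panel"` loses its phrase as well, which is the trade-off of taking the easy way out.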
If i don't take the easy way out on that 42" case, i think I may spend
some time on a generic "UserQueryParser" that doesn't support any of the
fancy syntax of the existing QueryParser, and just knows how to build
TermQuery and phrase queries on a specified field -- with options for how
to deal with an odd number of quotes (looking at the white space around
the quotes for guidance in interpreting it)

: > In addition to Erik's suggestion, something that i find frequently
: > makes sense is to use the QueryParser in each case where you are
: > dealing with a discrete "input string" -- and then combine them into a
: > BooleanQuery (along with any other programmatically generated
: > TermQueries you might want for other reasons)
: >
: > For example, it looks like your application is targeted at searching
: > email, right?  let's assume your application allows the user to
: > specify the following inputs
: >
: >   * a text box for search words
: >   * a text box for people's names/email
: >   * pulldowns for picking date ranges
: >
: > ...and you want to look in the subject, body, and attachment fields
: > for the input words (with different boosts) and in all of the header
: > fields for the name/email input.  you can build a Date object out of
: > the pulldowns yourself, and then do something like this with the two
: > strings...
: >
: >    QueryParser subP = new QueryParser("subject", SomeAnalyzer);
: >    QueryParser bodyP = new QueryParser("body", SomeAnalyzer);
: >    QueryParser attachP = new QueryParser("attachments", SomeAnalyzer);
: >    QueryParser headP = new QueryParser("headers",
: >                                        AnalyzerThatKnowsAboutEmailAddress);
: >    Query s = subP.parse(words);
: >    s.setBoost(2.0f);    // setBoost takes a float
: >    Query b = bodyP.parse(words);
: >    Query a = attachP.parse(words);
: >    a.setBoost(0.5f);
: >    Query h = headP.parse(person);
: >    h.setBoost(1.5f);
: >    BooleanQuery bq = new BooleanQuery();
: >    // add(query, required, prohibited): false,false = optional clause
: >    bq.add(s, false, false);
: >    bq.add(b, false, false);
: >    bq.add(a, false, false);
: >    bq.add(h, false, false);
: >
: > ...and then execute your search using a filter built from the date
: > range pulldowns.
: >
: >
: > -Hoss



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org