Date: Wed, 23 Oct 2002 10:41:59 -0700 (Pacific Daylight Time)
From: "Joshua O'Madadhain"
To: Lucene Users List
Subject: Re: definite matching

Mr. Toaster (or can I just call you "Stray"? ;> ):

Everything Peter (and Otis) said is true. A point of clarification that may forestall future confusion: the "+" and "-" terms correspond to "required" and "prohibited", respectively, in a BooleanQuery. (Calling it a BooleanQuery is a slight misnomer to my way of thinking: as Peter implied, the semantics of "-" aren't quite the same as "NOT", because you can't just search for "NOT foo".)
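To make the required/prohibited distinction concrete, here is a rough toy model in Python. It is not Lucene code and makes no claim about how Lucene is implemented: documents are just sets of words, the "score" is simply a count of matched unflagged terms, and the names parse, search, and docs are made up for illustration. It assumes that a query with no "+" term must match at least one unflagged term, which is why "-world" on its own finds nothing.

```python
# Toy model of required (+), prohibited (-), and unflagged terms.
# Not Lucene code: documents are plain sets of words, and the "score"
# is just the number of unflagged query terms a document contains.

def parse(query):
    """Split a query like '+hello -world jam' into three term sets."""
    required, prohibited, optional = set(), set(), set()
    for token in query.split():
        if token.startswith("+"):
            required.add(token[1:])
        elif token.startswith("-"):
            prohibited.add(token[1:])
        else:
            optional.add(token)
    return required, prohibited, optional


def search(query, docs):
    """Return the names of matching documents, best score first."""
    required, prohibited, optional = parse(query)
    hits = []
    for name, words in docs.items():
        if not required <= words:      # a "+" term is missing
            continue
        if prohibited & words:         # a "-" term is present
            continue
        score = len(optional & words)  # unflagged terms only raise the score
        if not required and score == 0:
            continue                   # nothing positive matched, e.g. "-world" alone
        hits.append((score, name))
    # Highest score first; ties broken by name so the order is stable.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [name for _, name in hits]


# Hypothetical three-document collection.
docs = {
    "a": {"hello", "world"},
    "b": {"hello", "there"},
    "c": {"goodbye", "world"},
}
```

With this model, search("+hello -world", docs) keeps only document "b", and search("hello world", docs) returns all three documents with "a" ranked first because it contains both terms.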
So to summarize:

* "+": documents *without* this term/phrase are *not* returned.
* "-": documents *with* this term/phrase are *not* returned.
* default (no flag): documents with this term/phrase get a higher score
  (all other things being equal) than documents which lack it.
* terms/phrases cannot be specified to be both required and prohibited.

Personally I'd stick with "+" and "-" rather than using the Boolean operators (AND, OR, NOT), simply because I seem to recall that there are some constructs involving Boolean connectives that don't behave quite the way that you might expect. (Anyone know what happens if you search for "jam OR NOT(toast AND bread)"?)

Good luck--

Joshua

On Wed, 23 Oct 2002, Peter Carlson wrote:

> Anyway if you want to only find documents with a given term or set of
> terms put a + in front of EACH term you are searching for
>
> +hello +world
>
> You can also use the AND construct if you are using the QueryParser.
>
> hello AND world
>
> this gets translated into
> +hello +world
>
> The other option is a minus sign (-), which returns
> documents that don't have that term
>
> +hello -world
>
> will find all documents with the term hello and not world.
> Note: You cannot use the - option alone.
>
> Also you can use NOT in the same way
>
> hello NOT world
>
> results in
>
> hello -world
>
>
> Finally the OR operator (the current default operator between terms)
>
> hello world
>
> or equivalently
>
> hello OR world
>
> will find all documents with hello or world in the field.

jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.