To: java-user@lucene.apache.org, Mark Miller
Subject: Re: SpanQuery problem
Reply-To: darren@ontrenet.com
Date: Sun, 14 Sep 2008 21:52:50 -0400
Message-Id: <49942.1221443570@ontrenet.com>

Yes, return the document, but the problem is with SpanNearQuery: it does not return the
documents I expect. Sorry I did not explain it well.

Consider 2 documents, each with "word" fields.

Document 1
word: blue bird
word: blue car

Document 2
word: sky blue
word: sea blue

I want to search for 'blue' and ONLY return Document 1, as I already know that
the term 'blue' MUST appear at the front of the field word:

SpanNearQuery with a slop of 0 or 1 won't do this if Document 1 has other
fields like this.

Document 1 - IS NOT FOUND WITH SPAN NEAR 0 or 1
word: some blue
word: another blue
word: blue bird
word: blue car

Expanding the span slop to 3 will find Document 1 above this line. However, I
thought the slop meant within the field terms. It seems to refer to the list
of fields rather than terms. This is unexpected behavior to me. But I'm no
Lucene expert.

Thanks for any thoughts.
Darren

darren@ontrenet.com wrote:
> Thanks Paul. I will study your response more, as I don't fully understand
> it yet - specifically "You'll need to expand the prefix into indexed terms".
>
> But what I want to do is so simple I'm surprised it cannot be done.
>
> You are saying that I cannot find all fields across all documents that
> begin with a string or space bounded word? Consider 1 document with:
>
> word: blue car
> word: red car
> word: car door
> word: car wheel
>
> Using whitespace analyzer I simply want to query all fields in all documents
> where 'car' is at the very front of the field.
>
> word: car door
> word: car wheel
>
> This cannot be done? I don't want to retrieve all of them and prune the
> results myself because it will consume lots of resources.
>
> thanks so much!
>
> Darren
>
> On Sun Sep 14 16:36 , Paul Elschot sent:
> On Sunday 14 September 2008 19:36:38, Darren Govoni wrote:
>
>> Hi,
>> I am seeing odd behavior with SpanNearQuery.
>>
>> The problem is that with multiple fields, all fields beyond the first
>> one 'car' are not seen by the span.
>> I didn't think the span applied to
>> sets of the same field, but rather to terms within a given field.
>>
>> Document 1. 1 field (word)
>>
>> word: car
>> word: cars
>> word: cars wash
>> word: cars lot
>>
>>
>> SpanNearQuery with slop of 0. Wrapped by SpanFirstQuery with slop of
>> 1. Term query within is "word","cars*". No results found.
>>
>
> There is no SpanPrefixQuery for cars* in Lucene. You'll need to
> expand the prefix into indexed terms to create a SpanOrQuery
> yourself. This is fairly straightforward from PrefixQuery and
> SpanOrQuery.
> Alternatively, have a look at the surround query parser in contrib
> for a working example.
>
> Regards,
> Paul Elschot
>
>
>> If I remove the first field word: car, it works. Also, if I increase
>> the slop, it will return results from only the first amount of fields
>> in the slop rather than terms within the field value.
>>
>> Is what I am seeing the correct behavior? Doesn't seem like it.
>>
>> What I am trying to do is span _within_ EACH field and match phrases
>> that begin with "cars*". Shouldn't be too hard to do I thought.
>>
>> Darren
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
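
A note on why the slop seems to count fields rather than terms: when several
values are added under one field name, Lucene indexes them as a single token
stream, and positions keep counting from one value to the next (the default
position increment gap is 0). Span positions are therefore measured from the
start of the first value, not from the start of each value. The sketch below
is a plain-Java model of that behavior, not Lucene code. The class name, the
method names, and the "_start_" marker token are all hypothetical. It also
illustrates one possible workaround, assuming you control indexing: prepend a
marker token to every value and require the search term to follow it directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model only: this is NOT Lucene code. It mimics how token positions
// keep counting across several values of one field name (assumed gap of 0).
final class FieldPositionsSketch {

    // Hypothetical marker token prepended to every value at index time.
    static final String START = "_start_";

    // Flatten all values of one field into a single whitespace-tokenized stream.
    static List<String> flatten(List<String> values, boolean addStartMarker) {
        List<String> tokens = new ArrayList<>();
        for (String value : values) {
            if (addStartMarker) {
                tokens.add(START);
            }
            tokens.addAll(Arrays.asList(value.trim().split("\\s+")));
        }
        return tokens;
    }

    // Positions (0-based) at which the term occurs in the flattened stream.
    static List<Integer> positionsOf(List<String> tokens, String term) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).equals(term)) {
                positions.add(i);
            }
        }
        return positions;
    }

    // "Span first"-style check: does any occurrence end at or before `end`?
    // A one-term span at position p is taken to end at p + 1.
    static boolean matchesWithinFirst(List<String> tokens, String term, int end) {
        for (int p : positionsOf(tokens, term)) {
            if (p + 1 <= end) {
                return true;
            }
        }
        return false;
    }

    // Marker-based check: is the term directly preceded by the start marker?
    static boolean startsSomeValue(List<String> markedTokens, String term) {
        for (int p : positionsOf(markedTokens, term)) {
            if (p > 0 && markedTokens.get(p - 1).equals(START)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList("some blue", "another blue", "blue bird", "blue car");
        List<String> doc2 = Arrays.asList("sky blue", "sea blue");

        // Positions are global across values: prints [1, 3, 4, 6]
        System.out.println(positionsOf(flatten(doc1, false), "blue"));

        // A tight limit misses Document 1 even though two values start with "blue".
        System.out.println(matchesWithinFirst(flatten(doc1, false), "blue", 1)); // false

        // Widening the limit matches, but it also matches Document 2.
        System.out.println(matchesWithinFirst(flatten(doc1, false), "blue", 2)); // true
        System.out.println(matchesWithinFirst(flatten(doc2, false), "blue", 2)); // true

        // With a marker in front of every value, only Document 1 qualifies.
        System.out.println(startsSomeValue(flatten(doc1, true), "blue")); // true
        System.out.println(startsSomeValue(flatten(doc2, true), "blue")); // false
    }
}
```

In Lucene terms, the marker check would roughly correspond to an ordered
SpanNearQuery with slop 0 over two SpanTermQuery clauses (the marker, then the
term). That mapping is an assumption about how one might wire it up and is not
exercised by the model above.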