From: Erik Hatcher
Subject: Re: Open-ended range queries
Date: Thu, 10 Jun 2004 20:15:21 -0400
To: "Lucene Users List"

On Jun 10, 2004, at 4:54 PM, Scott ganyo wrote:
> It looks to me like Revision 1.18 broke it.

It seems this could be it:

revision 1.18
date: 2002/06/25 00:05:31; author: briangoetz; state: Exp; lines: +62 -33
Support for new range query syntax. The delimiter is " TO ", but is
optional for backward compatibility with previous syntax. If the range
arguments match the format supported by
DateFormat.getDateInstance(DateFormat.SHORT), then they will be
converted into the appropriate date strings a la DateField. Added
Field.Keyword "constructor" for Date-valued arguments. Optimized
DateField.timeToString function.

But geez.... June 2002.... and no one has complained since?

Given that this change is so old, I'm not sure what the right course of
action is. There are lots more Lucene users now than there were then.
Would adding NULL back be what folks want? What about simply an
asterisk to denote open-endedness? [* TO term] or [term TO *]

For completeness, here is the diff:

% cvs diff -u -r 1.17 -r 1.18 QueryParser.jj
Index: QueryParser.jj
===================================================================
RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/queryParser/QueryParser.jj,v
retrieving revision 1.17
retrieving revision 1.18
diff -u -r1.17 -r1.18
--- QueryParser.jj 20 May 2002 15:45:43 -0000 1.17
+++ QueryParser.jj 25 Jun 2002 00:05:31 -0000 1.18
@@ -65,8 +65,11 @@
 import java.util.Vector;
 import java.io.*;
+import java.text.*;
+import java.util.*;
 import org.apache.lucene.index.Term;
 import org.apache.lucene.analysis.*;
+import org.apache.lucene.document.*;
 import org.apache.lucene.search.*;

 /**
@@ -218,35 +221,30 @@
   private Query getRangeQuery(String field,
                               Analyzer analyzer,
-                              String queryText,
+                              String part1,
+                              String part2,
                               boolean inclusive)
   {
-    // Use the analyzer to get all the tokens. There should be 1 or 2.
-    TokenStream source = analyzer.tokenStream(field,
-                                              new StringReader(queryText));
-    Term[] terms = new Term[2];
-    org.apache.lucene.analysis.Token t;
+    boolean isDate = false, isNumber = false;

-    for (int i = 0; i < 2; i++)
-    {
-      try
-      {
-        t = source.next();
-      }
-      catch (IOException e)
-      {
-        t = null;
-      }
-      if (t != null)
-      {
-        String text = t.termText();
-        if (!text.equalsIgnoreCase("NULL"))
-        {
-          terms[i] = new Term(field, text);
-        }
-      }
+    try {
+      DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT);
+      df.setLenient(true);
+      Date d1 = df.parse(part1);
+      Date d2 = df.parse(part2);
+      part1 = DateField.dateToString(d1);
+      part2 = DateField.dateToString(d2);
+      isDate = true;
     }
-    return new RangeQuery(terms[0], terms[1], inclusive);
+    catch (Exception e) { }
+
+    if (!isDate) {
+      // @@@ Add number support
+    }
+
+    return new RangeQuery(new Term(field, part1),
+                          new Term(field, part2),
+                          inclusive);
   }

   public static void main(String[] args) throws Exception {
@@ -282,7 +280,7 @@
 | <#_WHITESPACE: ( " " | "\t" ) >
 }

- SKIP : {
+ SKIP : {
   <<_WHITESPACE>>
 }
@@ -303,14 +301,28 @@
 | (<_TERM_CHAR>)* "*" >
 | (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
-|
-|
+| : RangeIn
+| : RangeEx
 }

 TOKEN : {
 )+ ( "." (<_NUM_CHAR>)+ )?
 > : DEFAULT
 }

+ TOKEN : {
+
+| : DEFAULT
+|
+|
+}
+
+ TOKEN : {
+
+| : DEFAULT
+|
+|
+}
+
 // * Query ::= ( Clause )*
 // * Clause ::= ["+", "-"] [ ":"] ( | "(" Query ")" )
@@ -387,7 +399,7 @@
 Query Term(String field) : {
-  Token term, boost=null, slop=null;
+  Token term, boost=null, slop=null, goop1, goop2;
   boolean prefix = false;
   boolean wildcard = false;
   boolean fuzzy = false;
@@ -415,12 +427,29 @@
        else
          q = getFieldQuery(field, analyzer, term.image);
      }
-     | ( term= { rangein=true; } | term= )
+     | ( ( goop1=|goop1= )
+         [ ] ( goop2=|goop2= )
+       )
+       [ boost= ]
+       {
+         if (goop1.kind == RANGEIN_QUOTED)
+           goop1.image = goop1.image.substring(1, goop1.image.length()-1);
+         if (goop2.kind == RANGEIN_QUOTED)
+           goop2.image = goop2.image.substring(1, goop2.image.length()-1);
+
+         q = getRangeQuery(field, analyzer, goop1.image, goop2.image, true);
+       }
+     | ( ( goop1=|goop1= )
+         [ ] ( goop2=|goop2= )
+       )
        [ boost= ]
        {
-         q = getRangeQuery(field, analyzer,
-                           term.image.substring(1, term.image.length()-1),
-                           rangein);
+         if (goop1.kind == RANGEEX_QUOTED)
+           goop1.image = goop1.image.substring(1, goop1.image.length()-1);
+         if (goop2.kind == RANGEEX_QUOTED)
+           goop2.image = goop2.image.substring(1, goop2.image.length()-1);
+
+         q = getRangeQuery(field, analyzer, goop1.image, goop2.image, false);
        }
      | term= [ slop= ]
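
To make the open-ended idea concrete, here is a rough sketch of how a range endpoint could be mapped to "no bound". It is plain Java with no Lucene dependency, and the class OpenEndedRange and its methods endpoint and describe are hypothetical names made up for illustration, not anything in QueryParser. It treats both the proposed "*" and the old "NULL" keyword as open; whether both should be accepted is exactly the open question above.

```java
/**
 * Hypothetical helper, not part of Lucene: decides whether a range
 * endpoint typed by the user means "no bound on this side".
 */
final class OpenEndedRange {
    private OpenEndedRange() {}

    /**
     * Returns null for an open endpoint ("*" or the old "NULL" keyword,
     * case-insensitive); otherwise returns the trimmed text.
     */
    static String endpoint(String text) {
        if (text == null) {
            return null;
        }
        String trimmed = text.trim();
        if (trimmed.equals("*") || trimmed.equalsIgnoreCase("NULL")) {
            return null;
        }
        return trimmed;
    }

    /** Summarizes a range so the effect of open endpoints is easy to see. */
    static String describe(String part1, String part2) {
        String lower = endpoint(part1);
        String upper = endpoint(part2);
        if (lower == null && upper == null) {
            return "unbounded";
        }
        if (lower == null) {
            return "up to " + upper;
        }
        if (upper == null) {
            return "from " + lower;
        }
        return lower + " to " + upper;
    }
}
```

In getRangeQuery, a null result from endpoint would presumably mean passing a null Term for that side, as the 1.17 code did when it left terms[i] unset for NULL, instead of always constructing new Term(field, part1) and new Term(field, part2).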