From: Dan Markham
Date: Thu, 23 Jun 2011 01:37:09 -0700
To: lucy-dev@incubator.apache.org
Subject: Re: [lucy-dev] RangeQuery and multi-value fields
In-Reply-To: <4E02A2B9.2020503@peknet.com>
References: <4E002F53.3080507@peknet.com> <20110621181520.GA22200@rectangular.com> <20110623015117.GA6816@rectangular.com> <4E02A2B9.2020503@peknet.com>

The only thing that is a bit different is that we encode (in base 62) the
number of x's as the last character, mainly so the terms are shorter.

    my @foo = encode_trie(100000);
    print Dumper(\@foo);

The output would look like this:

    $VAR1 = [
              '1a',            ## 1xxxxxxxxxx
              '129',           ## 12xxxxxxxxx
              '1208',
              '12007',
              '120026',
              '1200205',
              '12002014',
              '120020113',
              '1200201122',
              '12002011201',
              '120020112010'   ## the exact match for base 3 @ 100000
            ];

So you really only use encode_trie(int) to build the terms to index, and
query_trie(minint, maxint) to build the search terms at query time.

A few things I'm pretty sure need some love:

1. encode_trie() and query_trie() are hard-coded for base 3.
2. If the length of your trie gets longer than 62 chars, the cute
   disk-saving trick above will surely not work.

enjoy,
-Dan

On Jun 22, 2011, at 7:19 PM, Peter Karman wrote:

> Marvin Humphrey wrote on 6/22/11 8:51 PM:
>>> On Tue, Jun 21, 2011 at 12:42:43AM -0500, Peter Karman wrote:
>>>> I want to override the behavior of the RangeQuery class to support my pseudo
>>>> multi-value fields, which I achieve by concatenating values with the \x03 byte.
>>
>> OK, there's another option which has suddenly become more attractive. :) My
>> Eventful colleague Dan Markham has submitted a trie implementation that can be
>> used for generating numeric ranges:
>>
>> https://issues.apache.org/jira/browse/LUCY-159
>>
>> It is to some degree based on the algorithm used by Lucene's NumericRangeQuery:
>>
>> http://s.apache.org/QOx
>>
>
> Thanks to both you and Dan for this contribution!
>
> I'll have a look at the code and the docs and see if it feels workable for my
> particular need. In any case, I think it's great to see contributions like
> these, expanding the Lucy ecosystem.
>
>
> --
> Peter Karman . http://peknet.com/ . peter@peknet.com
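
To make the term layout described above concrete, here is a minimal Python sketch that reproduces the sample output for 100000. It is not the Perl code attached to LUCY-159. It assumes the base-62 alphabet runs 0-9, a-z, A-Z (only 'a' = 10 is confirmed by the sample output) and that numbers are not zero-padded. The helper to_base and the base parameter are hypothetical names used only for illustration.

```python
import string

# Assumed base-62 alphabet: digits, then lowercase, then uppercase.
# Only 'a' == 10 is confirmed by the sample output in the email.
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def to_base(value, base):
    """Return the digits of a non-negative integer in `base` (2..10) as a string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if not 2 <= base <= 10:
        raise ValueError("this sketch only handles bases 2 through 10")
    if value == 0:
        return "0"
    digits = []
    while value:
        value, remainder = divmod(value, base)
        digits.append(str(remainder))
    return "".join(reversed(digits))


def encode_trie(value, base=3):
    """Build one term per prefix of the base-`base` digits of `value`.

    Each term is the prefix followed by a single base-62 character that
    counts how many trailing digits were dropped (the x's in the email).
    """
    digits = to_base(value, base)
    terms = []
    for prefix_len in range(1, len(digits) + 1):
        wildcards = len(digits) - prefix_len
        if wildcards >= len(BASE62):
            # One base-62 character cannot count this many dropped digits.
            raise ValueError("number too long for a one-character suffix")
        terms.append(digits[:prefix_len] + BASE62[wildcards])
    return terms


if __name__ == "__main__":
    for term in encode_trie(100000):
        print(term)
```

The ValueError on long inputs mirrors the second caveat in the list: a single base-62 character can only count up to 61 dropped digits, so longer numbers would need a different suffix scheme.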