Return-Path: Delivered-To: apmail-lucene-general-archive@www.apache.org Received: (qmail 91727 invoked from network); 30 Dec 2009 18:28:39 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 30 Dec 2009 18:28:39 -0000 Received: (qmail 96499 invoked by uid 500); 30 Dec 2009 18:28:38 -0000 Delivered-To: apmail-lucene-general-archive@lucene.apache.org Received: (qmail 96425 invoked by uid 500); 30 Dec 2009 18:28:37 -0000 Mailing-List: contact general-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: general@lucene.apache.org Delivered-To: mailing list general@lucene.apache.org Received: (qmail 96415 invoked by uid 99); 30 Dec 2009 18:28:37 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 30 Dec 2009 18:28:37 +0000 X-ASF-Spam-Status: No, hits=1.2 required=10.0 tests=SPF_NEUTRAL X-Spam-Check-By: apache.org Received-SPF: neutral (nike.apache.org: local policy) Received: from [85.25.71.29] (HELO mail.troja.net) (85.25.71.29) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 30 Dec 2009 18:28:27 +0000 Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.troja.net (Postfix) with ESMTP id 67A4A45F3A3 for ; Wed, 30 Dec 2009 19:28:07 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at mail.troja.net Received: from mail.troja.net ([127.0.0.1]) by localhost (megaira.troja.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id biBF2QHpgi+0 for ; Wed, 30 Dec 2009 19:27:58 +0100 (CET) Received: from VEGA (dslb-088-065-111-032.pools.arcor-ip.net [88.65.111.32]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.troja.net (Postfix) with ESMTPSA id 0892A45F3A1 for ; Wed, 30 Dec 2009 19:27:57 +0100 (CET) From: "Uwe Schindler" To: References: <1e33aedb0912282125k2f6dc673u98584f8ea24854c3@mail.gmail.com> <1e33aedb0912282349y48628cb8q7ddf9e24d4ad1d77@mail.gmail.com> <24ED15CB-2E08-40E8-BBF7-7ABA758AEABF@apache.org> <1e33aedb0912290854je7e6734m4f8d2d04acd44fba@mail.gmail.com> <20091229183211.GA8740@rectangular.com> <1e33aedb0912291114y75010ac5mf6d3813196346d6e@mail.gmail.com> <20091229202947.GA9132@rectangular.com> <20091229235522.GA9830@rectangular.com> <20091230001324.GB9830@rectangular.com> <9ac0c6aa0912300401x31faf3b2t72deda9df6ba83d5@mail.gmail.com> Subject: RE: [spatial] Cartesian "Tiers" nomenclature Date: Wed, 30 Dec 2009 19:27:57 +0100 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Mailer: Microsoft Office Outlook 11 In-Reply-To: <9ac0c6aa0912300401x31faf3b2t72deda9df6ba83d5@mail.gmail.com> Thread-Index: AcqJR+FaWxoSvgs8Q66QOWu7FuB3JwANH/AQ X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579 X-Virus-Checked: Checked by ClamAV on apache.org Hi Mike, > Right, NRQ is able to translate any requested range into the union > (OR) of brackets (from the trie) created during indexing. >=20 > Can spatial do the same thing, just with 2D instead of 1D? Ie, > reconstruct any expressible shape (created at query time) as the union > of some number of grids/tiers, at finer & finer levels, created during > indexing? >=20 > Spatial, today, seems to do this, except it must also do "precise" > filtering on each matching doc, because some of the grids may contain > hits outside of the requested shape. The problem in 2D is, that the number of small cart tiers (rect tiles) = at the corners of the BBOX can get very large. For NRQ it's very limited on precisionStep, but here, the number of tiles can get very large for = highest precision (you can draw a picture to see this). So the number of terms = will get large, too, and so you can simply use NRQ in two dimensions to = achieve the same for non quadratic shaped bounding boxes. I tried to use spatial tiers for large and non quadratic bounding boxes (like all datasets = around Africa). NRQ performed much better. Spatial is good for things like = "find all McBurger around position xy with distance d", but not for bounding = box searches. Because of this the precision is limited for tiers, you filter the rest = of the docs. If the precision would be higher, the number of cart tiers explodes at the corner of the BBOX. > In fact, NRQ could also borrow from spatial's current approach -- ie, > create the union of some smallish number of coarse brackets. Some of > the brackets will fall entirely within the requested range, and so > require no further filtering, while others will fall part inside / > part outside of the requested range, and so will require precise > filtering. If NRQ did this, it should have much fewer postings to > enum, at the cost of having to do precise filtering on some of them > (and we'd have to somehow encode the orig value in the index). We thought about that for NRQ, too. It is still needed that you store = the full precision in the index. The idea is to use flexible indexing = attributes and/or payloads for this. You select the larger brackets and then go = through TermPositions and only enumerate docs, which payload/flexindex attribute match. Another idea was provided by Yonik who said, the position could = be used instead of payload for the missing bits. The posincr is currently = not used by NRQ (its 1 for full prec and 0 for lower prec, so all trie terms have pos=3D1). > Mike >=20 > On Tue, Dec 29, 2009 at 8:42 PM, Yonik Seeley > wrote: > > On Tue, Dec 29, 2009 at 7:13 PM, Marvin Humphrey > wrote: > >> ... but for this algorithm, different rasterization resolutions = need > not > >> proceed by powers-of-two. > > > > Indeed - one way to further generalize would be to use something = like > > Lucene's trie-based Numeric field, but with a square instead of a > > line. =A0That would allow to tweak the space/speed tradeoff. > > > > -Yonik > > http://www.lucidimagination.com > >