Date: Tue, 29 Dec 2009 15:55:22 -0800
To: general@lucene.apache.org
Subject: Re: [spatial] Cartesian "Tiers" nomenclature
Message-ID: <20091229235522.GA9830@rectangular.com>
From: Marvin Humphrey

On Tue, Dec 29, 2009 at 12:29:47PM -0800, Marvin Humphrey wrote:
> In Lucyland, we've adopted a tradition of recording "brainlogs"
> while browsing unfamiliar documentation as a form of UI testing -- I'll do one
> of those later.

OK, here's the brainlog I recorded while trying to figure out how spatial
contrib works.

[ BEGIN BRAINLOG ]

[ surf to contrib-spatial Javadocs for Lucene 3.0 ]

"Support for filtering based upon geographic location."  OK, I assume that
means we can match a tile and create a posting list for it, then AND the
resulting doc id set against other search results.

No sample code.  Looks like pure reference documentation rather than
tutorial-style documentation.

[ Click on Package org.apache.lucene.spatial.tier ]

Lots of red text -- guess they're serious about this not being a stable API.
Not clear what to click on next, I'll try DistanceQueryBuilder since Patrick
mentioned that.

[ Click on Class DistanceQueryBuilder ]

Documentation is sparse.  The only real meat is in the method documentation: a
single sentence plus parameter names.  I don't know GeoHashes.  I sort of think
I understand tierFieldPrefix. (?)  And what is "needPrecise"?

Hmm, maybe I really need to go find a tutorial somewhere.  Let's try the
wiki...

[ Go to Lucene Java wiki, search for "spatial", get two hits: SpatialLucene,
  SpatialSearch. ]

[ "SpatialLucene" wiki page ]

Hmm, there's a big warning which says "refers to content not yet
committed"... is that true?  Nope, LUCENE-1387 is closed, so this wiki page is
out of date.  Pff, whatever...

OK, I see links to Patrick's white paper.  Seems like it will probably be
heavier than I want.

[ "SpatialSearch" wiki page ]

Lots of GeoHash links, should be handy when I try to learn that.  And another
link to Patrick's whitepaper for the cartesian stuff.
I'll try the "full text" search the Wiki search recommends.

[ Search Lucene Java wiki for "spatial", this time as "full text" ]

Bleah, the Wiki search's performance sucked: "Results 1 - 6 of 6 results out
of about 1005 pages. (9.49 seconds)".  No interesting results.

I'm reluctant to look at a PDF white paper, it will probably be too technical.

[ google "lucene spatial tutorial" ]

Crap, first hit is an article on Hibernate/Lucene integration.  I just want
"how to use Lucene spatial".

Looks like Lucid's got a webinar from Grant, but I don't feel like sitting
down for an hour, I just want some frikkin' sample code.

[ google "lucene spatial" ]

Blog post at
http://sujitpal.blogspot.com/2008/02/spatial-search-with-lucene.html looks
promising...  Crap, it's not using the spatial contrib. :(

OK, there's stuff from Mike McCandless at ... naturally all the indentation is
stripped. :(  I'll look anyway...  OK, I think I basically follow that despite
the sucky formatting, but it's not easy.

Guess I'm really stuck reading a white paper. :(

[ Surf to Patrick's white paper at
  http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html ]

Jeez, that's a lot shorter than I expected.  Formatting's all messed up and I
see some Unicode replacement character glyphs, guess Patrick's not a "web
guy" ;) ...  But it's probably what I want in terms of content and depth.

[ read through first section ]

Inclusion, reductionism... sure sure, that's easy enough, it's just query
optimization like I deal with all the time.

But wait, no code samples.  Dammit. :(

[ read through "Boundary Box" ]

Jeeze, the "giant cross" approach to finding intersection of lat/lon was
actually part of a formal spatial package in the past?

[ read through "Cartesian Grid" ]

The formatting for this section is really messed up.

OK, *finally* I see that a cartesian tier is in fact a zoom level.  And even
Patrick uses the word "grid" extensively.
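
The tier-as-zoom-level idea can be sketched in a few lines.  This is a minimal
illustration in Python, not the spatial contrib API: the names `tile_id`,
`build_tile_index`, and `filter_by_tiles`, the plain equirectangular grid, and
the `tier:x:y` ID format are all assumptions made up for the example.  Each
tier splits the world into 2**tier cells per axis, every document is posted
under the tile that contains it, and a geographic constraint becomes a set of
tiles whose doc ids are ANDed against the rest of the query's results.

```python
from collections import defaultdict


def tile_id(lat, lon, tier):
    """Map a lat/lon to a tile ID on a grid with 2**tier cells per axis.

    Assumption: a plain equirectangular grid and a made-up "tier:x:y" ID
    format; the real contrib has its own projection and box ID scheme.
    """
    cells = 2 ** tier
    # Normalize to [0, 1], scale to the grid, and clamp so that lat=90 or
    # lon=180 land in the last cell instead of one past it.
    x = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    y = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    return f"{tier}:{x}:{y}"


def build_tile_index(docs, tier):
    """Build tile ID -> set of doc ids (a stand-in for a posting list)."""
    postings = defaultdict(set)
    for doc_id, (lat, lon) in docs.items():
        postings[tile_id(lat, lon, tier)].add(doc_id)
    return postings


def filter_by_tiles(postings, tiles, other_hits):
    """AND the docs found in the matching tiles against another result set."""
    in_tiles = set()
    for tile in tiles:
        in_tiles |= postings.get(tile, set())
    return in_tiles & other_hits


docs = {
    1: (37.77, -122.42),  # San Francisco
    2: (37.80, -122.27),  # Oakland
    3: (40.71, -74.01),   # New York
}
postings = build_tile_index(docs, tier=4)
sf_tile = tile_id(37.77, -122.42, 4)

# Docs 2 and 3 matched the rest of the query; only doc 2 is in the SF tile.
hits = filter_by_tiles(postings, [sf_tile], other_hits={2, 3})
print(hits)  # {2}
```

At tier 4 San Francisco and Oakland share a tile; at a finer tier such as 10
they fall into different tiles, which is the zoom-level behavior.  A real query
builder would presumably also have to choose a tier that fits the search
radius and collect every tile overlapping the bounding box.
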

Turns out that algorithmically speaking, local Lucene works almost exactly
like I expected it to.

I wonder if it's faster to filter by saving the doc ids to a bitset first and
filtering off of that, or if it's faster just to use an ANDQuery to join the
result set from the matching tiles and the result set for the rest of the
query.

[ read through "Box ID's" ]

Do I really need to know any of this?  Box/Tile ID names are arbitrary.  Only
the query builder that figures out which boxes match a given geographical
constraint needs to know.

[ END BRAINLOG ]

Conclusion #1: I don't spend a lot of time immersed in Java culture so maybe I
missed something, but there seems to be a dearth of high-quality
tutorial-style documentation for spatial contrib.

I'll save conclusion #2 for a separate email.

Marvin Humphrey