lucene-dev mailing list archives

From "John Berryman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4922) A SpatialPrefixTree based on the Hilbert Curve and variable grid sizes
Date Thu, 09 May 2013 19:45:17 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653095#comment-13653095 ]

John Berryman commented on LUCENE-4922:
---------------------------------------

Hmmm... integer representation huh. Well here's a thought then:

As a first go at this idea, let's define something like a geohash where x and y are interleaved,
but here's how we do it. At the top level, number the squares from 0 to 3.

{noformat}
   0 --- 1
   |     |
   2 --- 3
{noformat}

At the next level, number things similarly:

{noformat}
    00 -- 01 -- 10 -- 11
    |     |     |     |
    02 -- 03 -- 12 -- 13
    |     |     |     |
    20 -- 21 -- 30 -- 31
    |     |     |     |
    22 -- 23 -- 32 -- 33
{noformat}

Even though this *looks* like the hilbert thing I did above, notice that this is actually
the Z-ordering and it's much easier to compute.

In this case, the first two bits encode which of the four big boxes the point is in, the
next two bits encode which of the four sub-boxes the point is in, etc. So for example [0.375,
0.625] (reading y as increasing upward, so the top row holds the larger y values) would be
encoded to a depth of 2 as "03", which can be stored in half a byte.
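
To make the numbering concrete, here is a minimal Python sketch of that encoding. The function names (z_encode, pack) are made up for illustration and are not from any Lucene code. It assumes the point lies in the unit square and that y increases upward, which is the reading under which [0.375, 0.625] comes out as "03".

```python
def z_encode(x, y, depth):
    """Return the quadrant digits (each 0-3) of a point in the unit square.

    Assumed layout, matching the diagrams: 0 = top-left, 1 = top-right,
    2 = bottom-left, 3 = bottom-right, with y increasing upward.
    """
    digits = []
    for _ in range(depth):
        # Scale so the integer part says which half the point falls in.
        x *= 2
        y *= 2
        right = int(x >= 1)
        top = int(y >= 1)
        # Keep only the position inside the chosen quadrant.
        x -= right
        y -= top
        digits.append(2 * (1 - top) + right)
    return digits


def pack(digits):
    """Pack the digits two bits at a time into a single integer."""
    value = 0
    for d in digits:
        value = (value << 2) | d
    return value


digits = z_encode(0.375, 0.625, 2)
print(digits, bin(pack(digits)))
```

At depth 2 the packed value needs only four bits, which is the "half a byte" mentioned above.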

Got it? So now, since we have the original point encoded in z-ordering, we can create a
new hilbert_point algorithm that takes a byte array representing the z-ordering encoding of
a point rather than a 2-vector of doubles. The code looks much the same, except that instead
of the "val[0]/2" we're actually just iterating through the byte array 2 bits at a time (with
no backtracking or lookahead).
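
One way such a single pass could look is sketched below in Python. This is not the hilbert_point code referred to above. It is the common textbook Hilbert construction (start at the bottom-left cell, end at the bottom-right), rewritten to consume z-order digits one at a time. The name hilbert_index and the two state flags are assumptions made for the illustration, and the digit layout is the one from the diagrams with y increasing upward.

```python
def hilbert_index(z_digits):
    """Convert z-order quadrant digits (0-3 each) to a Hilbert curve index.

    Digits are read once, left to right. The only state carried between
    levels is two flags describing how the current sub-square is oriented.
    """
    swap = False   # x and y roles are exchanged in the current sub-square
    flip = False   # both axes are mirrored in the current sub-square
    index = 0
    for d in z_digits:
        bx = d & 1            # 1 if the point is in the right half
        by = 1 - (d >> 1)     # 1 if the point is in the top half
        if swap:
            bx, by = by, bx
        if flip:
            bx ^= 1
            by ^= 1
        # Position of this quadrant along the curve at this level.
        index = (index << 2) | ((3 * bx) ^ by)
        # The two bottom quadrants reorient the curve for deeper levels.
        if by == 0:
            if bx == 1:
                flip = not flip
            swap = not swap
    return index


print(hilbert_index([0, 3]))
```

The state is just two booleans, so the loop never needs to look back at earlier digits or ahead at later ones.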

This would make for some exquisitely indecipherable code. But ultimately it might not help
that much - it largely depends upon how complex the z-ordering encoding is.
                
> A SpatialPrefixTree based on the Hilbert Curve and variable grid sizes
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-4922
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4922
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>              Labels: gsoc2013, mentor, newdev
>
> My wish-list for an ideal SpatialPrefixTree has these properties:
> * Hilbert Curve ordering
> * Variable grid size per level (ex: 256 at the top, 64 at the bottom, 16 for all in-between)
> * Compact binary encoding (so-called "Morton number")
> * Works for geodetic (i.e. lat & lon) and non-geodetic
> Some bonus wishes for use in geospatial:
> * Use an equal-area projection such that each cell has an equal area to all others at
> the same level.
> * When advancing a grid level, if a cell's width is less than half its height, then divide
> it as 4 vertically stacked instead of 2 by 2. The point is to avoid super-skinny cells, which
> occur towards the poles and degrade performance.
> All of this requires some basic performance benchmarks to measure the effects of these
> characteristics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

