lucene-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "patrick o'leary" <pj...@pjaol.com>
Subject Re: [spatial] Cartesian "Tiers" nomenclature
Date Wed, 30 Dec 2009 19:17:20 GMT
>
>
> Because of this the precision is limited for tiers, you filter the rest of
> the docs. If the precision would be higher, the number of cart tiers
> explodes at the corner of the BBOX.
>

Hmm, depends on the order of the search, if you use bestFit is should not
explode.
The strategy for bestFit is to find the smallest number of boxes between the
corner of the bounding box and the circle.

CartesianTierPlotter.bestFit(double miles)

  /**
   * Find the tier with the best fit for a bounding box
   * Best fit is defined as the ceiling of
   *  log2 (circumference of earth / distance)
   *  distance is defined as the smallest box fitting
   *  the corner between a radius and a bounding box.

Now in the new structure which include polygon searching you supply vertex
points, and cartesian tiers fill in the rest, if it's a convex hull
it will switch to range searches, if a line string it simply pulls the
individual boxes. So it removes the need for a regular box shape.




On Wed, Dec 30, 2009 at 10:27 AM, Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi Mike,
>
> > Right, NRQ is able to translate any requested range into the union
> > (OR) of brackets (from the trie) created during indexing.
> >
> > Can spatial do the same thing, just with 2D instead of 1D?  Ie,
> > reconstruct any expressible shape (created at query time) as the union
> > of some number of grids/tiers, at finer & finer levels, created during
> > indexing?
> >
> > Spatial, today, seems to do this, except it must also do "precise"
> > filtering on each matching doc, because some of the grids may contain
> > hits outside of the requested shape.
>
> The problem in 2D is, that the number of small cart tiers (rect tiles) at
> the corners of the BBOX can get very large. For NRQ it's very limited on
> precisionStep, but here, the number of tiles can get very large for highest
> precision (you can draw a picture to see this). So the number of terms will
> get large, too, and so you can simply use NRQ in two dimensions to achieve
> the same for non quadratic shaped bounding boxes. I tried to use spatial
> tiers for large and non quadratic bounding boxes (like all datasets around
> Africa). NRQ performed much better. Spatial is good for things like "find
> all McBurger around position xy with distance d", but not for bounding box
> searches.
>
> Because of this the precision is limited for tiers, you filter the rest of
> the docs. If the precision would be higher, the number of cart tiers
> explodes at the corner of the BBOX.
>
> > In fact, NRQ could also borrow from spatial's current approach -- ie,
> > create the union of some smallish number of coarse brackets.  Some of
> > the brackets will fall entirely within the requested range, and so
> > require no further filtering, while others will fall part inside /
> > part outside of the requested range, and so will require precise
> > filtering.  If NRQ did this, it should have much fewer postings to
> > enum, at the cost of having to do precise filtering on some of them
> > (and we'd have to somehow encode the orig value in the index).
>
> We thought about that for NRQ, too. It is still needed that you store the
> full precision in the index. The idea is to use flexible indexing
> attributes
> and/or payloads for this. You select the larger brackets and then go
> through
> TermPositions and only enumerate docs, which payload/flexindex attribute
> match. Another idea was provided by Yonik who said, the position could be
> used instead of payload for the missing bits. The posincr is currently not
> used by NRQ (its 1 for full prec and 0 for lower prec, so all trie terms
> have pos=1).
>
> > Mike
> >
> > On Tue, Dec 29, 2009 at 8:42 PM, Yonik Seeley
> > <yonik@lucidimagination.com> wrote:
> > > On Tue, Dec 29, 2009 at 7:13 PM, Marvin Humphrey
> > <marvin@rectangular.com> wrote:
> > >> ... but for this algorithm, different rasterization resolutions need
> > not
> > >> proceed by powers-of-two.
> > >
> > > Indeed - one way to further generalize would be to use something like
> > > Lucene's trie-based Numeric field, but with a square instead of a
> > > line.  That would allow to tweak the space/speed tradeoff.
> > >
> > > -Yonik
> > > http://www.lucidimagination.com
> > >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message