lucene-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: [SPATIAL] Best Fit Calculation
Date Wed, 14 Apr 2010 17:12:50 GMT

On Apr 14, 2010, at 12:12 PM, Chris Male wrote:

> Hi,
> 
> On Wed, Apr 14, 2010 at 6:07 PM, Grant Ingersoll <gsingers@apache.org> wrote:
> 
> On Apr 14, 2010, at 11:06 AM, Chris Male wrote:
> 
> > Hi,
> >
> > My understanding of the benefits of the new algorithm is that it means a lower tier level, resulting in fewer boxes, but more documents inside those boxes that are outside of the search radius.
> >
> > While having fewer boxes means fewer term queries to make against the index, more documents means more costly calculations to filter out those extraneous documents.
> >
> > For those doing just Cartesian Tier filtering it seems like the new approach is a win, but for those doing distance calculations on those documents passing the filter, it seems to come at a cost.
> 
> Currently, this is only used for filtering.  AIUI, Tiers aren't really that useful for distance calculations, are they?  After all, all you have is a box id and you'd have to reverse out the calc of that to be able to calc a distance, no?  Perhaps I'm missing something.
> 
> 
> Spatial Lucene currently works (or at least this is one of the ways it was designed to work) by using a 2 step filtering process.  Step 1 is the Cartesian Tier filtering.  The resulting set of Documents is then passed on to Step 2, which calculates the distance from each Document to the search centre.  If the distance is greater than the radius, the Document is filtered out.  This means that after both filtering steps you have only those Documents that are in the search radius.
> 
> How this impacts this algorithm choice is that the more Documents that pass through Step 1, the more calculations that have to be done in Step 2.

OK, I see what you mean now.  I thought you were implying the box id would be used for calculating
a distance, too.
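
The 2 step process Chris describes, and the trade-off between the number of boxes and the amount of Step 2 work, can be sketched roughly as follows. This is a minimal Python illustration and not Spatial Lucene's code: the simple lat/lon grid in box_id, the cell sizes, the sample documents, and every function name here are assumptions made up for the example. Real Cartesian tiers use a different projection and box encoding, and the sketch ignores the poles and the date line.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def box_id(lat, lon, cell_deg):
    """Hypothetical box id: a (row, col) cell in a plain lat/lon grid."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def candidate_boxes(lat, lon, radius_miles, cell_deg):
    """All grid cells touching the bounding box of the search circle."""
    ang = radius_miles / EARTH_RADIUS_MILES  # angular radius in radians
    dlat = math.degrees(ang)
    ratio = math.sin(ang) / math.cos(math.radians(lat))
    dlon = math.degrees(math.asin(ratio)) if ratio < 1 else 180.0
    r0, c0 = box_id(lat - dlat, lon - dlon, cell_deg)
    r1, c1 = box_id(lat + dlat, lon + dlon, cell_deg)
    return {(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)}


def two_step_filter(docs, lat, lon, radius_miles, cell_deg):
    """docs is a list of (name, lat, lon). Returns (boxes, step1, step2)."""
    boxes = candidate_boxes(lat, lon, radius_miles, cell_deg)
    # Step 1: keep documents whose box is one of the candidate boxes.
    step1 = [d for d in docs if box_id(d[1], d[2], cell_deg) in boxes]
    # Step 2: exact distance check on the survivors of Step 1 only.
    step2 = [d for d in step1
             if haversine_miles(lat, lon, d[1], d[2]) <= radius_miles]
    return boxes, step1, step2


# Made-up sample documents around a made-up search centre.
docs = [
    ("a", 52.37, 4.91),
    ("b", 52.44, 4.96),
    ("c", 52.51, 5.11),
    ("d", 53.50, 6.00),
    ("e", 52.90, 5.40),
]

for cell_deg in (0.05, 0.5):  # small boxes vs. large boxes
    boxes, step1, step2 = two_step_filter(docs, 52.37, 4.89, 10.0, cell_deg)
    print(cell_deg, len(boxes), [d[0] for d in step1], [d[0] for d in step2])
```

With this sample data the larger cell size needs fewer boxes for Step 1 but lets more documents through to Step 2, while the set that survives both steps is the same for either cell size. How the two costs compare in a real index depends on the cost of a term query versus a distance calculation, which this sketch does not measure.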