lucene-dev mailing list archives

From "Bill Bell (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-2125) Spatial filter is not accurate
Date Tue, 21 Sep 2010 22:44:36 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913244#action_12913244
] 

Bill Bell edited comment on SOLR-2125 at 9/21/10 6:44 PM:
----------------------------------------------------------

OK, so that makes sense. You are using the distance at 45 degrees, so the east-west extent would
not reach far enough.

Using http://en.wikipedia.org/wiki/Pythagorean_theorem would help in the east-west case, but
a circle or ellipse is MUCH better.

Extending the range out to the 45-degree distance would be a conservative estimate, though. And
since we usually sort by distance asc, those extra points would be in the result set but at the
end of the list. (This is an improvement - again, not as good as an ellipse.)

You need a quick function that tells you "is this lat,long in the circle / ellipse or not".
A range [X to Y] will not get you that. You need to use hsin().
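
A minimal sketch of such an "in the circle or not" check, in Python rather than as a Solr
function query. The names haversine_km and in_circle are hypothetical and are not part of Solr;
the radius is the kilometer value quoted in the issue description, and a spherical Earth is
assumed.

```python
import math

# Mean Earth radius in km, as quoted in the issue description (assumed spherical Earth).
EARTH_MEAN_RADIUS_KM = 6371.009


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_MEAN_RADIUS_KM):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def in_circle(center, point, d_km):
    """True if point (lat, lon) lies within d_km of center (lat, lon)."""
    return haversine_km(center[0], center[1], point[0], point[1]) <= d_km


# The two example points from the issue description.
pt = (44.9369054, -91.3929348)
store = (45.17614, -93.87341)
print(in_circle(pt, store, 220))
```

Unlike a pair of range clauses, this test depends on both coordinates at once, which is why a
plain [X TO Y] range cannot express it.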

One potential approach:

1. Do a range select using the bounding coordinates from
http://janmatuschek.de/LatitudeLongitudeBoundingCoordinates (values in radians):

(Lat >= 1.2393 AND Lat <= 1.5532) AND (Lon >= -1.8184 AND Lon <= 0.4221)
2. Check those points for distance, i.e. "in the ellipse or not". http://janmatuschek.de/LatitudeLongitudeBoundingCoordinates
 acos(sin(1.3963) * sin(Lat) + cos(1.3963) * cos(Lat) * cos(Lon - (-0.6981))) <= 0.1570;

That should make it fast and simplify the calculations.
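
The two steps above can be sketched in Python as follows. This is only an illustration under
stated assumptions: all angles are in radians, r is the angular radius (distance divided by the
sphere radius), the bounding-box formulas follow the janmatuschek.de page, and the function
names (bounding_box, in_box, angular_distance, within) are hypothetical, not Solr APIs.

```python
import math


def bounding_box(lat0, lon0, r):
    """Return (lat_min, lat_max, lon_min, lon_max) in radians for angular radius r."""
    lat_min, lat_max = lat0 - r, lat0 + r
    if lat_min > -math.pi / 2 and lat_max < math.pi / 2:
        dlon = math.asin(math.sin(r) / math.cos(lat0))
        lon_min, lon_max = lon0 - dlon, lon0 + dlon
        if lon_min < -math.pi:
            lon_min += 2 * math.pi
        if lon_max > math.pi:
            lon_max -= 2 * math.pi
    else:
        # The circle contains a pole: every longitude is a candidate.
        lat_min = max(lat_min, -math.pi / 2)
        lat_max = min(lat_max, math.pi / 2)
        lon_min, lon_max = -math.pi, math.pi
    return lat_min, lat_max, lon_min, lon_max


def in_box(lat, lon, box):
    """Step 1: cheap range test, the part a range query can do."""
    lat_min, lat_max, lon_min, lon_max = box
    if not (lat_min <= lat <= lat_max):
        return False
    if lon_min <= lon_max:
        return lon_min <= lon <= lon_max
    return lon >= lon_min or lon <= lon_max  # box crosses the 180th meridian


def angular_distance(lat0, lon0, lat, lon):
    """Step 2: spherical law of cosines, clamped against rounding error."""
    c = (math.sin(lat0) * math.sin(lat)
         + math.cos(lat0) * math.cos(lat) * math.cos(lon - lon0))
    return math.acos(max(-1.0, min(1.0, c)))


def within(lat0, lon0, r, points):
    """Range-select first, then keep only points inside the circle."""
    box = bounding_box(lat0, lon0, r)
    candidates = [p for p in points if in_box(p[0], p[1], box)]
    return [p for p in candidates
            if angular_distance(lat0, lon0, p[0], p[1]) <= r]


# (1.24, 0.42) sits in a corner of the box but outside the circle.
points = [(1.3963, -0.6981), (1.24, 0.42), (0.0, 0.0)]
print(within(1.3963, -0.6981, 0.1570, points))
```

The expensive trigonometric test only runs on points that survive the range select, which is
where the speedup would come from.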

UPDATE - NOTE:

Plugging all this into the web site shows that Pythagorean is a good approximation...

See Excel attached.

hsin = 309 km from pt to max
hsin = 314 km from pt to min
Estimate using Pythagorean is 311 km, using sqrt((220 km)^2 + (220 km)^2)

41.42% is the difference from west-east to 45 degrees: sqrt(1^2+1^2) = 1.4142.

Yonik: sqrt(2) is right - but the spreadsheet is a bit better based on spheres.

Step #2 will then sub-select the points within that result set.

Therefore, a user could take the search distance d, compute sqrt(d^2+d^2), and use that to get
a list - it is not exact but better than nothing.
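
As a quick check of that arithmetic, a small Python sketch (padded_radius is a hypothetical
helper name, and the 220 km figure is the distance from the issue description):

```python
import math


def padded_radius(d_km):
    """Diagonal of a d-by-d square: sqrt(d^2 + d^2), i.e. d * sqrt(2)."""
    return math.hypot(d_km, d_km)


d = 220.0
print(round(padded_radius(d)))             # about 311 km, the estimate quoted above
print(round(padded_radius(d) / d - 1, 4))  # 0.4142, the 41.42% difference
```

A caller would pass the padded value as the filter distance and rely on the distance sort to
push the extra points to the end of the list.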


  
> Spatial filter is not accurate
> ------------------------------
>
>                 Key: SOLR-2125
>                 URL: https://issues.apache.org/jira/browse/SOLR-2125
>             Project: Solr
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 1.5
>            Reporter: Bill Bell
>            Assignee: Grant Ingersoll
>         Attachments: solrspatial.xlsx
>
>
> The calculation of distance appears to be off.
> Note: "The radius of the sphere to be used when calculating distances on a sphere (i.e.
> haversine). Default is the Earth's mean radius in kilometers (see org.apache.solr.search.function.distance.Constants.EARTH_MEAN_RADIUS_KM)
> which is set to 3,958.761458084784856. Most applications will not need to set this."
> The radius of the earth in KM is  6371.009 km (≈3958.761 mi).
> Also filtering distance appears to be off - example data:
> 45.17614,-93.87341 to 44.9369054,-91.3929348 Approx 137 miles Google. 169 miles = 220
> kilometers
> http://....../solr/select?fl=*,score&start=0&rows=10&q={!sfilt%20fl=store_lat_lon}&qt=standard&pt=44.9369054,-91.3929348&d=280&sort=dist(2,store,vector(44.9369054,-91.3929348))
> asc
> Nothing shows. d=285 shows results. This is off by a lot.
> Bill

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

