lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "SpatialSearchDev" by YonikSeeley
Date Thu, 30 Sep 2010 22:18:41 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SpatialSearchDev" page has been changed by YonikSeeley.
The comment on this change is: move stuff that hasn't been vetted yet to a dev page - there
have been too many complaints about the confusing spatial page.
http://wiki.apache.org/solr/SpatialSearchDev

--------------------------------------------------

New page:
Spatial Search (docs + features under development)

==== Example ====
{{{
<fieldType name="location" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
...
<field name="store" type="location" indexed="true" stored="true"/>
}}}
=== LatLonType ===
The !LatLonType represents a latitude/longitude point.  All input is interpreted as latitude,
then longitude.  !LatLonType is similar to !PointType, but it calculates distances using the
great-circle (haversine) formula and is strictly two-dimensional (lat/lon).

==== Example ====
{{{
<fieldType name="latLon" class="solr.LatLonType" subFieldSuffix="_latLon"/>
...
<field name="store_lat_lon" type="latLon" indexed="true" stored="true"/>
}}}
=== Geohash ===
A geohash encodes a lat/lon pair into a single field as a String.  As of https://issues.apache.org/jira/browse/SOLR-1586,
it will be possible to create a geohash via a !FieldType simply by passing in a point (lat,lon);
Solr will do the work of converting the point to a geohash.

See http://www.geohash.org and http://en.wikipedia.org/wiki/Geohash

==== Example ====
{{{
<fieldType name="geohash" class="solr.GeoHashField"/>
...
<field name="store_hash" type="geohash" indexed="true" stored="false"/>
}}}
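To make the encoding concrete, here is a minimal Python sketch of the public geohash scheme described at the links above. It is an illustration only and is not taken from Solr's !GeoHashField; the function name and default precision are assumptions.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=12):
    """Encode a lat/lon pair (degrees) as a geohash string of `precision` chars."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates 5 bits per output character
    bit_count = 0
    use_lon = True    # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Each extra character narrows the cell, so nearby points tend to share a prefix, which is what makes the single String field useful for spatial lookups.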
= Indexing =
Indexing is handled by the various !FieldType instances in the schema.  At the most basic,
the user can represent their own spatial data using ints, floats or doubles.  Beyond that,
the !PointType, !GeoHashField and !LatLonType can be used to index spatial information automatically.

When indexing, the format is something like:

{{{
<field name="store_lat_lon">12.34,-123.45</field>
}}}
(It can vary based on the number of values.  When using a !LatLonType or a !GeoHashField,
it is always latitude, then longitude.)
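As a client-side illustration of the "latitude, then longitude" convention, the following Python sketch validates a value and renders it in the field format shown above. The helper names are hypothetical and are not part of Solr.

```python
def parse_lat_lon(value):
    """Split a "lat,lon" string into two floats, latitude first."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 2:
        raise ValueError("expected 'lat,lon', got %r" % value)
    lat, lon = float(parts[0]), float(parts[1])
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range: %r" % lat)
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude out of range: %r" % lon)
    return lat, lon


def to_field_xml(name, lat, lon):
    """Render the point as an update-XML field element (hypothetical helper)."""
    return '<field name="%s">%s,%s</field>' % (name, lat, lon)
```

Swapping the two values is an easy mistake; a range check like this catches it whenever the longitude falls outside [-90, 90].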

= Filtering =
There are several different ways to filter in spatial search:

 1. By Range Query, as in {{{fq=lat:[-79.5 TO -23.0] AND lon:[56.3 TO 60.3]}}} -- Already implemented
 1. By the Spatial Filter QParser (!SpatialQParser), e.g. {!sfilt fl=location}&pt=49.32,-79.0&d=20
 1. Using the "frange" QParser, as in {{{fq={!frange l=0 u=400}hsin(0.57, -1.3, lat_rad, lon_rad, 3963.205)}}}

In practice, for those using Solr's field types above, the Spatial Filter !QParser will automatically
make the correct decision about how best to filter.  If an application needs a specific type
of filtering for performance or other reasons, the best approach is to extend the !FieldType
in question to suit those needs.
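The "frange" example in the list above filters on a haversine distance between 0 and 400. As a rough sketch of what such a function computes, here is the standard haversine formula in Python. The function name and argument order are assumptions for illustration and do not claim to match Solr's hsin signature; the value 3963.205 in the example appears to be the Earth's radius in miles.

```python
import math


def haversine(lat1, lon1, lat2, lon2, radius):
    """Great-circle distance between two points given in radians.

    The result is in the same unit as `radius`.
    """
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


def in_range(lower, upper, value):
    """Mimic an inclusive lower/upper range check on a computed value."""
    return lower <= value <= upper
```

For example, `in_range(0, 400, haversine(0.57, -1.3, lat_rad, lon_rad, 3963.205))` expresses the same test for one document whose coordinates are stored in radians.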

== Spatial Filter QParser ==
See https://issues.apache.org/jira/browse/SOLR-1568.

''NOTE: Depending on the !FieldType, different calculations for distance will be applied''.
For instance, !PointType uses a rectangular coordinate system and the Euclidean distance,
while !LatLonType uses haversine by default.

See !SpatialFilterTest for examples of the various points.

The following parameters are supported:
||<tablewidth="725px" tableheight="184px">Parameter ||Description ||Example ||
||pt ||The point to use as the center of the filter, specified as a comma-separated list of doubles.  If using the !LatLonType, the order is lat,lon. ||&pt=33.4,29.0 &pt=27.3,83.9,10.0,5.5 ||
||d ||The distance from the point to the outer edge of whatever is being used to filter on (bounding box, pure distance, something else).  Must be greater than or equal to 0. ||&d=10.0 ||
||sphere_radius ||The radius of the sphere to use when calculating distances on a sphere (i.e. haversine).  The default is the Earth's mean radius in kilometers, 6371.009 (see org.apache.lucene.spatial.DistanceUtils.EARTH_MEAN_RADIUS_KM).  Most applications will not need to set this. ||&sphere_radius=10.3 ||
||meas ||The distance measure to use when calculating distance.  The default depends on the !FieldType.  Supported values are: 1. hsin - the haversine; 2. 0, 1, 2, ... INF for the appropriate p-norm (2 is the Euclidean distance).  NOTE: This value is experimental and subject to removal.  Most applications will not need to change the measure; the !FieldTypes usually make the proper choice given the data stored. ||&meas=hsin ||
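The parameters in the table can be assembled into a query string as in the following Python sketch. It assumes the filter is sent as an fq parameter, as in the other filtering examples; the helper name is hypothetical, and the sketch only builds the string without contacting a server.

```python
from urllib.parse import urlencode


def sfilt_params(field, lat, lon, distance, sphere_radius=None):
    """Build URL-encoded parameters for a spatial filter request.

    Assumption: the {!sfilt} local-params string is passed as fq.
    """
    if distance < 0:
        raise ValueError("d must be greater than or equal to 0")
    params = [
        ("fq", "{!sfilt fl=%s}" % field),
        ("pt", "%s,%s" % (lat, lon)),   # lat,lon order for LatLonType
        ("d", str(distance)),
    ]
    if sphere_radius is not None:
        params.append(("sphere_radius", str(sphere_radius)))
    return urlencode(params)
```

Encoding matters here because the braces, exclamation mark and comma in the raw parameters are not URL-safe.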




For !LatLonType, the sfilt command calculates a bounding box by calculating the East and West
longitudes and the North and South latitudes of a box that circumscribes the circle with radius
d (using hsin).  This can be implemented in other ways by overriding the createSpatialQuery
method on !LatLonType.
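One common way to compute a box that encloses a circle on a sphere is sketched below in Python. This is an assumption about the general technique, not a transcription of Solr's createSpatialQuery; the sketch deliberately refuses boxes that reach a pole or cross the 180 degree meridian, and the lat/lon field names in the generated range query are placeholders.

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.009  # default radius quoted in the parameter table


def bounding_box(lat_deg, lon_deg, d, radius=EARTH_MEAN_RADIUS_KM):
    """Return (south, west, north, east) in degrees enclosing a circle of radius d."""
    lat = math.radians(lat_deg)
    ang = d / radius                      # angular radius of the circle
    north = lat + ang
    south = lat - ang
    if north >= math.pi / 2 or south <= -math.pi / 2:
        raise ValueError("box reaches a pole; not handled in this sketch")
    # The circle is widest in longitude slightly poleward of its center.
    dlon = math.asin(math.sin(ang) / math.cos(lat))
    west = lon_deg - math.degrees(dlon)
    east = lon_deg + math.degrees(dlon)
    if west < -180.0 or east > 180.0:
        raise ValueError("box crosses the 180 degree meridian; not handled")
    return math.degrees(south), west, math.degrees(north), east


def range_query(south, west, north, east):
    """Format the box as a two-field range filter (field names are placeholders)."""
    return "lat:[%s TO %s] AND lon:[%s TO %s]" % (south, north, west, east)
```

Away from the equator the longitude span grows faster than the latitude span, which is why the box is not simply the point plus or minus a fixed number of degrees.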

For !PointType, the bounding box is calculated by standard rectangular coordinate system measures.
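In rectangular coordinates the box is just the center plus or minus d in every dimension. The Python sketch below shows that, together with a p-norm distance for p >= 1 (p = 2 being Euclidean). The function names are hypothetical, and the p = 0 case mentioned for the meas parameter is not covered.

```python
def rectangular_box(center, d):
    """Return one (low, high) pair per dimension around `center`."""
    return [(c - d, c + d) for c in center]


def in_box(point, box):
    """True if every coordinate lies inside its (low, high) pair."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(point, box))


def p_norm_distance(a, b, p):
    """p-norm distance for p >= 1; pass float('inf') for the max norm."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if p == float("inf"):
        return max(diffs)
    return sum(v ** p for v in diffs) ** (1.0 / p)
```

Any point within distance d of the center under these norms also lies inside the box, so the box can act as a cheap first-pass filter before an exact distance check.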

== Filtering Caveats ==
=== North/South Poles ===
When the bounding box includes a pole, the !LatLonType automatically switches from producing
a bounding box to a "bounding bowl" (i.e. a spherical cap: http://mathworld.wolfram.com/SphericalCap.html)
that includes all values North (or, for the South Pole, South) of a single latitude.  We still
calculate the coordinates of the upper right and lower left corners of the would-be box just
as in all other filtering cases, but we then take the corner that is closest to the equator
(since the box goes over the pole, this may not be the lower left, despite the name) and do
a latitude-only filter.  This produces more matches than a pure bounding box would, but the
query is much easier to construct and will likely be faster, too.
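The pole case can be sketched in Python as follows. This is a simplified reading of the description above and not Solr's code: it works directly with the box's unwrapped North and South latitudes, and the function name is hypothetical.

```python
import math


def polar_latitude_range(lat_deg, d, radius=6371.009):
    """Return a (low, high) latitude range if the box would cover a pole, else None."""
    ang = math.degrees(d / radius)   # angular radius in degrees
    north = lat_deg + ang
    south = lat_deg - ang
    if north > 90.0 and south < -90.0:
        return (-90.0, 90.0)         # both poles covered: every latitude matches
    if north > 90.0:
        return (south, 90.0)         # everything North of the equator-side edge
    if south < -90.0:
        return (-90.0, north)        # everything South of the equator-side edge
    return None                      # no pole involved: use a normal bounding box
```

With the default radius, a 1000 km filter centered at latitude 85 yields a range starting near 76 degrees and running to the pole, with no longitude constraint at all.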

= Sorting =
https://issues.apache.org/jira/browse/SOLR-1297 added the ability to sort by function, so
sorting by distance is now simply a matter of sorting using the appropriate distance function,
just like boosting.

= Scoring =
Scoring by distance works just like any other FunctionQuery.  See the distance methods on
the FunctionQuery page for examples and method signatures.

= Query Parsing =
<!> TODO <!>

See https://issues.apache.org/jira/browse/SOLR-1578 and https://issues.apache.org/jira/browse/SOLR-1568.

= Other Caveats =
Unless otherwise specified, all units are kilometers.

= Known Issues =
See https://issues.apache.org/jira/browse/SOLR-773 for tracking.

= Useful References =
 1. http://www.movable-type.co.uk/scripts/latlong.html
 1. http://www.ibm.com/developerworks/opensource/library/j-spatial/index.html
 1. http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
