lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "SpatialSearch" by GrantIngersoll
Date Tue, 15 Feb 2011 12:37:58 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SpatialSearch" page has been changed by GrantIngersoll.
http://wiki.apache.org/solr/SpatialSearch?action=diff&rev1=53&rev2=54

--------------------------------------------------

  
  Since the !LatLonType field also supports field queries and range queries, one can manually
create a bounding box rather than using bbox:
  
-  . [[http://localhost:8983/solr/select?wt=json&indent=true&fl=name,store&q=*:*&fq=store:[45,-94%20TO%2046,-93]|...&q=*:*&fq=store:[45,-94
TO 46,-93] ]]
+  . [[http://localhost:8983/solr/select?wt=json&indent=true&fl=name,store&q=*:*&fq=store:[45,-94%20TO%2046,-93]|...&q=*:*&fq=store:[45,-94
TO 46,-93]]]
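A minimal sketch of building the same range filter programmatically. It assumes the local Solr URL and the store field from the example above; the helper name bbox_filter_url is hypothetical and not part of Solr.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def bbox_filter_url(base, field, lower_left, upper_right):
    """Build a select URL whose fq is a manual bounding-box range query."""
    # Range syntax taken from the example above: field:[lat,lon TO lat,lon]
    fq = "{}:[{},{} TO {},{}]".format(field, *lower_left, *upper_right)
    params = [
        ("wt", "json"),
        ("indent", "true"),
        ("fl", "name,store"),
        ("q", "*:*"),
        ("fq", fq),
    ]
    # urlencode percent-escapes the brackets, commas and spaces
    return base + "?" + urlencode(params)


url = bbox_filter_url(
    "http://localhost:8983/solr/select", "store", (45, -94), (46, -93)
)
print(url)
# Decode the query string again to inspect the filter that Solr would receive
print(parse_qs(urlparse(url).query)["fq"][0])
```

Encoding the parameters with urlencode avoids the hand-escaping problem that this revision of the page fixes in the link above.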
  
  == geodist - The distance function ==
  '''geodist''' is a function query that yields the calculated distance.  This gives the flexibility
to do a number of interesting things, such as sorting by the distance (Solr can sort by any
function query), or combining the distance with the relevancy score, such as boosting by the
inverse of the distance.
@@ -87, +87 @@

  == How to combine with a sub-query to expand results ==
  It is possible to filter by other criteria with an OR clause. Here is an example that returns
documents that are either in Jacksonville, FL or within 50 km of 45.15,-93.85:
  
-  . [[http://localhost:8983/solr/select?fl=name,store,city,state&q=*:*&fq=(state:"FL"
AND city:"Jacksonville") OR _query_:"{!geofilt}"&sfield=store&pt=45.15,-93.85&d=50&sort=geodist()%20asc|...&q=*:*&fq=(state:"FL"
AND city:"Jacksonville") OR _query_:"{!geofilt}"&sfield=store&pt=45.15,-93.85&d=50&sort=geodist()
asc]]
+  . [[http://localhost:8983/solr/select?fl=name,store,city,state&q=*:*&fq=(state:"FL"%20AND%20city:"Jacksonville")%20OR%20_query_:"{!geofilt}"&sfield=store&pt=45.15,-93.85&d=50&sort=geodist()%20asc|...&q=*:*&fq=(state:"FL"
AND city:"Jacksonville") OR _query_:"{!geofilt}"&sfield=store&pt=45.15,-93.85&d=50&sort=geodist()
asc]]
+ 
+ == How to facet by distance ==
+ Faceting by distance can be done using the frange QParser.  Unfortunately, right now, it
is a bit inefficient, but it likely will be fine in most situations:
+ 
+  . [[http://localhost:8983/solr/select?&q=*:*&sfield=store&pt=45.15,-93.85&facet.query={!frange%20l=0%20u=5}geodist%28%29&facet.query={!frange%20l=5.001%20u=3000}geodist%28%29&wt=xml&facet=true|&q=*:*&sfield=store&pt=45.15,-93.85&facet.query={!frange
l=0 u=5}geodist()&facet.query={!frange l=5.001 u=3000}geodist()]]
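A short sketch of generating the facet.query parameters for an arbitrary set of distance buckets. The function name distance_facet_params is hypothetical; the parameter names and the frange syntax are those used in the link above.

```python
from urllib.parse import urlencode


def distance_facet_params(sfield, pt, buckets):
    """Return query params with one frange facet.query per (lower, upper) bucket."""
    params = [("q", "*:*"), ("sfield", sfield), ("pt", pt), ("facet", "true")]
    for lower, upper in buckets:
        # Doubled braces produce literal { and } in the formatted string
        facet = "{{!frange l={} u={}}}geodist()".format(lower, upper)
        params.append(("facet.query", facet))
    return params


# Same buckets as the example above: 0-5 km and 5.001-3000 km
params = distance_facet_params("store", "45.15,-93.85", [(0, 5), (5.001, 3000)])
print(urlencode(params))
```

The second bucket starts at 5.001 rather than 5 because frange includes both bounds by default, so a shared edge would count a document at exactly 5 km twice.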
  
  == How to boost closest results (with dismax) ==
  It is also possible to boost the closest results by combining bq with geodist():
  
-  . [[http://localhost:8983/solr/select?fl=name,store,score&defType=dismax&q.alt=*:*&fq={!geofilt}&sfield=store&pt=45.15,-93.85&d=50&bq=_val_:"recip(geodist(),
2, 200, 20)"^1.0&sort=score%20desc|...&defType=dismax&q.alt=*:*&fq={!geofilt}&sfield=store&pt=45.15,-93.85&d=50&bq=_val_:"recip(geodist(),
2, 200, 20)"^1.0&sort=score desc]]
+  . [[http://localhost:8983/solr/select?fl=name,store,score&defType=dismax&q.alt=*:*&fq={!geofilt}&sfield=store&pt=45.15,-93.85&d=50&bq=_val_:"recip(geodist(),%202,%20200,%2020)"^1.0&sort=score%20desc|...&defType=dismax&q.alt=*:*&fq={!geofilt}&sfield=store&pt=45.15,-93.85&d=50&bq=_val_:"recip(geodist(),
2, 200, 20)"^1.0&sort=score desc]]
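To see what the boost in that query does, here is a small sketch of the recip function as documented for Solr function queries, recip(x,m,a,b) = a / (m*x + b), evaluated with the arguments used above. The sample distances are illustrative only.

```python
def recip(x, m, a, b):
    """Reciprocal function as documented for Solr function queries: a / (m*x + b)."""
    return a / (m * x + b)


# Boost contributed by recip(geodist(), 2, 200, 20) at a few distances (km)
for km in (0, 10, 40):
    print(km, recip(km, 2, 200, 20))
```

With these arguments the boost is 10 at the query point, 5 at 10 km and 2 at 40 km, so nearer documents get a larger additive boost.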
+ 
+ = Advanced Spatial Options =
+ Solr also supports other spatial capabilities beyond just latitude and longitude.
 For example, a PointType can be used to represent a point in an n-dimensional space.  This
can be useful, for instance, for searching in CAD drawings or blueprints.  Solr also supports
other distance measures.  See the FunctionQuery page for more information and look for hsin,
ghhsin and others.
+ 
+ == Field Types ==
+ === PointType ===
+ {{{
+ <fieldType name="location" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
+ }}}
+ === LatLonType ===
+ The !LatLonType stores a latitude/longitude point.  All input is interpreted as latitude
then longitude.  The !LatLonType is similar to !PointType, but it computes distances along
the great circle (haversine) and is strictly two-dimensional (lat/lon).
+ 
+ ==== Example ====
+ {{{
+ <fieldType name="latLon" class="solr.LatLonType" subFieldSuffix="_latLon"/>
+ ...
+ <field name="store_lat_lon" type="latLon" indexed="true" stored="true"/>
+ }}}
+ === Geohash ===
+ A geohash is a way of encoding lat/lon into a single field as a String.  As of https://issues.apache.org/jira/browse/SOLR-1586,
it will be possible to create a geohash via FieldType, simply by passing in a Point (lat,lon).
 Solr will do the work of converting the point to a geohash.
+ 
+ See http://www.geohash.org and http://en.wikipedia.org/wiki/Geohash
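For readers unfamiliar with the encoding, the following is a minimal sketch of the standard public geohash algorithm (interleaved longitude/latitude bisection, base-32 output). It is not Solr's GeoHashField implementation, and the function name geohash_encode is hypothetical.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, length=11):
    """Encode a lat/lon pair as a geohash string of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < length:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, 5))  # u4pru
```

Because each extra character narrows the same cell, nearby points tend to share a prefix, which is the property that prefix-based filtering relies on.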
+ 
+ If you need to index a multi-valued point field, say because you have a variable number
of points per document, then check out https://issues.apache.org/jira/browse/SOLR-2155 which
uses a hierarchical grid geohash prefix technique to efficiently filter documents by a geographic
shape.
+ 
+ ==== Example ====
+ {{{
+ <fieldType name="geohash" class="solr.GeoHashField"/>
+ ...
+ <field name="store_hash" type="geohash" indexed="true" stored="false"/>
+ }}}
+ = Indexing =
+ Indexing is handled by the various !FieldType instances in the schema.  At the most basic
level, users can represent their own spatial data using ints, floats or doubles.  Beyond that,
the !PointType, !GeoHashField and !LatLonType can be used to index spatial information automatically.
+ 
+ When indexing, the format is something like:
+ 
+ {{{
+ <field name="store_lat_lon">12.34,-123.45</field>
+ }}}
+ The format can vary based on the number of values.  When using a !LatLonType or a !GeoHashField,
it is always latitude, then longitude.
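A sketch of producing an XML update message containing such a field. The id field and its value are assumptions for illustration; store_lat_lon is the field from the schema example, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def lat_lon_value(lat, lon):
    """Format a point as 'lat,lon' (latitude first, then longitude)."""
    return "{},{}".format(lat, lon)


def add_doc_xml(fields):
    """Serialize a dict of field name -> value as an <add><doc> update message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


xml_message = add_doc_xml(
    {
        "id": "store-1",  # hypothetical unique key, not from the page
        "store_lat_lon": lat_lon_value(12.34, -123.45),
    }
)
print(xml_message)
```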
+ 
+ = Filtering =
+ There are several different ways to filter in spatial search:
+ 
+  1. By Range Query, as in {{{fq=lat:[-79.5 TO -23.0] AND lon:[56.3 TO 60.3]}}} -- Already
implemented
+  1. By the Spatial Filter QParser (!SpatialQParser), as in {{{fq={!sfilt fl=location}&pt=49.32,-79.0&d=20}}}
+  1. Using the "frange" QParser, as in {{{fq={!frange l=0 u=400}hsin(0.57, -1.3, lat_rad,
lon_rad, 3963.205)}}}
+ 
+ In practice, for those using Solr's field types above, the Spatial Filter !QParser will
automatically decide how best to filter.  If an application needs a specific type of filtering,
for performance or other reasons, the best bet is to extend the !FieldType in question.
+ 
+ == Spatial Filter QParser ==
+ See https://issues.apache.org/jira/browse/SOLR-1568.
+ 
+ ''NOTE: Depending on the !FieldType, different calculations for distance will be applied''.
 For instance, the !PointType uses a rectangular coordinate system and uses the Euclidean
distance while !LatLonType uses Haversine by default.
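As a rough illustration of the great-circle calculation, here is a sketch of the textbook haversine formula. It is not Solr's code; the default radius is the Earth mean radius of 6371.009 km, and the function name haversine_km is hypothetical.

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.009


def haversine_km(lat1, lon1, lat2, lon2, radius=EARTH_MEAN_RADIUS_KM):
    """Great-circle distance between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin
    return 2 * radius * math.asin(math.sqrt(min(1.0, h)))


# Distance between two sample points near the coordinates used on this page
print(haversine_km(45.15, -93.85, 46.0, -93.0))
```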
+ 
+ See !SpatialFilterTest for examples of the various points.
+ 
+ The following parameters are supported:
+ ||<tablewidth="725px" tableheight="184px">Parameter ||Description ||Example ||
+ ||pt ||The point to use as the center of the filter.  Specified as a comma-separated list of doubles.  If using the !LatLonType, then it is lat,lon. ||&pt=33.4,29.0 &pt=27.3,83.9,10.0,5.5 ||
+ ||d ||The distance from the point to the outer edge of whatever is being used to filter on (bounding box, pure distance, something else).  Must be greater than or equal to 0. ||&d=10.0 ||
+ ||sphere_radius ||The radius of the sphere to use when calculating distances on a sphere (i.e. haversine).  The default is the Earth's mean radius in kilometers (see org.apache.lucene.spatial.DistanceUtils.EARTH_MEAN_RADIUS_KM), which is set to 6371.009.  Most applications will not need to set this. ||&sphere_radius=10.3 ||
+ ||meas ||The distance measure to use when calculating distance.  The default depends on the !FieldType.  Supported values are: 1. hsin - the haversine; 2. 0, 1, 2, ... INF for the appropriate p-norm (2 is the Euclidean distance).  NOTE: This value is experimental and subject to removal.  Most applications will not need to change the measure, since the !FieldTypes usually make the proper choice given the data stored. ||&meas=hsin ||
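For the p-norm values of meas, the following sketch shows the usual mathematical definition for p >= 1 and for INF. It is an illustration, not Solr's implementation, and it deliberately leaves out the p = 0 case; the function name p_norm_distance is hypothetical.

```python
import math


def p_norm_distance(p, a, b):
    """Distance between vectors a and b under the p-norm (p >= 1 or math.inf)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if p == math.inf:
        return max(diffs)  # infinity norm: largest coordinate difference
    if p < 1:
        raise ValueError("this sketch only covers p >= 1")
    return sum(d ** p for d in diffs) ** (1.0 / p)


origin, point = (0.0, 0.0), (3.0, 4.0)
print(p_norm_distance(1, origin, point))         # Manhattan distance
print(p_norm_distance(2, origin, point))         # Euclidean distance
print(p_norm_distance(math.inf, origin, point))  # infinity norm
```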
  
  
- = Advanced Spatial Options - Under Development =
- SpatialSearchDev  -- Covers things like Geohash (supports multivalue lat-lon points), North/South
Pole issues, other distance functions, etc.
  
+ 
+ For !LatLonType, the sfilt command calculates a bounding box by computing the east and
west longitudes and the north and south latitudes of a box that circumscribes the circle with
radius d (using hsin).  This can be implemented in other ways by overriding the
createSpatialQuery method on !LatLonType.
+ 
+ For !PointType, the bounding box is calculated by standard rectangular coordinate system
measures.
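One way to compute a box enclosing a circle of radius d on a sphere is sketched below. This is the generic spherical-geometry approach and an assumption about the shape of the calculation, not the code in !LatLonType; it does not handle the poles or the 180-degree meridian, and the name bounding_box is hypothetical.

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.009


def bounding_box(lat, lon, d_km, radius=EARTH_MEAN_RADIUS_KM):
    """Return (south, west, north, east) in degrees enclosing the circle of radius d_km."""
    ang = d_km / radius  # angular radius of the circle, in radians
    south = lat - math.degrees(ang)
    north = lat + math.degrees(ang)
    if north >= 90.0 or south <= -90.0:
        raise ValueError("circle covers a pole; a box is not meaningful here")
    # Meridians converge away from the equator, so the longitude half-width grows
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return south, lon - dlon, north, lon + dlon


print(bounding_box(45.15, -93.85, 50.0))
```

The longitude half-width is wider than the latitude half-width everywhere except at the equator, which is why a fixed degree offset in both directions would cut off part of the circle.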
+ 
+ == Filtering Caveats ==
+ === North/South Poles ===
+ When the bounding box includes a pole, the !LatLonType automatically switches from
producing a bounding box to a "bounding bowl" (i.e. a spherical cap: http://mathworld.wolfram.com/SphericalCap.html),
which includes all values north (or, for the South Pole, south) of the edge of the would-be
bounding box that is closer to the equator.  In other words, we still calculate the coordinates
of the upper right and lower left corners of the box just as in all other filtering cases,
but we then take the corner closest to the equator (since the box goes over the pole, this
may not be the lower left, despite the name) and do a latitude-only filter.  This produces
more matches than a pure bounding box, but the query is much easier to construct and will
likely be faster, too.
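The latitude-only fallback described above can be sketched as follows. This is an illustration of the idea under the assumption that the box edges are simply lat plus or minus the angular radius; it is not the !LatLonType code, and the function name is hypothetical.

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.009


def latitude_filter_near_pole(lat, d_km, radius=EARTH_MEAN_RADIUS_KM):
    """Return (min_lat, max_lat) for a latitude-only filter, or None if no pole is covered."""
    dlat = math.degrees(d_km / radius)
    north, south = lat + dlat, lat - dlat
    if north >= 90.0:
        # North Pole is inside the circle: keep everything north of the
        # edge that is closer to the equator
        return (south, 90.0)
    if south <= -90.0:
        # South Pole is inside the circle: keep everything south of the
        # edge that is closer to the equator
        return (-90.0, north)
    return None  # an ordinary bounding box applies


print(latitude_filter_near_pole(89.0, 200.0))
print(latitude_filter_near_pole(45.15, 50.0))
```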
+ 
+ = Sorting =
+ https://issues.apache.org/jira/browse/SOLR-1297  added the ability to sort by function,
so sorting by distance is now  simply a matter of sorting using the appropriate distance function,
just  like boosting.
+ 
+ = Scoring =
+ Scoring by distance works just like any other FunctionQuery.  See the distance methods on
the FunctionQuery page for examples and method signatures.
+ 
+ = Query Parsing =
+ <!> TODO <!>
+ 
+ See https://issues.apache.org/jira/browse/SOLR-1578 and https://issues.apache.org/jira/browse/SOLR-1568
+ 
+ = Other Caveats =
+ Unless otherwise specified, all units are kilometers.
+ 
+ = Known Issues =
+ See https://issues.apache.org/jira/browse/SOLR-773 for tracking
+ 
+ = Useful References =
+  1. http://www.movable-type.co.uk/scripts/latlong.html
+  1. http://www.ibm.com/developerworks/opensource/library/j-spatial/index.html
+  1. http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
+ 
