lucene-solr-commits mailing list archives

From Apache Wiki <>
Subject [Solr Wiki] Update of "SolrAdaptersForLuceneSpatial4" by DavidSmiley
Date Thu, 28 Jun 2012 18:18:49 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SolrAdaptersForLuceneSpatial4" page has been changed by DavidSmiley:

the rest.

  == Configuration ==
+ First, you must register a spatial field type in the Solr schema.xml file.  The instructions
in this whole document assume the RecursivePrefixTreeStrategy based field type used in a geospatial
+ {{{
+     <fieldType name="geo"   class="org.apache.solr.spatial.RecursivePrefixTreeFieldType"
+                spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
+                distErrPct="0.025"
+                maxDetailDist="0.001"
+             />
+ }}}
+ The XML attributes are parameters for configuring the field type:
+  * spatialContextFactory: If polygon or WKT formatted shape support is needed, use the JTS
based class as shown above; otherwise this can be omitted.  The JTS jar file must be on Solr's
classpath as well.
+  * distErrPct="0.025": When indexing shapes other than points, this specifies the default
precision of a shape's area, which is essentially pixelated on an indexed grid.  The number is
approximately a fraction of the shape's average radius.  The parameter name suggests a percentage,
but that is a misnomer: it is a fraction between 0 and 1.  The closer this number is to zero,
the more index space a shape takes up and the longer it takes to index.
+  * maxDetailDist="0.001": The highest level of detail indexed, expressed in kilometers,
i.e. the minimum required precision.  The actual detail will be even more precise than this,
and it will be higher towards the poles.  Internally, this number is used to derive a "maxLevels"
trie length for the trie encoding used, which is logged at startup.
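+ As a rough illustration of how distErrPct and maxDetailDist interact, here is a minimal Python
sketch.  It assumes the tolerance is simply the fraction times the shape's average radius, floored
at maxDetailDist.  The function name and the flooring rule are assumptions made for illustration,
not the module's actual code.

```python
def shape_tolerance_km(avg_radius_km, dist_err_pct=0.025, max_detail_dist_km=0.001):
    """Estimate the grid tolerance (km) for indexing a non-point shape.

    Assumption: tolerance = dist_err_pct * average radius, and never finer
    than max_detail_dist_km. This is a sketch, not the real implementation.
    """
    # Despite its name, distErrPct is a fraction between 0 and 1.
    if not 0.0 <= dist_err_pct <= 1.0:
        raise ValueError("dist_err_pct must be a fraction between 0 and 1")
    if avg_radius_km < 0 or max_detail_dist_km <= 0:
        raise ValueError("distances must be positive")
    return max(dist_err_pct * avg_radius_km, max_detail_dist_km)


if __name__ == "__main__":
    # A shape with a 100 km average radius versus a very small 10 m shape.
    print(shape_tolerance_km(100.0))
    print(shape_tolerance_km(0.01))
```

+ Under these assumptions, a smaller distErrPct yields a finer tolerance, which is why shapes
take more space and more time to index as the value approaches zero.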
+ There are other parameters not yet documented as they are more obscure, such as those for
non-geo use, other distance calculation formulas, default units other than km, and internal
trie encodings other than geohash.
+ And finally, specify a field that uses this field type:
+ {{{   <field name="geo"  type="geo"  indexed="true" stored="true"  multiValued="true" />  }}}
+ A key feature of the new spatial module is multi-value support, but you aren't required to
declare the field multiValued if it holds only one value per document.
  == Indexing ==
+ Points are indexed just as they are in Solr 3 spatial:
+ {{{	<field name="geo">43.17614,-90.57341</field> }}}
+ If a comma is omitted, then it is in x-y (lon-lat) order:
+ {{{	<field name="geo">-90.57341 43.17614</field> }}}
+ A lat-lon rectangle can be indexed with 4 numbers in minX minY maxX maxY order:
+ {{{	<field name="geo">-74.093 41.042 -69.347 44.558</field> }}}
+ A circle is specified like so:
+ {{{	<field name="geo">Circle(4.56,1.23  distance=7.89)</field> }}}
+ The first part of it is the center point, in either "lon lat" or "lat,lon" format, then
the distance in km.  "d" can be used to abbreviate "distance".
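+ For clients that generate these values programmatically, the formats above can be sketched as
small Python helpers.  All function names here are hypothetical.  The rectangle helper emits the
numbers in the same order as the example value above (west, south, east, north).

```python
from xml.sax.saxutils import escape, quoteattr


def point_lat_lon(lat, lon):
    """Point in "lat,lon" order (comma separated)."""
    return f"{lat},{lon}"


def point_x_y(lon, lat):
    """Point in "x y" (lon lat) order (space separated, no comma)."""
    return f"{lon} {lat}"


def rectangle(min_x, min_y, max_x, max_y):
    """Lat-lon rectangle as four space-separated numbers."""
    if min_y > max_y:
        raise ValueError("min_y must not exceed max_y")
    return f"{min_x} {min_y} {max_x} {max_y}"


def circle(lat, lon, distance_km):
    """Circle with a "lat,lon" center and the abbreviated "d" distance key."""
    return f"Circle({lat},{lon} d={distance_km})"


def field_xml(name, value):
    """Wrap a shape string in a Solr XML <field> element, escaping as needed."""
    return f"<field name={quoteattr(name)}>{escape(value)}</field>"


if __name__ == "__main__":
    print(field_xml("geo", point_lat_lon(43.17614, -90.57341)))
    print(field_xml("geo", rectangle(-74.093, 41.042, -69.347, 44.558)))
    print(field_xml("geo", circle(4.56, 1.23, 7.89)))
```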
+ For polygons, use the WKT standard (Well Known Text) like so:
+ {{{	 <field name="geo">POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30))</field> }}}
+ In WKT, coordinates are in "x y" (lon lat) order, and successive vertices are separated
by commas.
+ == Shape / Polygon / WKT notes ==
+  * Only the Polygon and MultiPolygon WKT types have been tested.  GeometryCollection will not
work, but the others should in theory.  Holes in polygons haven't been tested but may work.
+  * The implementation doesn't support WKT that encompasses a pole.  The only shape that
can encompass a pole is a Circle, although technically a longitude-wrapping (-180 to +180)
lat-lon box that touches a pole does too.
+  * In polygons and other WKT, each vertex must differ by less than 180 degrees in longitude
from the vertex before it, or else the edge will be interpreted as going the wrong way around
the globe.  Dateline crossing is supported.
+  * All input coordinates are normalized into the standard geospatial lat-lon boundaries.
 So, -184 longitude becomes +176, for example.  Both +180 and -180 are kept distinct.
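+ The longitude rules in these notes can be checked on the client side before indexing.  The
following Python sketch is an illustration only: the function names are hypothetical, the
wrap-around arithmetic is an assumption about how normalization behaves, and only longitude is
handled.

```python
def normalize_lon(lon):
    """Wrap a longitude into [-180, 180]; -180 and +180 are left as given."""
    if -180.0 <= lon <= 180.0:
        return lon
    return (lon + 180.0) % 360.0 - 180.0


def check_vertex_spans(vertices):
    """Flag edges whose endpoints differ by 180 degrees or more in longitude.

    Per the notes above, such an edge would be read as going the other way
    around the globe, so this sketch treats it as an error.
    """
    for (x1, _), (x2, _) in zip(vertices, vertices[1:]):
        if abs(x2 - x1) >= 180.0:
            raise ValueError(f"edge from lon {x1} to lon {x2} spans 180 degrees or more")


def wkt_polygon(vertices):
    """Build a WKT POLYGON from (lon, lat) pairs, closing the ring if needed."""
    ring = list(vertices)
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three vertices")
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT rings repeat the first vertex at the end
    check_vertex_spans(ring)
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


if __name__ == "__main__":
    print(normalize_lon(-184))
    print(wkt_polygon([(-10, 30), (-40, 40), (-10, -20), (40, 20), (0, 0)]))
```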
  == Search ==
+ Searching with the new spatial module differs significantly from Solr 3 spatial.
 Here is a Solr filter query parameter for a lat-lon bounding box:
+ {{{	fq={!needScore=false}geo:"Intersects(-74.093 41.042 -69.347 44.558)"  }}}
+ The needScore local-param is optional, but it provides an optimization hint that should be
used when the new spatial module is used in a filter query.  Notice that the query uses the
standard default Lucene query parser and its fielded-query syntax, in which a field name is
followed by a colon.  The spatial operation and shape are provided in the double-quotes.
Just use the Intersects operation for now, as the others aren't well supported.  The contents
of the parentheses are a shape in the very same format used when indexing.
+ Keep in mind that the query shape will by default have a non-zero precision of 0.025 (2.5%),
calculated in the same way that distErrPct is on the field type declaration.  Here is an example
polygon query setting it to 0:
+ {{{	fq={!needScore=false}geo:"Intersects(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30))) distPrec=0" }}}
+ For skinny snake-like polygons, this is often desired.
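+ A client could assemble such filter queries with a small helper.  The Python sketch below is
one possible approach: spatial_fq is a hypothetical name, and the only assumption beyond the
examples above is that the value is URL-encoded before being sent to Solr.

```python
from urllib.parse import urlencode


def spatial_fq(field, shape, dist_prec=None):
    """Build an Intersects filter query value for the given field and shape.

    shape uses the same textual format as at indexing time; dist_prec, when
    given, is appended as distPrec inside the quoted expression.
    """
    if '"' in shape:
        raise ValueError("shape must not contain double quotes")
    expr = f"Intersects({shape})"
    if dist_prec is not None:
        expr += f" distPrec={dist_prec}"
    # The doubled braces produce the literal {!needScore=false} local-param.
    return f'{{!needScore=false}}{field}:"{expr}"'


if __name__ == "__main__":
    bbox = spatial_fq("geo", "-74.093 41.042 -69.347 44.558")
    poly = spatial_fq(
        "geo",
        "POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30))",
        dist_prec=0,
    )
    print(bbox)
    print(poly)
    # URL-encode the parameters for an HTTP request to Solr.
    print(urlencode({"q": "*:*", "fq": poly}))
```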
+ The search results presently show the field value in "x y" order, but in the future it will
be "y,x" order for a geospatial context.  Also note that, at least for now, the returned point
is rounded to the maxDetailDist field type configuration parameter, so it won't be precisely
the same as the value given at indexing time.  This will probably be changed in the future.
  == Final Notes ==
+ This documentation is in-progress.  Distance sorting and relevancy boosting are not yet
