lucene-solr-user mailing list archives

From David Smiley <david.w.smi...@gmail.com>
Subject Re: Impact/Performance of maxDistErr
Date Wed, 30 May 2018 12:43:25 GMT
I suggest using the "Intersects" spatial predicate when either the indexed
data is all points or the query is a point.  In those cases it's
semantically equivalent to "Contains", and the algorithm is much faster.
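
As a rough illustration of the predicate swap suggested above, here is a small sketch that builds the filter query both ways. The field name location_grpt comes from the quoted schema; the base URL, core name, coordinates, and helper names are made up for the example. The point is written in WKT form, POINT(x y), with a space rather than a comma.

```python
from urllib.parse import urlencode

# Only the two predicates discussed in this thread are accepted here.
ALLOWED_PREDICATES = {"Intersects", "Contains"}


def spatial_filter(field: str, predicate: str, x: float, y: float) -> str:
    """Build an fq value like {!field f=geo}Intersects(POINT(x y))."""
    if predicate not in ALLOWED_PREDICATES:
        raise ValueError(f"unsupported predicate: {predicate}")
    # WKT points are "POINT(x y)": x is longitude, y is latitude.
    return f"{{!field f={field}}}{predicate}(POINT({x} {y}))"


def select_url(base_url: str, field: str, x: float, y: float) -> str:
    """Build a /select URL that filters with Intersects on a point."""
    params = {"q": "*:*", "fq": spatial_filter(field, "Intersects", x, y)}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core URL and coordinates, for illustration only.
    base = "http://localhost:8983/solr/mycore"
    print(spatial_filter("location_grpt", "Contains", 10.5, 54.3))
    print(spatial_filter("location_grpt", "Intersects", 10.5, 54.3))
    print(select_url(base, "location_grpt", 10.5, 54.3))
```

With a point as the query shape, both filters should match the same documents; only the predicate name in the fq value changes.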

On Wed, May 30, 2018 at 3:25 AM Jens Viebig <jens.viebig@vitec.com> wrote:

> Thanks for the detailed answer David, that helps a lot to understand!
> Best Regards
>
> Jens
>
> P.S. Currently the only search we are doing on the polygon is
> Contains(POINT(x,y))
>
>
> On 29.05.2018 at 13:30, David Smiley wrote:
>
> Hello Jens,
> With solr.RptWithGeometrySpatialField, you always get an accurate result
> thanks to the "WithGeometry" part.  The "Rpt" part is a grid index, and
> most of the parameters pertain to that.  maxDistErr controls the highest
> resolution grid.  No shape will be indexed to a higher resolution than this,
> though it may be indexed at a coarser resolution depending on distErrPct.  The
> configuration you chose initially (that turned out to be slow for you) was
> a meter, and then you changed it to a kilometer and got fast indexing
> results.  I figure your indexed shapes are on average about a
> kilometer in size (give or take an order of magnitude).  It's hard to guess
> how your query shapes compare to your indexed shapes as there are multiple
> possibilities that could yield similar query performance when changing
> maxDistErr so much.
>
> The bottom line is that you should dial up maxDistErr as much as you can
> get away with, which is as long as query performance stays good.  So you
> did the right thing :-).  That number will probably be a distance somewhat
> less than the average indexed shape diameter, or average query shape
> diameter, whichever is greater.  Perhaps 1/10th smaller, if I had to pick.
> The default setting, I think a meter, is probably not a good default for
> this field type.
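
The rule of thumb above can be written out as simple arithmetic. This is only a sketch: the function name and the sample diameters are invented, and because "1/10th smaller" can be read either as "about 10% smaller" or as "about a tenth of the size", the scaling factor is left as an explicit argument rather than fixed.

```python
def suggest_max_dist_err(avg_indexed_diameter_km: float,
                         avg_query_diameter_km: float,
                         fraction: float) -> float:
    """Scale the larger of the two average diameters by `fraction`.

    `fraction` is an assumption supplied by the caller: 0.9 reads the
    advice as "10% smaller", 0.1 reads it as "a tenth of the size".
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    reference_km = max(avg_indexed_diameter_km, avg_query_diameter_km)
    return reference_km * fraction


if __name__ == "__main__":
    # Hypothetical numbers: polygons about 1 km across, point queries
    # (a point has no extent, so its diameter is 0).
    print(suggest_max_dist_err(1.0, 0.0, fraction=0.9))
    print(suggest_max_dist_err(1.0, 0.0, fraction=0.1))
```

Either way the result is only a starting value in kilometers; the real check is whether query performance stays good after re-indexing with it.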
>
> Note you could also try increasing distErrPct some, maybe to as much as
> 0.25, though I wouldn't go much higher, as it may yield gridded shapes that
> are so coarse as to not have interior cells.  Depending on how your query
> shapes and indexed shapes typically compare to each other, that
> may be significant or may not be.  If the indexed shapes are often much
> larger than your query shape then it's significant.
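
To make the interplay of distErrPct and maxDistErr more concrete, here is a simplified model, not the actual Lucene/Solr computation. It assumes the per-shape distance error is distErrPct times the shape's "radius" (roughly, bounding-box center to corner), that the grid never goes finer than maxDistErr, and it ignores the fact that real cell sizes snap to discrete grid levels. The function names and the 0.5 km radius are invented for the example.

```python
import math


def effective_cell_size_km(shape_radius_km: float,
                           dist_err_pct: float,
                           max_dist_err_km: float) -> float:
    """Approximate finest cell size used when gridding one shape.

    Simplified assumption: the shape asks for cells of about
    dist_err_pct * radius, and max_dist_err_km is a floor on cell size.
    """
    return max(max_dist_err_km, dist_err_pct * shape_radius_km)


def approx_cells_across(shape_radius_km: float,
                        dist_err_pct: float,
                        max_dist_err_km: float) -> int:
    """Roughly how many finest-level cells span the shape's diameter."""
    cell_km = effective_cell_size_km(shape_radius_km, dist_err_pct,
                                     max_dist_err_km)
    return math.ceil(2 * shape_radius_km / cell_km)


if __name__ == "__main__":
    radius_km = 0.5  # hypothetical shape about 1 km across
    for pct, max_err in [(0.15, 0.001), (0.25, 0.001), (0.15, 1.0)]:
        cells = approx_cells_across(radius_km, pct, max_err)
        print(f"distErrPct={pct} maxDistErr={max_err} km -> "
              f"about {cells} cells across")
```

Under this model, the fewer cells span a shape, the less likely any cell lies wholly inside it, which is the "no interior cells" situation described above. It also suggests why maxDistErr="1" indexes quickly for shapes around a kilometer across: each one is covered by very few cells.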
>
> ~ David
>
> On Fri, May 25, 2018 at 6:59 AM Jens Viebig <jens.viebig@vitec.com> wrote:
>
>> Hello,
>>
>> We are indexing a polygon with 4 points (non-rectangular, field-of-view
>> of a camera) in a RptWithGeometrySpatialField alongside some more fields,
>> to perform searches that check if a point is within this polygon.
>>
>> We started using the default configuration found in several examples
>> online:
>>
>> <fieldType name="location_grpt" class="solr.RptWithGeometrySpatialField"
>>            spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
>>            geo="true" distErrPct="0.15" maxDistErr="0.001"
>>            distanceUnits="kilometers" />
>>
>> We discovered that with this setting the indexing (soft commit) speed is
>> very slow.  For 10000 documents it takes several minutes to finish the
>> commit.
>>
>> If we disable this field, indexing + soft commit takes only 3 seconds for
>> 10000 docs.  If we set maxDistErr to 1, indexing takes around 5 seconds, a
>> huge performance gain compared to the several minutes we had before.
>>
>> I tried to find out from the documentation what the impact of
>> "maxDistErr" is on search results but didn't find an in-depth explanation.
>> In our tests, the search results still seem to be very accurate
>> even if the area covered by the polygon is less than 1 km, and search speed
>> did not suffer.
>>
>> So I would love to learn more about the differences between
>> maxDistErr="0.001" and maxDistErr="1" on a RptWithGeometrySpatialField, and
>> what problems we could run into with the bigger value.
>>
>> Thanks
>> Jens
>>
>> Jens Viebig
>> Software Development
>> MAM Products
>>
>> T. +49-(0)4307-8358-0
>> E. jens.viebig@vitec.com
>> http://www.vitec.com
>>
>> --
>> VITEC GmbH, 24223 Schwentinental
>> Geschäftsführer/Managing Director: Philippe Wetzel
>> HRB Plön 1584 / Steuernummer: 1929705211 / VATnumber: DE134878603
>>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
