lucene-dev mailing list archives

From: Mikhail Khludnev
Subject: Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
Date: Thu, 08 Dec 2011 18:10:26 GMT

Have you tried basic profiling or sampling? Just take a few thread
dumps with jstack. If the code is so greedy for CPU, you'll have it in a


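The sampling idea above can also be sketched in-process, for cases where running jstack against the indexing JVM is inconvenient. This is only an illustration: the class StackSampler and its methods are hypothetical names, not part of Solr or of the patch, and the only JDK call it relies on is Thread.getAllStackTraces(). It takes a few snapshots and counts which method is on top of each RUNNABLE thread's stack; a method that dominates the counts is a likely CPU hot spot.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: a poor man's sampling profiler.
class StackSampler {

    /** Takes `samples` snapshots, `intervalMs` apart, and counts the top frame of each RUNNABLE thread. */
    static Map<String, Integer> sample(int samples, long intervalMs) throws InterruptedException {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
                Thread t = e.getKey();
                StackTraceElement[] stack = e.getValue();
                // Skip the sampling thread itself, idle threads, and threads with no Java frames.
                if (t == Thread.currentThread()
                        || t.getState() != Thread.State.RUNNABLE
                        || stack.length == 0) {
                    continue;
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            Thread.sleep(intervalMs);
        }
        return hits;
    }

    /** Demo only: starts a daemon thread that burns CPU until interrupted. */
    static Thread startBusyThread() {
        Thread busy = new Thread(() -> {
            double x = 0;
            while (!Thread.currentThread().isInterrupted()) {
                x += Math.sqrt(x + 1);
            }
        }, "busy-demo");
        busy.setDaemon(true);
        busy.start();
        return busy;
    }

    public static void main(String[] args) throws Exception {
        Thread busy = startBusyThread();
        Map<String, Integer> hits = sample(10, 50L);
        busy.interrupt();
        hits.forEach((frame, count) -> System.out.println(count + "  " + frame));
    }
}
```

In a real run the sampler would be pointed at the indexing process while documents with many points are being added; the busy thread here only exists so the demo has something to observe.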
On Thu, Dec 8, 2011 at 8:57 PM, Srikanth Kallurkar (Commented) (JIRA) wrote:

> Srikanth Kallurkar commented on SOLR-2155:
> ------------------------------------------
> In my use case, I have a large number of lat-lons for each document, on
> the order of 2K lat-lon pairs. Since we started using the geohash
> prefix filter, indexing time has degraded significantly, by a factor of
> about 2-3. Are there any suggestions for speeding up the indexing
> process? I was trying to read the comments here, but am not sure if any
> index-time caching mechanism is used (or could be used) to look up geohashes.
> Thanks,
> Srikanth
> > Geospatial search using geohash prefixes
> > ----------------------------------------
> >
> >                 Key: SOLR-2155
> >                 URL:
> >             Project: Solr
> >          Issue Type: Improvement
> >            Reporter: David Smiley
> >         Attachments: GeoHashPrefixFilter.patch,
> GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch,
> SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch,
> SOLR.2155.p3.patch, SOLR.2155.p3tests.patch,,
>, Solr2155-for-1.0.2-3.x-port.patch
> >
> >
> > There currently isn't a solution in Solr for doing geospatial filtering
> on documents that have a variable number of points.  This scenario occurs
> when there is location extraction (e.g. via a "gazetteer") occurring on free
> text.  None, one, or many geospatial locations might be extracted from any
> given document and users want to limit their search results to those
> occurring in a user-specified area.
> > I've implemented this by furthering the GeoHash based work in
> Lucene/Solr with a geohash prefix based filter.  A geohash refers to a
> lat-lon box on the earth.  Each successive character added further
> subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of
> the geohash) grid.  The first step in this scheme is figuring out which
> geohash grid squares cover the user's search query.  I've added various
> extra methods to GeoHashUtils (and added tests) to assist in this purpose.
>  The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses
> these geohash prefixes to skip to relevant grid squares
> in the index.  Once a matching geohash grid is found, the points therein
> are compared against the user's query to see if they match.  I created an
> abstraction GeoShape extended by subclasses named PointDistance... and
> CartesianBox.... to support different queried shapes so that the filter
> need not care about these details.
> > This work was presented at LuceneRevolution in Boston on October 8th.
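The subdivision described in the issue can be illustrated with the standard geohash encoding. The sketch below is not the code from the attached patches and does not use GeoHashUtils; the class GeohashSketch and its methods encode and inCell are hypothetical names chosen for illustration. It bisects longitude and latitude alternately and emits one base-32 character per five bits, so each added character picks one of 32 sub-cells (8x4 or 4x8, depending on which axis gets three of the five bits).

```java
// Hypothetical illustration of standard geohash encoding; not the patch's GeoHashUtils.
class GeohashSketch {
    // Standard geohash base-32 alphabet (digits plus letters, omitting a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a point as a geohash of the given length by repeated bisection. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean lonTurn = true; // bits alternate: longitude first, then latitude
        int bitCount = 0, value = 0;
        while (hash.length() < length) {
            if (lonTurn) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else            { value = value << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else            { value = value << 1;       latMax = mid; }
            }
            lonTurn = !lonTurn;
            if (++bitCount == 5) { // five bits per character: a 32-cell subdivision
                hash.append(BASE32.charAt(value));
                bitCount = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    /** Under this encoding, a point falls in a grid cell when the cell's geohash is a prefix of the point's. */
    static boolean inCell(String pointHash, String cellHash) {
        return pointHash.startsWith(cellHash);
    }

    public static void main(String[] args) {
        String point = encode(57.64911, 10.40744, 8);
        String cell = encode(57.64911, 10.40744, 3);
        System.out.println(point + " in cell " + cell + ": " + inCell(point, cell));
    }
}
```

Because a cell's geohash is a prefix of every point geohash inside it, points in the same cell sort next to each other as index terms, which is presumably what lets a prefix-based filter skip directly to the relevant grid squares.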

Sincerely yours
Mikhail Khludnev
Grid Dynamics
tel. 1-415-738-8644
Skype: mkhludnev
