lucene-dev mailing list archives

From William Bell <>
Subject Re: custom ValueSource for decoding geohash into lat & lon
Date Thu, 10 Mar 2011 23:21:23 GMT
OK. But I am concerned that you are trying to bite off more than can
be done easily. The sample call is:


Notice that geomultidist() needs another field called storemv right
now that is bar delimited. I tried to pull out the lat,long from
geohash, but Dave stores the geohash values in Ngram for the purpose
of filtering (I believe).

 Here are the issues as I see them:

1. ValueSource does not support multiValued fields. Either it would
have to be extended, or solr.GeoHashField would have to be extended,
to support this. The way the spatial stuff works now is that the lat
and long are created as separate fields, and then the fc can get the
matching values from the cache. I think this is totally sub-optimal.
There should be a generic ValueSource that works on multiValued
fields directly.
It would be extremely useful if this all happened behind the scenes:

<field name="store_lat_lon" type="geohash" indexed="true"
stored="true" multiValued="true" />

My code is dependent on the following to do distance quickly (using
<arr name="storemv">
<str> 39.90923,-86.19389|42.37577,-72.50858</str>

storing: 39.90923,-86.19389 and 42.37577,-72.50858
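
A rough sketch of how a bar-delimited value like the one above could be
parsed and used for a distance calculation. This is plain Java, not the
actual geomultidist() code. The class and method names are made up for
illustration, and picking the nearest point is an assumption about what
the function is meant to return.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or the SOLR-2155 patch.
final class BarDelimitedPoints {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    /** Parses "lat,lon|lat,lon|..." into a list of {lat, lon} pairs. */
    static List<double[]> parse(String value) {
        List<double[]> points = new ArrayList<>();
        for (String pair : value.trim().split("\\|")) {
            String[] parts = pair.split(",");
            points.add(new double[] {
                Double.parseDouble(parts[0].trim()),
                Double.parseDouble(parts[1].trim())
            });
        }
        return points;
    }

    /** Great-circle distance in kilometers using the haversine formula. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding before asin.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Assumed semantics: distance from (lat, lon) to the nearest stored point. */
    static double minDistanceKm(String value, double lat, double lon) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] p : parse(value)) {
            best = Math.min(best, haversineKm(lat, lon, p[0], p[1]));
        }
        return best;
    }
}
```

The whole string has to be split and parsed for every document on every
request, which is the cost that point 2 below is worried about.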

Internally it could be stored as (and converted using the type):

store_lat_long_lat_1 = 39.90923
store_lat_long_lon_1 = -86.19389
store_lat_long_lat_2 = 42.37577
store_lat_long_lon_2 = -72.50858
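
As an illustration only, the expansion into numbered fields could look
like the sketch below. The naming scheme follows the list above; the
class and method are hypothetical and do not correspond to anything in
solr.GeoHashField.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper showing the <base>_lat_<n> / <base>_lon_<n> naming.
final class IndexedPointFields {
    /** Expands {lat, lon} pairs into numbered single-valued fields (1-based). */
    static Map<String, Double> expand(String baseName, double[][] points) {
        Map<String, Double> fields = new LinkedHashMap<>();
        for (int i = 0; i < points.length; i++) {
            int n = i + 1;
            fields.put(baseName + "_lat_" + n, points[i][0]);
            fields.put(baseName + "_lon_" + n, points[i][1]);
        }
        return fields;
    }
}
```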

2. Using ValueSource with one value is fast, and splitting it this way
might be a lot slower to calculate distances. It is convenient, but
could be slow. It might be better to just have solr.GeoHashField
append to the internal field so that it can use ValueSource directly.

Use an internal field that uses bars internally:

store_lat_long_bar =  39.90923,-86.19389|42.37577,-72.50858

For each lat,long value
    - Calculate geohash and Ngram store
    - Append to the internal field "store_lat_long_bar" based on the field name

Option 2 is easier and makes it supportable now without waiting for
redesign of ValueSource.
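
The two per-value steps in option 2 might be sketched as follows. This
is standalone Java under a few assumptions: the class name is invented,
the geohash encoder is the standard base-32 algorithm rather than Solr's
own utility, and the Ngram indexing of the geohash is left out because
that part would be handled by the field type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical index-time helper, not code from solr.GeoHashField.
final class BarFieldBuilder {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Standard geohash encoding: bits alternate longitude, latitude. */
    static String geohash(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean evenBit = true; // even bits refine longitude
        int bits = 0, value = 0;
        while (hash.length() < length) {
            if (evenBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else { value <<= 1; lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else { value <<= 1; latMax = mid; }
            }
            evenBit = !evenBit;
            if (++bits == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    /** Step 1: one geohash per lat,long value. */
    static List<String> geohashes(double[][] points, int length) {
        List<String> out = new ArrayList<>();
        for (double[] p : points) out.add(geohash(p[0], p[1], length));
        return out;
    }

    /** Step 2: the value for the internal bar-delimited field. */
    static String barValue(double[][] points) {
        StringBuilder sb = new StringBuilder();
        for (double[] p : points) {
            if (sb.length() > 0) sb.append('|');
            sb.append(p[0]).append(',').append(p[1]);
        }
        return sb.toString();
    }
}
```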

On Thu, Mar 10, 2011 at 2:16 PM, Smiley, David W. <> wrote:
> I'm looking for validation of my approach to geospatial sorting from committers.
> I'm starting work on implementing sorting for my geohash based filter code in
> The existing GeohashHaversineFunction uses ValueSources based on the natural string
> value in the index, StrFieldSource, and it decodes them each pass through.  This is obviously
> sub-optimal.  So I think a remedy is to implement my own ValueSource extending MultiValueSource
> that will decode the geohash into a pair of doubles on initialization.  It would do this
> using a CachedArrayCreator implementation of my design.  I don't think I can/should use VectorValueSource
> since that one is predicated on being composed of multiple other value sources which is not
> my scenario.  Unfortunately my proposed ValueSource subclass cannot simultaneously subclass
> both MultiValueSource and FieldCacheSource but the latter doesn't appear to really be necessary.
> Actually I'm surprised MultiValueSource isn't an interface since it only has an abstract
> Another aspect to this problem is that geohashes support multiple points per document.
> I intend to subclass DocValues() with a method that will return an array of simple objects
> holding the pair.  If someone has hints as to some issues/problems with this approach then
> please let me know.
> Bill Bell, if you're reading this, I know you did a patch attached to SOLR-2155 for sorting
> but it uses separate fields to hold the lat & lon for sorting and I'm trying to fix this.
> ~ David Smiley
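
For reference, the decode-once idea in the quoted message could be
sketched roughly like this. It is plain Java and does not use
MultiValueSource, CachedArrayCreator, or any other Solr class. The class
name is invented, and it assumes one geohash per document, so the
multiple-points case is not covered.

```java
// Hypothetical stand-in for a cache that decodes each geohash only once.
final class DecodedGeohashCache {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";
    private final double[] lats;
    private final double[] lons;

    /** Decodes every document's geohash up front; array index = doc id. */
    DecodedGeohashCache(String[] geohashByDoc) {
        lats = new double[geohashByDoc.length];
        lons = new double[geohashByDoc.length];
        for (int doc = 0; doc < geohashByDoc.length; doc++) {
            double[] p = decode(geohashByDoc[doc]);
            lats[doc] = p[0];
            lons[doc] = p[1];
        }
    }

    double lat(int doc) { return lats[doc]; }
    double lon(int doc) { return lons[doc]; }

    /** Standard geohash decoding; returns the cell center as {lat, lon}. */
    static double[] decode(String hash) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        boolean evenBit = true; // even bits refine longitude
        for (int i = 0; i < hash.length(); i++) {
            int value = BASE32.indexOf(hash.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException("bad geohash: " + hash);
            }
            for (int mask = 16; mask != 0; mask >>= 1) {
                boolean bit = (value & mask) != 0;
                if (evenBit) {
                    double mid = (lonMin + lonMax) / 2;
                    if (bit) lonMin = mid; else lonMax = mid;
                } else {
                    double mid = (latMin + latMax) / 2;
                    if (bit) latMin = mid; else latMax = mid;
                }
                evenBit = !evenBit;
            }
        }
        return new double[] {(latMin + latMax) / 2, (lonMin + lonMax) / 2};
    }
}
```

With the doubles held in arrays, a distance function only does array
lookups per document instead of decoding the string on each pass.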
