lucene-dev mailing list archives

From "Chris Male (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4208) Spatial distance relevancy should use score of 1/distance
Date Sat, 28 Jul 2012 14:39:34 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424358#comment-13424358 ]

Chris Male commented on LUCENE-4208:
------------------------------------

Having thought about this more, I think the best way forward is to emulate free-text queries
and have a {{SpatialSimilarity}} abstraction.  I'm not sure of the exact nature of the API
for this, but I think there are times when using 1/x is sufficient and there are probably times
when a more convoluted algorithm fits.  We should allow the consumer to control which one they
choose.

I think the Similarity should be given the Query Shape, the matched docID and the current
SpatialOperation as a minimum.  I'd like to somehow see a way to also pass in a pre-computed
distance (for Queries that compute it as part of their matching) and possibly the matched
grid hash for anything using the PrefixTrees.  We might have to have subclasses for those,
or maybe a Command or something, I'm not sure.
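
To make the inputs listed above concrete, here is a minimal sketch in plain Java. Nothing in it
is an existing Lucene API: {{QueryShape}}, {{SpatialOp}}, {{SpatialMatch}} and
{{ReciprocalDistanceSimilarity}} are hypothetical names, and the optional pre-computed distance
is modelled with {{OptionalDouble}} purely for illustration.  The clamp to a minimum distance
and the score of 0 when no distance is available are also assumptions.

```java
import java.util.OptionalDouble;

// Hypothetical stand-ins for the spatial module's own shape and operation types.
interface QueryShape {}

enum SpatialOp { INTERSECTS, IS_WITHIN }

// Hypothetical bundle of what the similarity is given for one matched document.
final class SpatialMatch {
    final QueryShape queryShape;
    final int docId;
    final SpatialOp operation;
    // Present only when the query already computed a distance while matching.
    final OptionalDouble precomputedDistance;

    SpatialMatch(QueryShape queryShape, int docId, SpatialOp operation,
                 OptionalDouble precomputedDistance) {
        this.queryShape = queryShape;
        this.docId = docId;
        this.operation = operation;
        this.precomputedDistance = precomputedDistance;
    }
}

// Hypothetical abstraction: one score per matched document.
interface SpatialSimilarity {
    float score(SpatialMatch match);
}

// The simple 1/x case: score = constant / distance.
final class ReciprocalDistanceSimilarity implements SpatialSimilarity {
    private final double constant;
    private final double minDistance;

    ReciprocalDistanceSimilarity(double constant, double minDistance) {
        this.constant = constant;
        this.minDistance = minDistance;
    }

    @Override
    public float score(SpatialMatch match) {
        if (!match.precomputedDistance.isPresent()) {
            // Assumption: without a distance there is no distance-based score.
            return 0f;
        }
        // Assumption: clamp tiny distances so a distance of 0 does not divide by zero.
        double d = Math.max(match.precomputedDistance.getAsDouble(), minDistance);
        return (float) (constant / d);
    }
}
```

A consumer wanting a more involved algorithm would supply a different {{SpatialSimilarity}}
implementation.  The grid hash for PrefixTree matches is left out here, since how to pass it is
still open.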

Other benefits:
- We immediately open up the ability to have more complex similarity scores based on overlap
percentage or anything really.
- It is plausible that a SpatialSimilarity might use a cache of indexed Shapes to facilitate
more complex algorithms.  By having this abstraction we offload the caching from the main
API.
- It is also plausible that a SpatialSimilarity instance could be misused to cache calculated
distances if the consumer so wanted.
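
As a rough illustration of the caching idea in the list above, the sketch below keeps computed
distances inside the similarity object rather than in the main API.  The class name
{{CachingDistanceSimilarity}} and the {{distanceForDoc}} lookup are hypothetical, and 1/distance
is used only as a placeholder score.  Since Lucene docIDs are only meaningful relative to a
particular reader, a real cache would need to be scoped accordingly; this sketch ignores that.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

// Hypothetical similarity that owns its own per-document distance cache.
final class CachingDistanceSimilarity {
    // Assumption: guards against dividing by a distance of 0.
    private static final double MIN_DISTANCE = 1e-9;

    // Hypothetical, possibly expensive, distance computation for a docID.
    private final IntToDoubleFunction distanceForDoc;
    private final Map<Integer, Double> cache = new HashMap<>();

    CachingDistanceSimilarity(IntToDoubleFunction distanceForDoc) {
        this.distanceForDoc = distanceForDoc;
    }

    float score(int docId) {
        // Compute the distance at most once per docID, then reuse it.
        double d = cache.computeIfAbsent(docId, id -> distanceForDoc.applyAsDouble(id));
        return (float) (1.0 / Math.max(d, MIN_DISTANCE));
    }
}
```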

I think we should consider whether we want SpatialSimilarities to also be given the current
IndexReader (and so be able to use it in any caches or other lookups) or whether we want them
to be IR independent.

We will also need some custom Queries to actually make use of the SpatialSimilarity.  Need
to think this one through a little.


                
> Spatial distance relevancy should use score of 1/distance
> ---------------------------------------------------------
>
>                 Key: LUCENE-4208
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4208
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>             Fix For: 4.0
>
>
> The SpatialStrategy.makeQuery() at the moment uses the distance as the score (although
> some strategies -- TwoDoubles if I recall might not do anything which would be a bug).  The
> distance is a poor value to use as the score because the score should be related to relevancy,
> and the distance itself is inversely related to that.  A score of 1/distance would be nice.
> Another alternative is earthCircumference/2 - distance, although I like 1/distance better.
> Maybe use a different constant than 1.
> Credit: this is Chris Male's idea.
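
For reference, the two scoring formulas in the quoted description can be sketched as below.
The class and method names are hypothetical, and so are the units: kilometres and an equatorial
circumference of roughly 40,075 km are assumed only to make the numbers concrete.  Both
functions decrease as distance grows, which is the property raw distance lacks as a score.

```java
// Hypothetical helper comparing the two proposed distance-to-score mappings.
final class DistanceScores {
    // Assumption: approximate equatorial circumference of the Earth in kilometres.
    static final double EARTH_CIRCUMFERENCE_KM = 40_075.0;

    // constant / distance; the caller must not pass a distance of 0.
    static double reciprocal(double distanceKm, double constant) {
        return constant / distanceKm;
    }

    // earthCircumference/2 - distance; linear in distance, largest at distance 0.
    static double halfCircumferenceMinus(double distanceKm) {
        return EARTH_CIRCUMFERENCE_KM / 2.0 - distanceKm;
    }
}
```

The reciprocal form falls off steeply near the query point and flattens out at large
distances, while the linear form loses the same amount of score per unit of distance
everywhere.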


        


