lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5779) Improve BBox AreaSimilarity algorithm to consider lines and points
Date Tue, 08 Jul 2014 14:17:04 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054970#comment-14054970
] 

ASF subversion and git services commented on LUCENE-5779:
---------------------------------------------------------

Commit 1608793 from [~dsmiley] in branch 'dev/trunk'
[ https://svn.apache.org/r1608793 ]

LUCENE-5714, LUCENE-5779: Enhance BBoxStrategy & Overlap similarity. Configurable docValues / index usage.
Add new ShapeAreaValueSource.

> Improve BBox AreaSimilarity algorithm to consider lines and points
> ------------------------------------------------------------------
>
>                 Key: LUCENE-5779
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5779
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spatial
>            Reporter: David Smiley
>         Attachments: LUCENE-5779__Improved_bbox_AreaSimilarity_algorithm.patch
>
>
> GeoPortal's area overlap algorithm didn't consider lines and points; they end up turning
> the score to 0.  I've thought about this for a bit and come up with an alternative scoring
> algorithm (already coded, tested, and documented).
> New Javadocs:
> {code:java}
> /**
>  * The algorithm is implemented as envelope on envelope overlays rather than
>  * complex polygon on complex polygon overlays.
>  * <p/>
>  * <p/>
>  * Spatial relevance scoring algorithm:
>  * <DL>
>  *   <DT>queryArea</DT> <DD>the area of the input query envelope</DD>
>  *   <DT>targetArea</DT> <DD>the area of the target envelope (per Lucene document)</DD>
>  *   <DT>intersectionArea</DT> <DD>the area of the intersection between the query and target envelopes</DD>
>  *   <DT>queryTargetProportion</DT> <DD>A 0-1 factor that divides the score proportion between query and target.
>  *   0.5 divides it evenly.</DD>
>  *
>  *   <DT>queryRatio</DT> <DD>intersectionArea / queryArea; (see note)</DD>
>  *   <DT>targetRatio</DT> <DD>intersectionArea / targetArea; (see note)</DD>
>  *   <DT>queryFactor</DT> <DD>queryRatio * queryTargetProportion;</DD>
>  *   <DT>targetFactor</DT> <DD>targetRatio * (1 - queryTargetProportion);</DD>
>  *   <DT>score</DT> <DD>queryFactor + targetFactor;</DD>
>  * </DL>
>  * Note: The actual computation of queryRatio and targetRatio is more complicated so that it considers
>  * points and lines. Lines have the ratio of overlap, and points are either 1.0 or 0.0 depending on whether
>  * they intersect or not.
>  * <p />
>  * Based on Geoportal's
>  * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
>  *   SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will yield a score of 0
>  * if either a line or point is compared, and it doesn't output a 0-1 normalized score (it multiplies the factors).
>  *
>  * @lucene.experimental
>  */
> {code}
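
For readers skimming the Javadoc above, here is a minimal sketch of the scoring formula it describes. It is not the committed Lucene code: the class `OverlapScoreSketch`, the `Envelope` helper, and the way lines and points are handled in `ratio` are assumptions made for illustration. The Javadoc only says that lines use the ratio of overlap and that points score 1.0 or 0.0. The sketch also ignores geodetic concerns such as dateline wrapping.

```java
/** Illustrative sketch only; names and degenerate-shape handling are assumptions. */
final class OverlapScoreSketch {

    /** Hypothetical axis-aligned envelope (not Lucene's or Spatial4j's Rectangle). */
    static final class Envelope {
        final double minX, minY, maxX, maxY;

        Envelope(double minX, double minY, double maxX, double maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        double width() { return maxX - minX; }
        double height() { return maxY - minY; }
    }

    /** Returns the intersection envelope, or null when the two are disjoint. */
    static Envelope intersect(Envelope a, Envelope b) {
        double minX = Math.max(a.minX, b.minX);
        double minY = Math.max(a.minY, b.minY);
        double maxX = Math.min(a.maxX, b.maxX);
        double maxY = Math.min(a.maxY, b.maxY);
        if (minX > maxX || minY > maxY) {
            return null;
        }
        return new Envelope(minX, minY, maxX, maxY);
    }

    /**
     * Fraction of {@code shape} covered by the intersection.
     * Assumed handling: a point is 1.0 when it intersects, a line uses
     * overlap length / line length, anything else uses the area ratio.
     */
    static double ratio(Envelope intersection, Envelope shape) {
        if (intersection == null) {
            return 0.0;
        }
        double w = shape.width();
        double h = shape.height();
        if (w == 0 && h == 0) {
            return 1.0;                               // intersecting point
        }
        if (w == 0) {
            return intersection.height() / h;         // vertical line
        }
        if (h == 0) {
            return intersection.width() / w;          // horizontal line
        }
        return (intersection.width() * intersection.height()) / (w * h);
    }

    /** score = queryRatio * proportion + targetRatio * (1 - proportion). */
    static double score(Envelope query, Envelope target, double queryTargetProportion) {
        Envelope ix = intersect(query, target);
        double queryFactor = ratio(ix, query) * queryTargetProportion;
        double targetFactor = ratio(ix, target) * (1 - queryTargetProportion);
        return queryFactor + targetFactor;
    }

    public static void main(String[] args) {
        Envelope query = new Envelope(0, 0, 10, 10);
        Envelope box = new Envelope(5, 5, 15, 15);    // overlaps a quarter of the query
        Envelope point = new Envelope(5, 5, 5, 5);    // point inside the query
        System.out.println("box:   " + score(query, box, 0.5));
        System.out.println("point: " + score(query, point, 0.5));
    }
}
```

Because the two factors are summed rather than multiplied, a point or line target still gets a non-zero score from its own ratio even though the intersection has zero area on the query side, and the result stays within 0-1.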



--
This message was sent by Atlassian JIRA
(v6.2#6252)


