lucene-java-user mailing list archives

From "david.w.smiley@gmail.com" <david.w.smi...@gmail.com>
Subject Re: Lucene Spatial Implementation for Points within Polygon.
Date Mon, 22 Dec 2014 13:49:19 GMT
Hello.

You have stated the use-case so generically that it’s not clear if you
should index the polygon set and query by the point set, or the reverse.
Generally, you should index the set that is known in advance and then query
by the other, the set that is generally not known.  Assuming this is the
case, index the stable set with RecursivePrefixTreeStrategy, *and*, for
accuracy, if that set is also the polygon set, use SerializedDVStrategy
*or* simply keep them all in-memory keyed by an identifier (call
JtsGeometry.index() on each as well) that you check against at runtime.  If
you don’t have enough RAM then you’ll do the former.  If neither set seems
to be “stable”, you could really index either, but if in doubt, index
the points.  The predicate you should use is INTERSECTS; the others are
intended for polygons against polygons (basically any non-point shape
against another non-point shape).

If your scenario is quite simple, i.e. you have a bunch of points and polygons
that you get all at once to make this computation and then that’s it (no
long-term need to query again by the same polygons or points in the
future), I suggest using JTS directly in-memory, and its PreparedGeometry
to optimize each polygon, then iterate through your points to see which
polygons they are in.  You might even use JTS's STRtree to index polygon
bounding boxes to avoid looping over all polygons.
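
As a rough illustration of that in-memory approach, here is a minimal sketch
that counts points per polygon using a cheap bounding-box rejection followed
by an exact containment test.  To stay dependency-free it uses
java.awt.geom.Path2D from the JDK as a stand-in for JTS; it is not JTS code.
The class and method names (PointsInPolygons, polygon, countPointsPerPolygon)
are made up for this example.  In a JTS version, PreparedGeometry would take
the place of Path2D for the exact test, and an STRtree query would replace
the linear scan over bounding boxes.  Note that Path2D's treatment of points
lying exactly on a boundary may differ from JTS's intersects().

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PointsInPolygons {

    /** Builds a closed polygon from an array of {x, y} vertices. */
    static Path2D.Double polygon(double[][] vertices) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(vertices[0][0], vertices[0][1]);
        for (int i = 1; i < vertices.length; i++) {
            path.lineTo(vertices[i][0], vertices[i][1]);
        }
        path.closePath();
        return path;
    }

    /** Returns counts where counts[i] is the number of points inside polygons.get(i). */
    static int[] countPointsPerPolygon(List<Path2D.Double> polygons, double[][] points) {
        // Compute every bounding box once, before looking at any point.
        List<Rectangle2D> boxes = new ArrayList<>();
        for (Path2D.Double p : polygons) {
            boxes.add(p.getBounds2D());
        }
        int[] counts = new int[polygons.size()];
        for (double[] pt : points) {
            // Linear scan over the boxes; a spatial index would narrow this
            // down to the few candidate polygons near the point.
            for (int i = 0; i < polygons.size(); i++) {
                // Cheap bounding-box rejection first, exact test only for candidates.
                if (boxes.get(i).contains(pt[0], pt[1])
                        && polygons.get(i).contains(pt[0], pt[1])) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Path2D.Double> polygons = Arrays.asList(
                polygon(new double[][] {{0, 0}, {10, 0}, {10, 10}, {0, 10}}),
                polygon(new double[][] {{20, 0}, {30, 0}, {25, 10}}));
        double[][] points = {{5, 5}, {1, 1}, {25, 3}, {21, 9}, {50, 50}};
        System.out.println(Arrays.toString(countPointsPerPolygon(polygons, points)));
    }
}
```

The point (21, 9) in the sample data falls inside the triangle's bounding box
but outside the triangle itself, which is why the exact test is still needed
after the bounding-box check.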

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Mon, Dec 22, 2014 at 12:30 AM, <Ankit.Murarka@ril.com> wrote:
>
> Hello Team,
>
> We are starting off with Lucene Spatial implementation for some of the use
> cases:
>
> A. Given "N" polygons and "M" points, find how many points lie inside
> each of the polygons.
>
> 1st Approach :
>
> For A, we indexed Polygons using WKT and using JtsSpatial strategy. I set
> the Level at 22. This has resulted in a huge number of terms. This was
> needed as I need the search to be near perfect.
>
> For querying, I used Points (supplied as WKT) using JTS again with Level at
> 22 (although I think specifying the level at query time does not make much
> difference).
>
> For this, we used "CONTAINS".  Output is coming but I am not sure if I
> am doing it the right way. Need suggestions.
>
> I am having following confusion:
>
> a.       Will CONTAINS and IS WITHIN both work in the same way for the
> given scenario? I am ruling out INTERSECTS as that scenario is not
> appropriate.
>
> b.      Second, are we missing something in getting the correct output?
>
>
> 2nd Approach : (Reversed)
>
> Indexed POINTS in WKT format.
> Passed Polygons in WKT using JTS as query and fired as INTERSECTS and
> WITHIN.
>
> In second approach, we are getting more output than the 1st approach.
>
> However, we are still not sure which is the best way to tackle this
> problem. Please suggest.
