lucene-solr-user mailing list archives

From "david.w.smiley@gmail.com" <david.w.smi...@gmail.com>
Subject Re: range types in SOLR
Date Tue, 06 May 2014 18:26:26 GMT
Hi Era,

I appreciate the scattered documentation is confusing for users.  The use
of spatial for time durations is definitely not an official way to do it;
it’s clearly a hack/trick — one that works pretty well if you know the
issues to watch out for.  So I don’t see it getting documented on the
reference guide.  But, you should be happy to know about this:
https://issues.apache.org/jira/browse/LUCENE-5648  “Watch” that issue to
stay abreast of my development on it, and the inevitable Solr FieldType to
follow, and inevitable documentation in the reference guide.  With luck
it’ll get in by 4.9.
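
To make the trick concrete, here is a minimal Python sketch (no Solr involved)
of why pairing each start with its end as a single (x, y) point avoids the
false match from the example quoted further down. The function names are made
up for illustration; the numbers are the ones from Thomas's example.

```python
from typing import List, Tuple

Period = Tuple[int, int]  # (start_year, end_year)


def naive_multivalued_hit(periods: List[Period], q_start: int, q_end: int) -> bool:
    """Mimic two independent multi-valued fields: 'start' and 'end'.

    The overlap test is applied per field, so a start from one period
    can be combined with an end from a different period.
    """
    starts = [s for s, _ in periods]
    ends = [e for _, e in periods]
    return any(s <= q_end for s in starts) and any(e >= q_start for e in ends)


def point_in_rectangle_hit(periods: List[Period], q_start: int, q_end: int) -> bool:
    """Treat each period as one point (x=start, y=end).

    A period overlaps the query when its point lies in the rectangle
    x <= q_end and y >= q_start, so start and end stay paired.
    """
    return any(s <= q_end and e >= q_start for s, e in periods)


# Hypothetical document written in 1250-1299 and again in 1700-1715.
doc = [(1250, 1299), (1700, 1715)]

print(naive_multivalued_hit(doc, 1300, 1699))   # True  (false positive)
print(point_in_rectangle_hit(doc, 1300, 1699))  # False (correct)
```

The naive version matches because 1250 satisfies the start condition and 1715
satisfies the end condition, even though they belong to different periods.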

The “Intersects(POLYGON(…))” syntax is something I suggest using only when you
have to, such as when you have a polygon or linestring, or if you are indexing
circles.  One of these days there will be a more Solr-friendly query parser,
definitely in some 4.x release.  When that happens, the Intersects(…) syntax
will be deprecated/removed in trunk/5.
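
As a rough illustration of the rectangle range style described in the quoted
March message, a small Python helper could build the query string for an
overlap search over durations indexed as (start, end) points. The field name
`written` and the 0..3000 bounds are assumptions, not anything from a real
schema, and the sketch leaves out the query buffering and maxDistErr tuning
cautioned about there.

```python
def rect_range_query(field: str, min_x: int, min_y: int, max_x: int, max_y: int) -> str:
    """Format a rectangle query in the field:["minX minY" TO "maxX maxY"] style."""
    return f'{field}:["{min_x} {min_y}" TO "{max_x} {max_y}"]'


def overlap_query(field: str, q_start: int, q_end: int,
                  lower: int = 0, upper: int = 3000) -> str:
    """Build a query for durations that overlap [q_start, q_end].

    Assumes each duration is indexed as a point with x=start and y=end.
    A duration overlaps when start <= q_end and end >= q_start, i.e. the
    point lies in the rectangle x in [lower, q_end], y in [q_start, upper].
    'lower' and 'upper' are hypothetical bounds of the indexed value space.
    """
    return rect_range_query(field, lower, q_start, q_end, upper)


print(overlap_query("written", 1300, 1699))
# written:["0 1300" TO "1699 3000"]
```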

~ David

On Tue, May 6, 2014 at 4:22 AM, Ere Maijala <ere.maijala@helsinki.fi> wrote:

> David,
>
> I made a note about your mentioning the deprecation below to take it into
> account in our software, but now that I tried to find out more about this I
> ran into some confusion since the Solr documentation regarding spatial
> searches is currently quite badly scattered and partly obsolete [1]. I'd
> appreciate some clarification on what exactly is deprecated. We're
> currently using spatial for both time duration and geographic searches, and
> in the latter we also use e.g. Intersects(POLYGON(...)) in addition. Is
> this also deprecated and if so, how should I rewrite it? Thanks!
>
> --Ere
>
> [1] It would be really nice if it was possible to find up to date
> documentation of at least all this in one place:
>
> https://cwiki.apache.org/confluence/display/solr/Spatial+Search
> https://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
> http://wiki.apache.org/solr/SpatialForTimeDurations
> https://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3C1355027722156-4025434.post@n3.nabble.com%3E
>
> On 3.3.2014 at 20.12, Smiley, David W. wrote:
>
>> The main reference for this approach is here:
>> http://wiki.apache.org/solr/SpatialForTimeDurations
>>
>>
>> The illustrations Hoss developed for the meetup presentation are great.
>> However, there are bugs in the instructions: specifically, it’s important
>> to slightly buffer the query and choose an appropriate maxDistErr.  Also,
>> it’s preferable to use the rectangle range query style of spatial
>> query (e.g. field:["minX minY" TO "maxX maxY"]) as opposed to using
>> "Intersects(minX minY maxX maxY)".  There’s no technical difference, but
>> the latter is deprecated and will eventually be removed from Solr 5 /
>> trunk.
>>
>> All this said, recognize this is a bit of a hack (one that works well).
>> There is a good chance a more ideal implementation approach is going to be
>> developed this year.
>>
>> ~ David
>>
>>
>> On 3/1/14, 2:54 PM, "Shawn Heisey" <solr@elyograg.org> wrote:
>>
>>  On 3/1/2014 11:41 AM, Thomas Scheffler wrote:
>>>
>>>> On 01.03.14 at 18:24, Erick Erickson wrote:
>>>>
>>>>> I'm not clear what you're really after here.
>>>>>
>>>>> Solr certainly supports ranges, things like time:[* TO date_spec] or
>>>>> date_field:[date_spec TO date_spec] etc.
>>>>>
>>>>>
>>>>> There's also a really creative use of spatial (of all things) to, say,
>>>>> answer questions involving multiple dates per record. Imagine, for
>>>>> instance, employees with different hours on different days. You can
>>>>> use spatial to answer questions like "which employees are available
>>>>> on Wednesday between 4PM and 8PM".
>>>>>
>>>>> And if none of this is relevant, how about you give us some
>>>>> use-cases? This could well be an XY problem.
>>>>>
>>>>
>>>> Hi,
>>>>
>>>> let's try this example to show the problem. You have some old text that
>>>> was written in two periods of time:
>>>>
>>>> 1.) 2nd half of 13th century: -> 1250-1299
>>>> 2.) Beginning of 18th century: -> 1700-1715
>>>>
>>>> If you are searching for texts that were written between 1300 and 1699,
>>>> then the document described above should not be hit.
>>>>
>>>> If you make the start date and end date fields multi-valued, this results in:
>>>>
>>>> start: [1250, 1700]
>>>> end: [1299, 1715]
>>>>
>>>> A search for documents written between 1300-1699 would be:
>>>>
>>>> (+start:[1300 TO 1699] +end:[1300 TO 1699]) (+start:[* TO 1300] +end:[1300
>>>> TO *]) (+start:[* TO 1699] +end:[1700 TO *])
>>>>
>>>> You see that the document above would obviously be hit by "(+start:[* TO
>>>> 1300] +end:[1300 TO *])"
>>>>
>>>
>>> This sounds exactly like the spatial use case that Erick just described.
>>>
>>> http://wiki.apache.org/solr/SpatialForTimeDurations
>>> https://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/
>>>
>>> I am not sure whether the following presentation covers time series with
>>> spatial, but it does say deep dive.  It's over an hour long, and done by
>>> David Smiley, who wrote most of the Spatial code in Solr:
>>>
>>> http://www.lucenerevolution.org/2013/Lucene-Solr4-Spatial-Deep-Dive
>>>
>>> Hopefully someone who has actually used this can hop in and give you
>>> some additional pointers.
>>>
>>> Thanks,
>>> Shawn
>>>
>>>
>>
>
> --
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland
>
