lucene-solr-user mailing list archives

From "David Smiley (@MITRE.org)" <DSMI...@mitre.org>
Subject Re: Modeling openinghours using multipoints
Date Sun, 09 Dec 2012 04:35:22 GMT
britske wrote
> That's seriously awesome!
> 
> Some change in the query though:
> You described: "To query for a business that is open during at least some
> part of a given time duration"
> I want "To query for a business that is open during at least the entire
> given time duration".
> 
> Feels like a small difference but probably isn't (I'm still wrapping my
> head around the intersect query, I must admit)

So this would be a slightly different rectangle query.  Interestingly, you
simply swap the location in the rectangle where you put the start and end
time.  In summary:

Indexed span CONTAINS query span:
minX minY maxX maxY -> 0 end start *

Indexed span INTERSECTS (i.e. OVERLAPS) query span:
minX minY maxX maxY -> 0 start end *

Indexed span WITHIN query span:
minX minY maxX maxY -> start 0 * end

I'm using '*' here to denote the max possible value.  At some point I may
add that as a feature.
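
As a rough illustration of the three rectangles above, here is a minimal Python sketch (not part of Solr). It assumes points are indexed as "open close", uses 9600 as a stand-in for '*', and reuses the openDuration field name from earlier in this thread. The helper names span_rect, to_query and matches are hypothetical; matches is only a plain-Python stand-in for what the spatial index would test.

```python
MAX_TIME = 9600  # assumed stand-in for '*', the largest indexable time value


def span_rect(predicate, start, end, max_time=MAX_TIME):
    """Return (minX, minY, maxX, maxY) for a query span [start, end]."""
    if predicate == "contains":      # indexed span covers the whole query span
        return (0, end, start, max_time)
    if predicate == "intersects":    # indexed span overlaps the query span
        return (0, start, end, max_time)
    if predicate == "within":        # indexed span lies inside the query span
        return (start, 0, max_time, end)
    raise ValueError(f"unknown predicate: {predicate}")


def to_query(rect, field="openDuration"):
    """Format the rectangle using the query syntax quoted in this thread."""
    return f'{field}:"Intersects({rect[0]} {rect[1]} {rect[2]} {rect[3]})"'


def matches(rect, open_t, close_t):
    """Plain-Python check: is the point (open_t, close_t) inside rect?

    Boundaries are treated as inclusive here, which is an assumption.
    """
    min_x, min_y, max_x, max_y = rect
    return min_x <= open_t <= max_x and min_y <= close_t <= max_y


if __name__ == "__main__":
    # Query span 24..32, i.e. 6-8 o'clock in 15 minute intervals.
    for predicate in ("contains", "intersects", "within"):
        print(predicate, to_query(span_rect(predicate, 24, 32)))
```

With a query span of 24..32, a business open 20..40 matches the "contains" rectangle, while one open 26..30 matches only "intersects" and "within".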

That was a fun exercise!  I give you credit for prodding me in this direction
as I'm not sure if this use of spatial would have occurred to me otherwise.


britske wrote
> Moreover, any indication on performance? Should, say, 50,000 docs with
> about 100-200 points each (1 to 2 open-close spans per day) be ok? (I know
> 'your mileage may vary' etc. but just a guesstimate :)

You should have absolutely no problem.  The real clincher in your favor is
the fact that you only need 9600 discrete time values (so you said), not
Long.MAX_VALUE.  Using Long.MAX_VALUE would simply not be possible with the
current implementation because it uses doubles, which have 52 bits of
mantissa precision, not the 64 that would be required to be a complete
substitute for any time/date.  Even given the 52 bits, a quad
SpatialPrefixTree with maxLevels="52" would probably not perform well or
might fail; not sure.
Eventually when I have time to work on an implementation that can be based
on a configurable number of grid cells (not unlike how you can configure
precisionStep on the Trie numeric fields), 52 should be no problem.
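
To put rough numbers on the precision argument above, here is a small Python sketch. It relies only on IEEE 754 double behaviour (Python floats are doubles). The helper bits_needed is hypothetical, and the idea that a quad tree needs roughly one level per bit of resolution per axis is a simplifying assumption, not a statement about the actual SpatialPrefixTree code.

```python
# Doubles store 52 explicit mantissa bits, so integers are only
# guaranteed to be exactly representable up to 2**53.
MAX_EXACT_INT = 2 ** 53


def bits_needed(n_values):
    """Bits needed to tell apart n_values discrete values (0 .. n_values - 1)."""
    return max(1, (n_values - 1).bit_length())


if __name__ == "__main__":
    # Above 2**53, neighbouring integers collapse onto the same double.
    print(float(MAX_EXACT_INT) == float(MAX_EXACT_INT + 1))  # True

    # Long.MAX_VALUE (2**63 - 1) cannot be stored exactly; it rounds to 2**63.
    print(float(2 ** 63 - 1) == float(2 ** 63))              # True

    # 9600 discrete time values need far fewer bits than a double offers.
    print(bits_needed(9600))                                 # 14
```

Under that assumption, 9600 values per axis would call for on the order of 14 levels rather than 52.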

I'll have to remember to refer back to this email on the approach if I
create a field type that wraps this functionality.

~ David


britske wrote
> Again, this looks good!
> Geert-Jan
> 
> 2012/12/8 David Smiley (@MITRE.org) [via Lucene] <ml-node+s472066n4025359h19@.nabble>
> 
>> Hello again Geert-Jan!
>>
>> What you're trying to do is indeed possible with Solr 4 out of the box.
>> Other terminology people use for this is multi-value time duration.  This
>> creative solution is a pure application of spatial without the geospatial
>> notion -- we're not using an earth or other sphere model -- it's a flat
>> plane.  So no need to make reference to longitude & latitude, it's x & y.
>>
>> I would put opening time into x, and closing time into y.  To express a
>> point, use "x y" (x space y), and supply this as a string to your
>> SpatialRecursivePrefixTreeFieldType based field for indexing.  You can give
>> it multiple values and it will work correctly; this is one of RPT's main
>> features that set it apart from Solr 3 spatial.  To query for a business
>> that is open during at least some part of a given time duration, say 6-8
>> o'clock, the query would look like
>> openDuration:"Intersects(minX minY maxX maxY)" and put 0 for minX (always),
>> 6 for minY (start time), 8 for maxX (end time), and the largest possible
>> value for maxY.  You wouldn't actually use 6 & 8, you'd use the number of
>> 15 minute intervals since your epoch for this equivalent time span.
>>
>> You'll need to configure the field correctly: geo="false"
>> worldBounds="0 0 maxTime maxTime" substituting an appropriate value for
>> maxTime based on your unit of time (number of 15 minute intervals you
>> need) and distErrPct="0" (full precision).
>>
>> Let me know how this works for you.
>>
>> ~ David
>>  Author:
>> http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
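
The quoted message describes indexing each open/close span as an "x y" point measured in 15 minute intervals since an epoch. A minimal Python sketch of that conversion follows. The epoch date and the helper names to_interval and span_to_point are hypothetical, chosen only for illustration; any epoch works as long as indexing and querying use the same one.

```python
from datetime import datetime, timedelta

EPOCH = datetime(2012, 12, 1)      # hypothetical epoch, pick your own
INTERVAL = timedelta(minutes=15)   # unit of time suggested in the thread


def to_interval(moment, epoch=EPOCH):
    """Number of whole 15 minute intervals between epoch and moment."""
    return (moment - epoch) // INTERVAL


def span_to_point(opens, closes, epoch=EPOCH):
    """Format one open/close span as the "x y" string to index."""
    return f"{to_interval(opens, epoch)} {to_interval(closes, epoch)}"


if __name__ == "__main__":
    # Open from 06:00 to 08:00 on the epoch day.
    print(span_to_point(datetime(2012, 12, 1, 6, 0),
                        datetime(2012, 12, 1, 8, 0)))
```

A multi-valued field would receive one such string per open/close span.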





-----
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/Modeling-openinghours-using-multipoints-tp4025336p4025434.html