lucene-solr-user mailing list archives

From "David Smiley (@MITRE.org)" <DSMI...@mitre.org>
Subject RE: Modeling openinghours using multipoints
Date Mon, 10 Dec 2012 14:34:37 GMT
Maybe it would? I don't completely get your drift.  But you're talking about a user writing
a bunch of custom code to build, save, and query the bitmap, whereas working on top of existing
functionality seems to me a lot more maintainable on the user's part.
~ David

________________________________
From: Lance Norskog-2 [via Lucene] [ml-node+s472066n4025579h39@n3.nabble.com]
Sent: Sunday, December 09, 2012 6:35 PM
To: Smiley, David W.
Subject: Re: Modeling openinghours using multipoints

If these are not raw times, but quantized on-the-hour, would it be
faster to create a bit map of hours and then query across the bit
maps?
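
A minimal sketch of the bitmap idea above, assuming opening hours quantized to whole hours over a single week (168 bits, hour 0 = Monday 00:00). The function names are hypothetical and nothing here is existing Solr functionality; it only illustrates the kind of custom code such an approach would need.

```python
HOURS_PER_WEEK = 7 * 24


def span_to_mask(start_hour: int, end_hour: int) -> int:
    """Return an int with bits start_hour..end_hour-1 set (end is exclusive)."""
    if not 0 <= start_hour < end_hour <= HOURS_PER_WEEK:
        raise ValueError("span must satisfy 0 <= start < end <= 168")
    return ((1 << (end_hour - start_hour)) - 1) << start_hour


def opening_mask(spans) -> int:
    """OR together the masks of all (start_hour, end_hour) opening spans."""
    mask = 0
    for start_hour, end_hour in spans:
        mask |= span_to_mask(start_hour, end_hour)
    return mask


def open_during_some(doc_mask: int, query_mask: int) -> bool:
    """True if the business is open for at least one queried hour."""
    return (doc_mask & query_mask) != 0


def open_during_all(doc_mask: int, query_mask: int) -> bool:
    """True if the business is open for every queried hour."""
    return (doc_mask & query_mask) == query_mask


# Open Monday 9-17 and Tuesday 9-12.
shop = opening_mask([(9, 17), (24 + 9, 24 + 12)])
print(open_during_some(shop, span_to_mask(16, 19)))  # True
print(open_during_all(shop, span_to_mask(16, 19)))   # False
print(open_during_all(shop, span_to_mask(10, 12)))   # True
```

The masks would still have to be stored and compared per document by custom code, which is the maintenance cost raised in the reply at the top of this message.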

On Sun, Dec 9, 2012 at 8:06 AM, Erick Erickson <[hidden email]> wrote:

> Thanks for the discussion, I've added this to my bag of tricks, way cool!
>
> Erick
>
>
> On Sat, Dec 8, 2012 at 10:52 PM, britske <[hidden email]> wrote:
>
>> Brilliant! Got some great ideas for this. Indeed, all sorts of use cases
>> which use multiple temporal ranges could benefit.
>>
>> E.g.: another guy on Stack Overflow asked me about this some days ago. He
>> wants to model multiple temporary offers per product (free shopping for
>> Christmas, 20% discount for Black Friday, etc.). All possible with this
>> out of the box. Factor in 'offer category' in x and y as well for some
>> extra powerful querying.
>>
>> Yup, I'm enthusiastic about it, which I'm sure you can tell :)
>>
>> Thanks a lot David,
>>
>> Cheers,
>> Geert-Jan
>>
>>
>>
>> Sent from my iPhone
>>
>> On 9 Dec 2012, at 05:35, "David Smiley (@MITRE.org) [via Lucene]" <[hidden email]> wrote:
>>
>> > britske wrote
>> > That's seriously awesome!
>> >
>> > Some change in the query though:
>> > You described: "To query for a business that is open during at least some
>> > part of a given time duration"
>> > I want "To query for a business that is open during at least the entire
>> > given time duration".
>> >
>> > Feels like a small difference but probably isn't (I'm still wrapping my
>> > head around the intersect query, I must admit)
>> >
>> > So this would be a slightly different rectangle query.  Interestingly,
>> > you simply swap the location in the rectangle where you put the start and
>> > end time.  In summary:
>> >
>> > Indexed span CONTAINS query span:
>> > minX minY maxX maxY -> 0 end start *
>> >
>> > Indexed span INTERSECTS (i.e. OVERLAPS) query span:
>> > minX minY maxX maxY -> 0 start end *
>> >
>> > Indexed span WITHIN query span:
>> > minX minY maxX maxY -> start 0 * end
>> >
>> > I'm using '*' here to denote the max possible value.  At some point I
>> > may add that as a feature.
>> >
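
The three rectangle mappings summarized above can be checked by brute force. The sketch below assumes each opening span is indexed as the point (x = open, y = close), that rectangle bounds are inclusive, and that a small hypothetical MAX_T stands in for '*'. The helper names are illustrative and not part of Solr.

```python
from itertools import product

MAX_T = 12  # small hypothetical time domain; plays the role of '*'


def query_rect(predicate, start, end, max_t=MAX_T):
    """Return (minX, minY, maxX, maxY) for a query span [start, end]."""
    if predicate == "contains":    # indexed span CONTAINS query span
        return (0, end, start, max_t)
    if predicate == "intersects":  # indexed span INTERSECTS query span
        return (0, start, end, max_t)
    if predicate == "within":      # indexed span WITHIN query span
        return (start, 0, max_t, end)
    raise ValueError(predicate)


def point_in_rect(x, y, rect):
    """Inclusive point-in-rectangle test."""
    min_x, min_y, max_x, max_y = rect
    return min_x <= x <= max_x and min_y <= y <= max_y


def interval_relation(predicate, opens, closes, start, end):
    """The same predicates expressed directly on closed intervals."""
    if predicate == "contains":
        return opens <= start and end <= closes
    if predicate == "intersects":
        return opens <= end and start <= closes
    return start <= opens and closes <= end  # within


def mappings_agree():
    """Compare rectangle test and interval test for every valid span pair."""
    for predicate in ("contains", "intersects", "within"):
        for opens, closes, start, end in product(range(MAX_T + 1), repeat=4):
            if opens > closes or start > end:
                continue  # skip invalid spans
            by_rect = point_in_rect(opens, closes, query_rect(predicate, start, end))
            if by_rect != interval_relation(predicate, opens, closes, start, end):
                return False
    return True


print(mappings_agree())  # True
```

Under these assumptions, swapping start and end between minY and maxX is exactly what turns the overlap test into the containment test.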
>> > That was a fun exercise!  I give you credit for prodding me in this
>> > direction as I'm not sure if this use of spatial would have occurred to me
>> > otherwise.
>> >
>> > britske wrote
>> > Moreover, any indication on performance? Should, say, 50,000 docs with
>> > about 100-200 points each (1 or 2 open-close spans per day) be OK? (I
>> > know 'your mileage may vary' etc., but just a guesstimate :)
>> >
>> > You should have absolutely no problem.  The real clincher in your favor
>> > is the fact that you only need 9600 discrete time values (so you said), not
>> > Long.MAX_VALUE.  Using Long.MAX_VALUE would simply not be possible with the
>> > current implementation because it's using doubles, which have a 52-bit
>> > mantissa, not the 64 bits that would be required to be a complete
>> > substitute for any time/date.  Even given the 52 bits, a quad
>> > SpatialPrefixTree with maxLevels="52" would probably not perform well or
>> > might fail; not sure.  Eventually, when I have time to work on an
>> > implementation that can be based on a configurable number of grid cells
>> > (not unlike how you can configure precisionStep on the Trie numeric
>> > fields), 52 should be no problem.
>> >
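
The precision argument above can be illustrated with a few lines of arithmetic. This is only a sketch of the numeric limits involved, not of the Solr implementation: an IEEE 754 double stores 52 fraction bits plus an implicit leading bit, so integers are represented exactly only up to 2**53, while 9600 distinct values fit in far fewer bits. The helper name is hypothetical.

```python
def bits_needed(distinct_values: int) -> int:
    """Smallest number of bits that can tell `distinct_values` values apart."""
    return max(1, (distinct_values - 1).bit_length())


# 9600 quarter-hour slots fit comfortably: 2**14 = 16384 >= 9600.
print(bits_needed(9600))  # 14

# Beyond 2**53, neighbouring integers collapse onto the same double.
limit = 2 ** 53
print(float(limit - 1) == float(limit))  # False
print(float(limit) == float(limit + 1))  # True

# So values near Long.MAX_VALUE (2**63 - 1) cannot be told apart.
print(float(2 ** 63 - 1) == float(2 ** 63))  # True
```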
>> > I'll have to remember to refer back to this email on the approach if I
>> > create a field type that wraps this functionality.
>> >
>> > ~ David
>> >
>> > britske wrote
>> > Again, this looks good!
>> > Geert-Jan
>> >
>> > 2012/12/8 David Smiley (@MITRE.org) [via Lucene] <
>> > [hidden email]>
>> >
>> > > Hello again Geert-Jan!
>> > >
>> > > What you're trying to do is indeed possible with Solr 4 out of the box.
>> > > Other terminology people use for this is multi-value time duration.  This
>> > > creative solution is a pure application of spatial without the geospatial
>> > > notion -- we're not using an earth or other sphere model -- it's a flat
>> > > plane.  So no need to make reference to longitude & latitude, it's x & y.
>> > >
>> > > I would put opening time into x, and closing time into y.  To express a
>> > > point, use "x y" (x space y), and supply this as a string to your
>> > > SpatialRecursivePrefixTreeFieldType based field for indexing.  You can give
>> > > it multiple values and it will work correctly; this is one of RPT's main
>> > > features that set it apart from Solr 3 spatial.  To query for a business
>> > > that is open during at least some part of a given time duration, say 6-8
>> > > o'clock, the query would look like
>> > > openDuration:"Intersects(minX minY maxX maxY)" and put 0 for minX (always),
>> > > 6 for minY (start time), 8 for maxX (end time), and the largest possible
>> > > value for maxY.  You wouldn't actually use 6 & 8, you'd use the number of
>> > > 15 minute intervals since your epoch for this equivalent time span.
>> > >
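
A minimal sketch of the indexing and query strings described above, assuming a hypothetical epoch of 2012-12-10 00:00 and 15 minute intervals. The field name openDuration and the Intersects(minX minY maxX maxY) form come from the message; the helper names and the epoch are illustrative only.

```python
from datetime import datetime, timedelta

EPOCH = datetime(2012, 12, 10)     # hypothetical epoch, chosen for illustration
INTERVAL = timedelta(minutes=15)   # quantization unit from the thread
MAX_TIME = 9600                    # number of discrete time values mentioned in the thread


def to_interval(moment: datetime) -> int:
    """Number of whole 15 minute intervals between EPOCH and `moment`."""
    return (moment - EPOCH) // INTERVAL


def span_point(opens: datetime, closes: datetime) -> str:
    """Field value "x y" for one opening span (x = open, y = close)."""
    return f"{to_interval(opens)} {to_interval(closes)}"


def intersects_query(field: str, start: datetime, end: datetime) -> str:
    """Query for spans overlapping [start, end]: rectangle 0 start end max."""
    return f'{field}:"Intersects(0 {to_interval(start)} {to_interval(end)} {MAX_TIME})"'


day = EPOCH
# One value per opening span; a multi-valued field takes several of these.
print(span_point(day.replace(hour=9), day.replace(hour=17)))
# 36 68
print(intersects_query("openDuration", day.replace(hour=6), day.replace(hour=8)))
# openDuration:"Intersects(0 24 32 9600)"
```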
>> > > You'll need to configure the field correctly: geo="false"
>> > > worldBounds="0 0 maxTime maxTime", substituting an appropriate value for
>> > > maxTime based on your unit of time (number of 15 minute intervals you
>> > > need), and distErrPct="0" (full precision).
>> > >
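
As a rough illustration of choosing maxTime, the sketch below derives the three attribute values named above from a planning horizon in days. The 100 day horizon is an assumption picked only because it yields the 9600 intervals mentioned elsewhere in the thread; the helper name is hypothetical.

```python
INTERVALS_PER_DAY = 24 * 60 // 15  # 96 quarter-hour slots per day


def field_attributes(days: int) -> dict:
    """Attribute values for the field type, given a horizon in days."""
    max_time = days * INTERVALS_PER_DAY
    return {
        "geo": "false",                                 # flat plane, no earth model
        "worldBounds": f"0 0 {max_time} {max_time}",    # minX minY maxX maxY
        "distErrPct": "0",                              # full precision
    }


attrs = field_attributes(100)  # assumed 100 day horizon
print(" ".join(f'{name}="{value}"' for name, value in attrs.items()))
# geo="false" worldBounds="0 0 9600 9600" distErrPct="0"
```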
>> > > Let me know how this works for you.
>> > >
>> > > ~ David
>> > >  Author:
>> > > http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
>> >
>> >
>> > If you reply to this email, your message will be added to the discussion
>> below:
>> >
>> http://lucene.472066.n3.nabble.com/Modeling-openinghours-using-multipoints-tp4025336p4025434.html
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Modeling-openinghours-using-multipoints-tp4025336p4025454.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>



--
Lance Norskog
[hidden email]

