Date: Tue, 11 Dec 2012 12:59:32 -0800 (PST)
From: britske
To: solr-user@lucene.apache.org
Subject: Re: modeling prices based on daterange using multipoints
In-Reply-To: <1355256055146-4026151.post@n3.nabble.com>
References: <1355234403768-4026011.post@n3.nabble.com> <1355256055146-4026151.post@n3.nabble.com>

Hi David,

Yeah, an interesting (as well as problematic, as far as implementing goes) use case indeed :)

1. You mention "there are no special caches / memory requirements inherent in this." For a given user query, this would mean searching all hotels for the requested point.x every time, right? What would be a good plugin point to build in some custom cached filter code for this (perhaps using the Solr filter cache)? As I see it, determining all hotels that have a particular point.x value is probably:
A) pretty costly to do on each user query;
B) static, and can be cached easily without a lot of memory (relatively speaking), i.e. 20.000 filters (representing all of the 20.000 different point.x values, that is, <date, duration, nr of persons, roomtype> combos) with a bitset per filter representing the ids of the hotels that have said point.x.

2. I'm not sure I explained C (sorting) well, since I believe you're talking about implementing custom code to sort multiple point.y's per hotel, correct? That's not what I need. Instead, for every user query at most 1 point ever matches, i.e. a hotel either has a price for a particular <date, duration, nr of persons, roomtype> combo (P.x) or it hasn't. Say a user queries for the combo <21 dec 2012, 3 days, 2 persons, double>. This might be encoded into a value, say 12345. Now, for the hotels that do match that query (i.e. those hotels that have a point P for which P.x=12345), I want to sort those hotels on P.y (the price for the requested P.x).

Geert-Jan

2012/12/11 David Smiley (@MITRE.org) [via Lucene] <ml-node+s472066n4026151h71@n3.nabble.com>

> Hi Britske,
> This is a very interesting question!
>
> britske wrote
> ...
> I realize the new spatial stuff in Solr 4 is no magic bullet, but I'm
> wondering if I could model multiple prices per day as multipoints, where:
>
> - date*duration*nr of persons*roomtype is modeled as point.x (discretized
> into some 20.000 values)
> - price is modeled as point.y (in dollar cents / normalized as avg price per
> day: range [0,200000], covering a max price of $2.000/day)
>
> The stuff that needs to be possible:
> A) 1 required filter on point.x (filtering on 1 particular
> <date, duration, nr of persons, roomtype> combo)
> B) an optional range query on point.y (min and/or max price filter)
> C) optional sorting on point.y (sorting on price, normal or reverse)
>
> I'm pretty certain A) and B) won't be a problem as far as functionality is
> concerned, but how about performance? I.e. would some sort of cached Solr
> filter jump in for a given <date, duration, nr of persons, roomtype> combo,
> for quick doc intersection, just as it would with multiple dynamic fields in
> my described as-is case?
>
> A & B are indeed not a problem and there are no special caches / memory
> requirements inherent in this.
>
> britske wrote
> How about C)? Is sorting on point.y possible (potentially in conjunction
> with other sorting fields used as tiebreakers, to give a stable sort)? I
> remember having read that any filter query can be used for sorting combined
> with multipoints (which would make the above work, I guess), but I would
> just like to confirm.
> ...
>
> 'C' (sorting) is the challenge. As it stands, you will have to implement
> a variation of this class:
> http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/spatial/src/java/org/apache/lucene/spatial/util/ShapeFieldCacheDistanceValueSource.java?view=markup
> Unlike this implementation, your implementation should ensure the point is
> indeed in the query shape, and it should be configured to take the smallest
> or largest 'y' as desired. Note that the cache infrastructure that this is
> built on is flaky right now -- a memory hog in multiple ways. There will
> be a Point implementation in memory for all of your indexed points, and an
> ArrayList per doc. And it's not NRT search friendly, and doesn't
> relinquish its resources (i.e. on commit) as quickly as it should. I know
> what its problems are but I have been quite busy.
>
> ~ David
> Author:
> http://www.packtpub.com/apache-solr-3-enterprise-search-server/book

--
View this message in context: http://lucene.472066.n3.nabble.com/modeling-prices-based-on-daterange-using-multipoints-tp4026011p4026169.html
Sent from the Solr - User mailing list archive at Nabble.com.
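
For illustration only, here is a minimal sketch of the scheme discussed in this thread, written in plain Python rather than against any Solr or Lucene API. It assumes what the message states: each hotel has at most one point per point.x, so a query on one encoded combo yields at most one price per hotel. The names `POINTS`, `build_index` and `search`, and all sample data, are hypothetical. Each point.x gets one cached bitset of hotel ids (the "20.000 filters" idea), and matching hotels are then range-filtered and sorted on point.y.

```python
from collections import defaultdict

# Hypothetical sample data: (hotel_id, point_x, point_y), where point_x is the
# encoded <date, duration, nr of persons, roomtype> combo and point_y is the
# price in dollar cents. Assumption: at most one point per (hotel_id, point_x).
POINTS = [
    (1, 12345, 15000),
    (2, 12345, 9900),
    (3, 12345, 22000),
    (3, 54321, 18000),
    (4, 54321, 7000),
]


def build_index(points):
    """Build one bitset per point.x (a Python int used as a bitset of hotel
    ids) plus a (hotel_id, point_x) -> price lookup table."""
    bitsets = defaultdict(int)
    prices = {}
    for hotel_id, x, y in points:
        bitsets[x] |= 1 << hotel_id  # set the bit for this hotel
        prices[(hotel_id, x)] = y
    return dict(bitsets), prices


def search(bitsets, prices, x, min_price=None, max_price=None, descending=False):
    """A) filter on point.x via the cached bitset, B) optionally filter on a
    price range, C) sort on price with hotel id as tiebreaker."""
    bits = bitsets.get(x, 0)
    hits = []
    hotel_id = 0
    while bits:
        if bits & 1:
            price = prices[(hotel_id, x)]
            above_min = min_price is None or price >= min_price
            below_max = max_price is None or price <= max_price
            if above_min and below_max:
                hits.append((hotel_id, price))
        bits >>= 1
        hotel_id += 1
    # Negate the price for a descending sort; hotel id keeps the order stable.
    hits.sort(key=lambda hit: (-hit[1] if descending else hit[1], hit[0]))
    return hits


bitsets, prices = build_index(POINTS)
print(search(bitsets, prices, 12345))
print(search(bitsets, prices, 12345, max_price=20000, descending=True))
```

Whether Solr's filter cache or a custom value source can play the role of `bitsets` and `prices` here is exactly the open question in this thread; the sketch only shows the intended semantics.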