lucene-java-user mailing list archives

From James Huang <metapr...@yahoo.com>
Subject Re: Sort by relevance+distance
Date Sun, 18 Sep 2005 16:24:28 GMT
See comments below.

--- Erik Hatcher <erik@ehatchersolutions.com> wrote:

> [trimming the post a bit]
> 
> On Sep 18, 2005, at 11:51 AM, James Huang wrote:
> > The problem is quite generic, I believe. What I'd like to do is
> > similar to LIA-ch6, i.e. to find a "good Chinese Hunan-style
> > restaurant near me." I prefer Hunan-style; however, if a good
> > Hunan-style one is 12 miles away, while there is a Shanghai-style
> > one only 2 miles away, I may want to take that instead. So it's not
> > a simple multi-sorting problem; it's an empirical ordering, and the
> > parameters may need some experimentation. Thus far, I'm happy with
> > the formula I gave earlier.
> 
> The example in LIA was purely a distance sort, not blended as you
> desire.
> 
> > Separately, earlier in this thread, you also mentioned "what if 10M
> > search results?" -- that's also my concern, for both space and time.
> >
> > 1. Space-wise, the 10M Documents will be dragged into memory (in a
> > Hits, say), right?
> 
> No, that is not correct, and this is an important point about Lucene
> and its ability to scale extremely well.  Hits caches up to 200
> documents (I believe) and uses a mechanism to score single documents
> at a time and only keep the top scoring ones.
> 
> There is no problem for Lucene to search and have
> Hits with a massive size.
> 
> There are memory considerations with sorting, though - these are
> described in detail in the javadocs and a little in LIA.
> 
> > 1. How to use a compound scoring at search-time (where you
> > suggested a Query-subclass, but what/how?)
> 
> I'm going to defer to others to assist with this, or validate that
> this is the right approach in this situation.
> 
> > 2. Space concern about large search result set.
> 
> With a Query subclass, this shouldn't be a concern. With sorting
> using Lucene's Sort there are some memory concerns, but less so than
> with your own TreeSet.
> 

OK, so external sorting does not scale and has to be
ruled out!
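
To make the difference concrete for myself: as I understand Erik's
description, documents are scored one at a time and only the best few
are kept, so memory depends on how many top hits are wanted rather
than on how many documents match. Below is a minimal sketch of that
idea in plain Java. It is not Lucene's actual code; the names
TopHitsSketch, ScoredDoc and topK are made up for illustration, and a
plain float array stands in for the stream of scored documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: none of these names are Lucene classes.
final class TopHitsSketch {

    /** A document id paired with its score. */
    static final class ScoredDoc {
        final int docId;
        final float score;

        ScoredDoc(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Visits scores one document at a time and keeps only the k best,
     * so the working set is at most k entries however many documents match.
     * The array index stands in for the document id.
     */
    static List<ScoredDoc> topK(float[] scores, int k) {
        List<ScoredDoc> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Min-heap: the weakest of the current top k sits at the head.
        PriorityQueue<ScoredDoc> heap =
                new PriorityQueue<>(k, (a, b) -> Float.compare(a.score, b.score));
        for (int docId = 0; docId < scores.length; docId++) {
            if (heap.size() < k) {
                heap.add(new ScoredDoc(docId, scores[docId]));
            } else if (scores[docId] > heap.peek().score) {
                // The new document beats the weakest kept one: swap them.
                heap.poll();
                heap.add(new ScoredDoc(docId, scores[docId]));
            }
        }
        result.addAll(heap);
        // Best score first.
        result.sort((a, b) -> Float.compare(b.score, a.score));
        return result;
    }
}
```

An external TreeSet over all 10M hits, by contrast, holds every hit at
once, which is the part that does not scale.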

Now I have to find a way to customize the scoring
during search (using Hits, not a customized
HitCollector). Help is desperately needed here!
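
For reference, here is the kind of blended score I have in mind, as
plain arithmetic only. This is a sketch under assumptions: the formula
below is a stand-in, not the one I posted earlier in the thread, and
it is not wired into Lucene in any way. The names BlendedScoreSketch,
blend and halfScoreMiles are hypothetical, and the relevance values in
the note after the code are invented for the restaurant example.

```java
// Hypothetical sketch: a stand-in formula, not a Lucene API and not the
// formula from earlier in this thread.
final class BlendedScoreSketch {

    /**
     * Discounts a relevance score by distance.
     *
     * @param relevance      text relevance of the document (higher is better)
     * @param distanceMiles  distance from the searcher, in miles
     * @param halfScoreMiles tunable parameter: the distance at which the
     *                       score is cut in half
     */
    static double blend(double relevance, double distanceMiles, double halfScoreMiles) {
        if (distanceMiles < 0 || halfScoreMiles <= 0) {
            throw new IllegalArgumentException(
                    "distance must be >= 0 and halfScoreMiles must be > 0");
        }
        return relevance / (1.0 + distanceMiles / halfScoreMiles);
    }
}
```

With halfScoreMiles = 5, a Hunan-style place with relevance 1.0 at 12
miles comes out at about 0.29, while a Shanghai-style place with
relevance 0.6 at 2 miles comes out at about 0.43, so the nearer one
ranks first. The parameter is what would need experimenting with.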

Thanks in advance,
-James

> 
>      Erik
> 

