lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Sort by relevance+distance
Date Sun, 18 Sep 2005 16:04:56 GMT
[trimming the post a bit]

On Sep 18, 2005, at 11:51 AM, James Huang wrote:
> The problem is quite generic, I believe. What I like
> to do is similar to LIA-ch6, i.e. to find a "good
> Chinese Hunan-style restaurant near me." I prefer
> Hunan-style; however, if a good Hunan-style one is 12
> miles, where there is a Shanghai-style only 2 miles, I
> may want to take that instead. So it's not a simple
> multi-sorting problem, it's an empirical ordering and
> the parameters may have to be experimented. Thus far,
> I'm happy with that formula I gave earlier.

The example in LIA was purely a distance sort, not blended as you  
desire.
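
To make "blended" concrete, here is a minimal sketch in plain Java of one way to fold distance into a relevance score. The formula, the class name BlendedScore, and the halfDistanceMiles parameter are all assumptions made up for illustration; this is not the formula James gave earlier in the thread, and it is not a Lucene API.

```java
// Hypothetical illustration only: one possible way to blend relevance with distance.
final class BlendedScore {
    /**
     * Discounts a relevance score by distance.
     *
     * @param relevance         text relevance, higher is better (assumed non-negative)
     * @param distanceMiles     distance from the user, in miles
     * @param halfDistanceMiles assumed tuning knob: the distance at which the
     *                          score drops to half of the raw relevance
     */
    static double blend(double relevance, double distanceMiles, double halfDistanceMiles) {
        return relevance / (1.0 + distanceMiles / halfDistanceMiles);
    }
}
```

With made-up inputs and halfDistanceMiles = 5, a Hunan-style match with relevance 1.0 at 12 miles scores about 0.29, while a Shanghai-style match with relevance 0.6 at 2 miles scores about 0.43, so the closer restaurant ranks first. The knob is exactly the kind of parameter that would need experimenting.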

> Separately, earlier in this thread, you also mentioned
> "what if 10M search results?" -- that's also my
> concern, for both space and time.
>
> 1. Space-wise, the 10M Documents will be dragged into
> memory (in a Hits, say), right?

No, that is not correct, and this is an important point about Lucene
and its ability to scale extremely well.  Hits caches up to 200
documents (I believe) and uses a mechanism that scores documents one
at a time and keeps only the top-scoring ones.
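
The "score one at a time, keep only the best" idea can be sketched in plain Java with a bounded min-heap. This is an illustration under assumptions, not Lucene's actual Hits code: the names TopDocsSketch, ScoredDoc, and topN are hypothetical, and the scores arrive as a simple array rather than from a real scorer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; not Lucene's Hits or its internal collector.
final class TopDocsSketch {
    static final class ScoredDoc {
        final int doc;      // document number
        final float score;  // score assigned to that document
        ScoredDoc(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Returns the n best-scoring docs, best first, holding at most n entries at once. */
    static List<ScoredDoc> topN(float[] scoreByDoc, int n) {
        List<ScoredDoc> best = new ArrayList<>();
        if (n <= 0) return best;
        Comparator<ScoredDoc> byScore = Comparator.comparingDouble((ScoredDoc d) -> d.score);
        // Min-heap: the weakest of the current top n sits at the head.
        PriorityQueue<ScoredDoc> heap = new PriorityQueue<>(n, byScore);
        for (int doc = 0; doc < scoreByDoc.length; doc++) {
            float score = scoreByDoc[doc];
            if (heap.size() < n) {
                heap.add(new ScoredDoc(doc, score));
            } else if (score > heap.peek().score) {
                heap.poll();                          // evict the current weakest
                heap.add(new ScoredDoc(doc, score));
            }
        }
        best.addAll(heap);
        best.sort(byScore.reversed());                // highest score first
        return best;
    }
}
```

Memory use here grows with n, not with the number of matching documents, which is why a result set of 10M matches need not mean 10M objects in memory.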

Lucene has no problem running a search whose Hits has a massive
size.

There are memory considerations with sorting, though - these are  
described in detail in the javadocs and a little in LIA.

> 1. How to use a compound scoring at search-time (where
> you suggested a Query-subclass, but what/how?)

I'm going to defer to others to assist with this, or validate that  
this is the right approach in this situation.

> 2. Space concern about large search result set.

With a Query subclass, this shouldn't be a concern.  With sorting  
using Lucene's Sort there are some memory concerns, but less so than  
with your own TreeSet.

> P.S. Feel free to reply to the list, if you think this
> has general appeal and others may benefit.

Done!

     Erik

