lucene-java-user mailing list archives

From <Bill.Che...@sungard.com>
Subject RE: Search result ordering
Date Wed, 29 Apr 2009 19:19:38 GMT
Thanks Erick,

Basically, the ideal ordering is an alphabetical one based on a String value that is known
at index creation.  I was just wondering if there was anything I could do at index creation
time that might help me enforce that ordering at query time (without using a Sort).  To be
honest, I haven't had to deal much w/ scoring in my work w/ Lucene.  Our app just searches
based on some set of criteria and the results returned are all pretty much equal.  However,
we want to list them alphabetically so they don't appear in a jumbled, seemingly random order.
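
A rough sketch of one thing that could be done at index creation time: because the String value is already known when the index is built, each document's alphabetical position can be computed once, up front. The class and method names below are hypothetical, the case-insensitive comparison is an assumption about what "alphabetical" means here, and this is plain Java rather than Lucene API. The resulting rank could be stored as a small numeric field, or the documents could simply be added to the index in that order. I believe document numbers follow insertion order and that Sort.INDEXORDER sorts by them, but that should be verified against 2.2.0 before relying on it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AlphabeticalRankSketch {
    /**
     * Maps each name to its 0-based position in alphabetical order.
     * Assumption: case-insensitive ordering is what is wanted.
     * If the same name occurs twice, its first position is kept.
     */
    static Map<String, Integer> rankByName(List<String> names) {
        List<String> sorted = new ArrayList<>(names); // leave the caller's list untouched
        Collections.sort(sorted, String.CASE_INSENSITIVE_ORDER);
        Map<String, Integer> rank = new HashMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            rank.putIfAbsent(sorted.get(i), i);
        }
        return rank;
    }
}
```

The rank is only valid for the set of names it was computed from, so adding documents later means recomputing it (or re-adding documents in order), which is the main cost of this approach.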
 

Your comment about boost and scoring prompted me to read up a bit on it and it got me wondering
if maybe boost could be used somehow.  E.g. once all the docs have been added to the index
and I know the order I want, I could go back and set the boost for each document accordingly.
 But maybe this is a naïve or inappropriate use of boost.
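
One reason to be wary of the boost idea: as far as I know, Lucene of this vintage folds the index-time boost into a norm that is stored in a single byte, so only a couple of hundred distinct values can survive, and the score also depends on factors other than boost. The sketch below is a toy illustration of that precision limit only. The encoder is a made-up linear quantizer, NOT Lucene's actual norm encoding, and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class BoostPrecisionSketch {
    /**
     * Toy stand-in for a one-byte encoding (not Lucene's real one):
     * scales the boost into 0..255 and keeps only that level.
     */
    static byte toyEncode(float boost, float maxBoost) {
        int level = Math.round(255f * boost / maxBoost);
        return (byte) Math.max(0, Math.min(255, level));
    }

    /**
     * Gives each of docCount documents its own boost (largest for the
     * document that should come first) and counts how many distinct
     * encoded values remain after squeezing them into one byte.
     */
    static int distinctCodes(int docCount) {
        Set<Byte> codes = new HashSet<>();
        for (int rank = 0; rank < docCount; rank++) {
            float boost = docCount - rank;
            codes.add(toyEncode(boost, docCount));
        }
        return codes.size();
    }
}
```

With a hundred documents every boost keeps its own code, but with ten thousand documents at most 256 codes are left, so many documents would share a boost and their relative order would no longer be determined by it.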

Thanks for the tip on Hits.  We're using 2.2.0 and should probably upgrade.  We're at a point
in development, though, where we want to keep new variables to a minimum.  Interestingly I
don't see much of a difference when paging through hits 1-10 vs. hits 300-310.  They all seem
to take about the same time to evaluate.  I'll try using one of the HitCollectors as you suggest
to see if it makes a difference.

regards,

--
Bill Chesky 

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Wednesday, April 29, 2009 1:46 PM
To: java-user@lucene.apache.org
Subject: Re: Search result ordering

People (including me) use Lucene to page through results all the time,
so I'm pretty sure you're OK.

So here are my answers:
(1) yes.
(2) Well, the default sort is by score so if you want some other
     ordering you have to sort.
(3) You can boost things at index time, but I don't think that's at all
     relevant. What order are you trying to enforce that you know
     enough about at index time to specify?

Do note, though, that Hits is deprecated. The problem is that
Hits was intended to be reasonable *only* when accessing the first
few documents. If you're paging far into the result set, be aware that
using Hits will re-execute the entire query every 100 (200?) results you
throw away. Think about one of the HitCollectors
(perhaps TopDocCollector) instead.
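
The idea behind a top-N collector, as opposed to walking deep into Hits, can be sketched roughly as follows: run the query once, keep only the best offset + limit matches in a bounded heap, then hand back the last limit of them as the page. This is plain Java standing in for the concept, not the Lucene HitCollector or TopDocCollector API. ScoredDoc, page, and the tie-break on lower id are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class TopNPageSketch {
    /** Hypothetical (document id, score) pair produced by a query. */
    static final class ScoredDoc {
        final int id;
        final float score;
        ScoredDoc(int id, float score) { this.id = id; this.score = score; }
    }

    /** Lowest score first; on equal scores the higher id counts as weaker. */
    static final Comparator<ScoredDoc> WEAKEST_FIRST = (a, b) -> {
        int byScore = Float.compare(a.score, b.score);
        return byScore != 0 ? byScore : Integer.compare(b.id, a.id);
    };

    /** Returns hits offset .. offset+limit-1 in best-first order. */
    static List<ScoredDoc> page(Iterable<ScoredDoc> matches, int offset, int limit) {
        int keep = offset + limit;
        // Min-heap: the weakest hit kept so far sits at the head.
        PriorityQueue<ScoredDoc> heap = new PriorityQueue<>(Math.max(1, keep), WEAKEST_FIRST);
        for (ScoredDoc doc : matches) {
            heap.add(doc);
            if (heap.size() > keep) {
                heap.poll(); // evict the weakest, so memory stays bounded by keep
            }
        }
        List<ScoredDoc> bestFirst = new ArrayList<>(heap);
        bestFirst.sort(WEAKEST_FIRST.reversed());
        int from = Math.min(offset, bestFirst.size());
        return new ArrayList<>(bestFirst.subList(from, bestFirst.size()));
    }
}
```

Because ties are broken on id, the same query over an unchanged set of matches yields the same page every time, which is the property a limit/offset wrapper needs.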

Best
Erick


On Wed, Apr 29, 2009 at 1:01 PM, <Bill.Chesky@sungard.com> wrote:

> Hello,
>
> I have a few questions about the ordering of search results:
>
> 1) Given a query, are the Documents contained in the Hits object that is
> returned by IndexSearcher.search(Query query) guaranteed to be in the
> same order from one call to the next (assuming the index has not been
> updated in the meantime)?
> 2) Assuming I don't use the IndexSearcher.search(Query query, Sort sort)
> method, is the ordering of Documents in the Hits object predictable at
> all?
> 3) Short of using the IndexSearcher.search(Query query, Sort sort), is
> there any way to influence the ordering of the Documents in the Hits
> object?  E.g. is there anything that I can do when creating and/or
> updating the index that will guarantee a certain ordering of results at
> query time?
>
> For what it's worth, we're using Lucene in conjunction w/ a relational
> database.  Our index has an 'id' field that maps to a row in a
> relational database table.  Since Lucene queries are so quick we've
> found performance to be much better if we do a pure Lucene query to find
> the docs we need then do a simple SQL query with a "where id in (...)"
> clause.   We've also wrapped this in an interface that implements a
> relational-like limit/offset functionality.  So it's important to us
> that at the very least the query returns results in the same order each
> time.  Ideally, we'd like to order the results on a String field we have
> on each document.  This actually works, however it slows things down a
> bit.  This is understandable, of course -- and I'm actually very impressed
> how quickly it does perform -- however, we're trying to squeeze as many
> cycles as possible out of it to give the best user experience.   Just
> wondering if there is anything else we might try.
>
> thanks,
>
> Bill
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
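
The two-step lookup described in the original question (a Lucene query for ids, then a SQL query with "where id in (...)") has one wrinkle worth sketching: an IN clause does not promise rows back in the order the ids were listed, so whatever order the search produced has to be reapplied afterwards. The helper below is a minimal illustration in plain Java. The class, method, and table names are hypothetical, and the JDBC call that would actually run the statement is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class IdLookupSketch {
    /**
     * Builds a parameterized statement with one placeholder per id.
     * The table name is assumed to be a trusted constant, never user input.
     */
    static String inClauseSql(String table, int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("need at least one id");
        }
        StringBuilder sql = new StringBuilder("SELECT * FROM ")
                .append(table)
                .append(" WHERE id IN (");
        for (int i = 0; i < idCount; i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        return sql.append(')').toString();
    }

    /**
     * Puts the fetched rows back into the order the search returned the ids.
     * Ids with no matching row are skipped.
     */
    static <R> List<R> inSearchOrder(List<Long> idsInSearchOrder, Map<Long, R> rowsById) {
        List<R> ordered = new ArrayList<>(idsInSearchOrder.size());
        for (Long id : idsInSearchOrder) {
            R row = rowsById.get(id);
            if (row != null) {
                ordered.add(row);
            }
        }
        return ordered;
    }
}
```

With this split, only the ids for the current limit/offset page need to go into the IN clause, and the ordering decision stays entirely on the search side.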



