lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: Sorting a Lucene index
Date Thu, 19 Aug 2010 13:48:25 GMT
You haven't yet told us how many documents you're talking about here, so
it's
hard to have a good idea of what solutions are. That said, I'd just try
sorting first.
The sorting cache size will be something like (sizeof(int or long)) *
(number of documents).
Measure (remember to measure the response after query warmups, the first
few will be slower because they fill up the cache) THEN fix iff there's a
problem.

And just forget the idea of inserting your documents in the correct order
<G>. You
stated that the documents come in in random order. Document IDs are assigned
internally to Lucene, and monotonically increasing. I sure don't see how you
can
reconcile those two things...

But again, just try it with sorting on the numeric field and only fix things
if you have a
problem. Lots of work has been put into making Lucene fast, by very bright
people. See
if they've already solved your problem for you...

Best
Erick.


On Thu, Aug 19, 2010 at 1:51 AM, Shelly_Singh <Shelly_Singh@infosys.com>wrote:

> Hi Anshum,
>
> I require sorted results for all my queries and the field on which I need
> sorting is fixed; so this lead to me the idea of storing in sorted order to
> avoid sorting cost with every query.
>
> Thanks and Regards,
>
> Shelly Singh
> Center For KNowledge Driven Information Systems, Infosys
> Email: shelly_singh@infosys.com
> Phone: (M) 91 992 369 7200, (VoIP)2022978622
>
> -----Original Message-----
> From: Anshum [mailto:anshumg@gmail.com]
> Sent: Wednesday, August 18, 2010 5:21 PM
> To: java-user@lucene.apache.org
> Subject: Re: Sorting a Lucene index
>
> Hi Shelly,
> The search results so returned are sorted either by relevance, index order,
> stored field, or custom order.
> As you are saying that you would not be able to maintain the index order,
>  you would have to do the sort at run time.
> Sorting on a stored field is not costly and you may use it comfortably.
> btw,
> are you facing any issues in sort time or is it a presumption?
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
> On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh <Shelly_Singh@infosys.com
> >wrote:
>
> > Hi,
> >
> > I have a Lucene index that contains a numeric field along with certain
> > other fields. The order of incoming documents is random and
> un-predictable.
> > As a result, while creating an index, I end up adding docs in random
> order
> > with respect to the numeric field value.
> >
> > For example, documents may be added in following order:
> > 12,y,d
> > 100,o,p
> > 1,x,y
> > 23,u,i
> > 31,v,m
> > 22,b,m
> > 109,k,l
> >
> > My requirement is that at search time, I want the documents in order of
> the
> > numeric field.
> > One, option is to do a score/sort on the numeric field.
> > But, this may be a costly operation.
> >
> > Hence, I am trying to find if there is some way, such that, my stored
> index
> > is sorted by itself.
> >
> > Please help.
> >
> > Thanks and Regards,
> >
> > Shelly Singh
> > Center For KNowledge Driven Information Systems, Infosys
> > Email: shelly_singh@infosys.com<mailto:shelly_singh@infosys.com>
> > Phone: (M) 91 992 369 7200, (VoIP)2022978622
> >
> >
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message