lucene-java-user mailing list archives

From "chan kang" <>
Subject Re: Does Optimize preserve index order?
Date Wed, 29 Mar 2006 13:47:42 GMT
Thank you~
The sorting doesn't seem to take that long (not as long as I expected),
but unfortunately I didn't get to measure it this time. Maybe next time
I'll try measuring.

Now I've got another problem.
My final goal is to keep the index sorted reverse-chronologically,
so that search results are shown in reverse-chronological order
(the most recent document at the top) even without sorting at search time.
Presorting the index in chronological order is easy (just call
addDocument() for each new incoming document, then optimize), but the
reverse seems to be harder.
The way I'm handling it now is to:

1. index without ordering.
2. sort the index reverse-chronologically.
3. re-index and optimize.
4. when a new document comes in, do steps 1-3 again.

Steps 1-3 are not that different from sorting in chronological order, but
step 4 makes the process very redundant.
For example, if I want every search result list to come back sorted, with
the most recent document at the top, I have to go through steps 1-3 again
every time a new document is added (by crawling the web or whatever).
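
To make the cost of step 4 concrete, here is a minimal sketch in plain Java (no Lucene calls). It is only a model: each long is a made-up document timestamp, list position stands in for index order, and the class and method names are hypothetical. Every incoming document forces the whole set to be gathered, sorted, and re-added.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class FullRebuildSketch {
    // Model only: each long is a hypothetical document timestamp,
    // and position in the list stands in for position in the index.
    static List<Long> rebuild(List<Long> currentIndex, long incoming) {
        List<Long> all = new ArrayList<>(currentIndex); // step 1: gather every document
        all.add(incoming);
        all.sort(Comparator.reverseOrder());            // step 2: sort newest first
        return all;                                     // step 3: re-add in this order
    }

    public static void main(String[] args) {
        List<Long> index = List.of(30L, 20L, 10L);
        // Step 4: one new document still touches all existing ones.
        System.out.println(rebuild(index, 40L));
    }
}
```

The work per new document grows with the size of the whole index, which is the redundancy described above.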

So I thought, if the following were possible, it would be much easier:
1. create a new index for incoming documents
2. sort it reverse-chronologically -> index_new
3. use addIndexes() and do "index_new.addIndexes(old_index)"
4. optimize

That way, the new index is sorted, and the old index (which is much, much
larger than the incoming one) is also sorted, so the two sorted indexes
can be merged into a final sorted version without re-indexing the whole
set of documents in the original index.
However, I'm not sure whether addIndexes() also preserves order.
Does it?
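
The idea can be sketched in plain Java (no Lucene calls). This models the plan under the assumption that the merge simply keeps the documents of index_new first, followed by the documents of old_index, each in their existing order; whether addIndexes() actually behaves that way is exactly the open question. The timestamps and the names prepend and isNewestFirst are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeSketch {
    // ASSUMPTION being modeled: merging keeps the new batch first and the
    // old index second, without reordering documents inside either one.
    static List<Long> prepend(List<Long> newBatch, List<Long> oldIndex) {
        List<Long> merged = new ArrayList<>(newBatch);
        merged.addAll(oldIndex);
        return merged;
    }

    // True when timestamps never increase from one position to the next.
    static boolean isNewestFirst(List<Long> index) {
        for (int i = 1; i < index.size(); i++) {
            if (index.get(i - 1) < index.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Long> oldIndex = List.of(30L, 20L, 10L); // already newest first
        List<Long> newBatch = List.of(50L, 40L);      // sorted newest first
        List<Long> merged = prepend(newBatch, oldIndex);
        System.out.println(merged + " newestFirst=" + isNewestFirst(merged));
    }
}
```

Under that assumption the result stays sorted only if every incoming document is at least as recent as the newest document already in the old index; a late-arriving older document would still need a re-sort.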

Also, is there a better way to do this?

Thanks in advance.


2006/3/28, Yonik Seeley <>:
> On 3/24/06, chan kang <> wrote:
> > What I want to do is to show the results in
> > chronological order. (btw, the index contains the time field)
> > One solution I have thought up was:
> > 1. index the whole set
> > 2. read in all the time field values
> > 3. re-index the whole set according to time
> >    (heard that the index order is same as insertion order)
> > 4. optimize.
> >
> >
> > However, although I think the step 3 would result
> > in a sorted index, isn't there a possibility that
> > step 4 might ruin all the sortedness?
> > - Wouldn't optimizing break the order in which they
> >   are indexed?
> Index order is retained, so your plan should work fine.
> How long is sorting actually taking?  FYI, the first time you sort on
> a field will take much longer because a fieldcache entry must be
> populated.
> -Yonik
> Solr, The Open Source Lucene Search
> Server
