lucene-java-user mailing list archives

From Simeon Koptelov
Subject Re: Re: Document numbers and ids
Date Fri, 04 Feb 2005 17:24:58 GMT
> By "renumbered", it means it squeezes out holes left by deletes.  The 
> actual order does not change and thus does not affect a Sort.INDEXORDER 
> sort.
> Documents are stored in the index in the order that they were indexed - 
> nothing changes this order.  Document id's are not permanent if deletes 
> occur followed by an optimize.

Thanks for the clarification, Erik. Could you answer one more question: can I 
control the assignment of document numbers during indexing? It would be very 
handy for me to have categories of documents aligned on fixed boundaries, e.g. 
the numbers for category N start at N*10000. Obviously, there will be some 
holes in the numbering with this scheme.

Maybe I should explain why I'm asking. 
I'm searching for documents, but for most (almost all) of them I don't really 
care about their content. I only want to know one particular numeric field of 
each document (the id of the document's category). 
I also need to know how many docs were found in each category, so I can't 
index categories instead of docs. 
The result set can be pretty big (30K hits), and all of it must be handled in 
the inner loop. 
So I want to use a HitCollector and assign intervals of document numbers to 
categories of documents. That way, there's no need to actually retrieve the 
document in the inner loop.
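
To make the idea concrete, here is a rough sketch of the counting step. It 
assumes that document numbers really could be assigned in blocks of 10000 per 
category, which is exactly the open question. The class name 
IntervalCategoryCounter and its methods are made up for illustration and are 
not Lucene API; only the shape of collect(int doc, float score) follows 
HitCollector.

```java
/**
 * Sketch only: counts hits per category from document numbers alone,
 * under the hypothetical assumption that category N owns the document
 * numbers N*blockSize .. N*blockSize + blockSize - 1.
 * Nothing here calls Lucene; collect() merely mirrors the signature of
 * HitCollector.collect(int doc, float score).
 */
final class IntervalCategoryCounter {
    private final int blockSize; // e.g. 10000 document numbers per category
    private final int[] counts;  // counts[category] = number of hits seen

    IntervalCategoryCounter(int blockSize, int categoryCount) {
        if (blockSize <= 0 || categoryCount <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.blockSize = blockSize;
        this.counts = new int[categoryCount];
    }

    /** Called once per hit; no stored document is loaded here. */
    void collect(int doc, float score) {
        // Integer division maps a document number to its category block.
        // A doc outside the configured range throws
        // ArrayIndexOutOfBoundsException.
        int category = doc / blockSize;
        counts[category]++;
    }

    int count(int category) {
        return counts[category];
    }

    public static void main(String[] args) {
        IntervalCategoryCounter counter = new IntervalCategoryCounter(10000, 3);
        // Invented document numbers standing in for search hits.
        int[] matchingDocs = {0, 17, 10000, 10001, 10542, 29999};
        for (int doc : matchingDocs) {
            counter.collect(doc, 1.0f);
        }
        for (int c = 0; c < 3; c++) {
            System.out.println("category " + c + ": " + counter.count(c) + " docs");
        }
    }
}
```

The per-hit work is one division and one array increment, which is what makes 
the scheme attractive for a 30K result set. It only works if the block 
boundaries survive indexing and optimizing.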

Am I on the right track?

Mood: wondering why SQL GROUP BY works so fast.
