lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrzej Bialecki>
Subject Re: obtaining the number of documents stored in a .cfs file
Date Tue, 05 Sep 2006 16:20:23 GMT
Stanislav Jordanov wrote:
> Suppose I have a bunch of valid .cfs files while the 
> segmens/ file is missing or invalid.
> The task is to 'recover' the present .cfs files into a valid index.
> I think it will be necessary and sufficient to create a segments file 
> that references the .cfs files.
> The only problem I've encountered in generating a vaild and 
> well-formed segments file is that I need to know the number of docs in 
> each cfs file.
> So the couple of questions is:
> Do I have to put the right number of docs for each segments or any 
> (dummy) number will do?

Not sure, but I doubt anything else than a valid number would work.

> If I have to put the right number there, how do I get it having the 
> cfs file?

Look at the size of _xx.f1 file inside CFS file; this is the norms file, 
and its size in bytes is the same as the number of documents in the index.

(You can use CompoundFileReader.list() and fileLength() methods).

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message