lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Goetzke" <uwe.goet...@healy-hudson.com>
Subject AW: How can I merge .cfx and .cfs into a single cfs file?
Date Wed, 05 May 2010 07:57:17 GMT
Index all into a directory and determine the size of all files in it.

>From http://lucene.apache.org/java/3_0_1/fileformats.html 
Starting with Lucene 2.3, doc store files (stored field values and term vectors) can be shared
in a single set of files for more than one segment. When compound file is enabled, these shared
files will be added into a single compound file (same format as above) but with the extension
.cfx.

In addition to
Compound File  	.cfs  	An optional "virtual" file consisting of all the other index files
for systems that frequently run out of file handles.

Uwe


-----Ursprüngliche Nachricht-----
Von: 张志田 [mailto:zhitian.zhang@dianping.com] 
Gesendet: Mittwoch, 5. Mai 2010 08:24
An: java-user@lucene.apache.org
Betreff: How can I merge .cfx and .cfs into a single cfs file?

Hi all,

I have an index task which will index thousands of records with lucene 3.0.1. My confusion
is lucene will always create a .cfx and a .cfs file in the file system, sometimes more, while
I thought it should create a single .cfs file if I optimize the index data. Is it by design?
If yes, is there any way/configuration I can do to merge all of the index files into a singe
one?

By the way, I have a logic to validate the index data, if the size of .cfs increases dramatically
comparing to the file generated last time, there may be something wrong, a warning message
will be threw. This is the reason that I want to generate a single .cfs file. Any other suggestion
about the index validation?

Any body can give me a hand?

Thanks in advance.

Garry

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message