lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sol myr <solmy...@yahoo.com>
Subject Re: Newbie question: optimized files?
Date Wed, 12 Jan 2011 08:34:24 GMT
That was it - thank :)

--- On Tue, 1/11/11, Simon Willnauer <simon.willnauer@googlemail.com> wrote:

From: Simon Willnauer <simon.willnauer@googlemail.com>
Subject: Re: Newbie question: optimized files?
To: java-user@lucene.apache.org
Date: Tuesday, January 11, 2011, 12:06 AM

Hey,

this looks like you are hitting the optimization done in LUCENE-2773
(https://issues.apache.org/jira/browse/LUCENE-2773) that prevents
merged segments that are larger than 10% (by default - see
LogMergePolicy#noCFSRatio) of the index size it will be left in
non-compound format even if compound format is on.

for some background see:
http://lucene.apache.org/java/3_0_3/changes/Changes.html#3.0.3.changes_in_runtime_behavior

so its a feature not a bug :)

Simon

On Tue, Jan 11, 2011 at 7:46 AM, sol myr <solmyr72@yahoo.com> wrote:
> Hi,
>
> Continuing my question - I now suspect a bug in Lucene 3.0.3, because I ran the test
with Lucene 3.0.0 and it worked okay (no junk files)... could anyone please confirm?
>
> --- On Mon, 1/10/11, sol myr <solmyr72@yahoo.com> wrote:
>
> From: sol myr <solmyr72@yahoo.com>
> Subject: Newbie question: optimized files?
> To: java-user@lucene.apache.org
> Date: Monday, January 10, 2011, 9:56 AM
>
> Hi,
>
> I'm new to Lucene (using 3.0.3), and just started to check out the behavior of the 'optimize()'
method (which is quite important for our application).
> Could it be that 'optimize' cancels out the 'compoundFile' mode? Or am I doing something
wrong?
>
> Here's my test: I create an indexWriter with compoundFile=true, then perform some writes+commits
(which generates several 'cfs' files).
> Then I call 'optimize()'... I expected this to yield a single optimized 'cfs' file, but
instead I get lots of different files - 'fdt', 'fdx', 'fnm' etc...
>
> The detailed code:
> // Create indexWriter with 'compoundfile=true':
> Version version=Version.LUCENE_30;
> Directory dir = FSDirectory.open(new File("c:/luceneTemp"));
> Analyzer analyzer = new StandardAnalyzer(version);
> IndexWriter writer = new IndexWriter(
>          dir, analyzer, true, IndexWriter.MaxFieldLength.LIMITED);
> writer.setUseCompoundFile(true);
>
> // Perform some writes + commits.
> // This yields several 'cfs' files as expected:
> writer.addDocument(...);
> writer.commit();
> writer.addDocument(...);
>
> writer.commit();
> Thread.sleep(20000);  // give myself time to see the generated 'cfs' files
>
> // Optimize - why does it yield lots of separate files ('fdt', 'fdx', etc)?
> writer.optimize();
>
>
> Thanks.
>
>
>
>
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org




      
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message