lucene-java-user mailing list archives

From Charlie Hubbard <charlie.hubb...@gmail.com>
Subject Re: Help running out of files
Date Mon, 02 Jan 2012 16:38:48 GMT
I'm beginning to think there is an issue with 3.1 that's causing this.
After looking over my code again, I realized I had forgotten that the
mechanism that does the indexing hasn't changed, and the index IS being
closed between cycles.  This is true even when using push instead of pull.
This code used to work on Lucene 2.x, but I had to upgrade it.  It had been
very stable under 2.x, but after upgrading to 3.1 I've started seeing this
problem.  I double-checked the code doing the indexing, and it hasn't
changed since I upgraded to 3.1.  So the constant in this equation is
mostly my code.  What's different is 3.1.  Furthermore, when new documents
are pulled in through the old mechanism, the open file count continues to
rise.  Over a 24-hour period it has grown by 296 files, with only 10 or 12
documents indexed.
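
One way to track the open file count from inside the indexing process itself, rather than with an external tool, is to ask the JVM. The following is only a minimal sketch: the class and method names (OpenFileProbe, openFileCount) are hypothetical, and it assumes a HotSpot/OpenJDK-style JVM on a Unix-like system, where the com.sun.management.UnixOperatingSystemMXBean interface is available. On other platforms it returns -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of Lucene: reports how many file
// descriptors the current JVM process has open.
class OpenFileProbe {

    /**
     * Returns the number of open file descriptors for this JVM,
     * or -1 if the platform does not expose that figure.
     */
    static long openFileCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Assumption: the Unix-specific bean is only present on
        // Unix-like systems running a HotSpot/OpenJDK-style JVM.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Log this before and after each indexing cycle to see whether
        // the count drops back once the index is closed.
        System.out.println("open file descriptors: " + openFileCount());
    }
}
```

Logging this value before and after each indexing cycle would show whether the growth is tied to the indexing cycles or to something else in the process, such as searchers that are never closed.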

So is this a known issue?  Should I upgrade to a newer version to fix this?

Thanks
Charlie

On Sat, Dec 31, 2011 at 1:01 AM, Charlie Hubbard
<charlie.hubbard@gmail.com> wrote:

> I have a program I recently converted from a pull scheme to a push scheme.
> Previously I was pulling down the documents I was indexing, and when I
> was done I'd close the IndexWriter at the end of each iteration.  Now that
> I've converted to a push scheme, I'm sent the documents to index, and I
> write them.  However, this means I'm not closing the IndexWriter, since
> closing after every document would perform poorly.  Instead I'm
> keeping the IndexWriter open all the time.  The problem is that after a
> while the number of open files keeps rising.  I've set the following
> parameters on the IndexWriter:
>
> merge.factor=10
> max.buffered.docs=1000
>
> After going over the API docs I thought this would mean it'd never create
> more than 10 files before merging those files into a single file, but it's
> creating hundreds of files.  Since I'm not closing the IndexWriter, will it
> merge the files?  From reading the API docs it sounded like merging happens
> regardless of flushing, commit, or close.  Is that true?  I've measured the
> files that are increasing, and they're files associated with this one index
> I'm leaving open.  I have another index that I do close periodically, and
> it's not growing like this one.
>
> I've read some posts about using commit() instead of close() in situations
> like this because it's faster.  However, commit() just flushes
> to disk rather than flushing and optimizing like close().  I'm not sure
> whether commit() is what I need.  Any suggestions?
>
> Thanks
> Charlie
>
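
The expectation in the quoted message that merge.factor=10 caps the index at 10 files can be checked against a toy model. The sketch below is not Lucene code and does not reproduce Lucene's actual merge policy. It assumes a simplified level-based scheme in which every flush writes one new segment, and whenever mergeFactor segments pile up at one level they are merged into a single segment at the next level. The names MergeModel and segmentsAfter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of level-based merging (an assumption, not Lucene's code).
class MergeModel {

    /** Number of live segments after the given number of flushes. */
    static int segmentsAfter(int flushes, int mergeFactor) {
        // perLevel.get(i) = number of segments currently at level i
        List<Integer> perLevel = new ArrayList<>();
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            while (true) {
                if (perLevel.size() <= level) {
                    perLevel.add(0);
                }
                int count = perLevel.get(level) + 1;
                if (count < mergeFactor) {
                    perLevel.set(level, count);
                    break;
                }
                // mergeFactor segments at this level: merge them into
                // one segment and carry it up to the next level.
                perLevel.set(level, 0);
                level++;
            }
        }
        int total = 0;
        for (int count : perLevel) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(segmentsAfter(9, 10));   // 9
        System.out.println(segmentsAfter(10, 10));  // 1
        System.out.println(segmentsAfter(99, 10));  // 18
    }
}
```

Under this model the merge factor bounds the number of segments per level, not the total: up to 9 segments can sit at each level, and each segment is stored as one or more files on disk. A file count above 10 is therefore not by itself evidence that merging has stopped, although it does not explain growth that continues while almost nothing is being indexed.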
