lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <>
Subject Re: incremental indexing - efficiency
Date Wed, 23 Mar 2005 20:50:51 GMT
That sounds a little low.  I'll assume that you profiled your whole
application, which leaves room for something else slowing things down,
and I'll suggest you write a standalone application whose only task is
to index documents as quickly as possible.  Hm, this reminds me that
I've written stuff like that twice, once for O'Reilly's OnJava site: and once for Lucene in Action.  Both
contain free code, which should help you figure out if it's really
Lucene or your other code.

If you write something that's standalone I could run it on my machine
and report the results.

Oh, and check this:


--- sunil goyal <> wrote:
> Hello all,
> I am trying to use Lucene for doing incremental indexing of the order
> of million of records daily using a single machine (P4 2.4Ghz 1 GB
> RAM). I do get messages updated every few minutes based on which I
> need to update the index.
> I am using a StandardAnalyzer and writing documents using IndexWriter
> (FSDirectory) using the following structure:
> Document document = new Document();
> document.add(Field.Keyword(INDEX_MESG_ID,
> LongField.longToString(mesg_id)));
> document.add(Field.UnStored(INDEX_MESG_TITLE, mesgTitle));
> document.add(Field.UnStored(INDEX_MESG_DESCRIPTION,mesgDescription));
> document.add(Field.Keyword(INDEX_MESG_DATE,mesgDate));
> mesgTitle is string of the order of 20-50 words
> mesgDescription is a string of the order of 500-1000 words.
> I profiled my application and the index writer is able to write 16
> messages per second, which is too low for the order that I want to
> achieve. I have tried various options:
> - increasing mergeFactor and minMergeDocs from 10 to 100 to 1000
> didn't help.
> - Changed indexwriter from FSDirectory to RAMDirectory and later
> synchronizing changes to FSDirectory. Even this didn't help.
> It will be great if someone can give me pointers for increasing the
> efficiency of index writer process. Can someone give me practical
> examples of how much maximum efficiency can be achieved for similar
> process on a single machine?
> Thanks
> Regards
> Sunil
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message