lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: incremental indexing - efficiency
Date Wed, 23 Mar 2005 20:50:51 GMT
That sounds a little low.  I'll assume that you profiled your whole
application, which leaves room for something else slowing things down,
and I'll suggest you write a standalone application whose only task is
to index documents as quickly as possible.  Hm, this reminds me that
I've written stuff like that twice, once for O'Reilly's OnJava site:
http://www.onjava.com/lpt/a/3273 and once for Lucene in Action.  Both
contain free code, which should help you figure out if it's really
Lucene or your other code.

If you write something that's standalone I could run it on my machine
and report the results.

Oh, and check this: http://lucene.apache.org/java/docs/benchmarks.html

Otis


--- sunil goyal <sunilgoyal@gmail.com> wrote:
> Hello all,
> 
> I am trying to use Lucene for doing incremental indexing of the order
> of million of records daily using a single machine (P4 2.4Ghz 1 GB
> RAM). I do get messages updated every few minutes based on which I
> need to update the index.
> 
> I am using a StandardAnalyzer and writing documents using IndexWriter
> (FSDirectory) using the following structure:
> 
> Document document = new Document();
> document.add(Field.Keyword(INDEX_MESG_ID,
> LongField.longToString(mesg_id)));
> document.add(Field.UnStored(INDEX_MESG_TITLE, mesgTitle));
> document.add(Field.UnStored(INDEX_MESG_DESCRIPTION,mesgDescription));
> document.add(Field.Keyword(INDEX_MESG_DATE,mesgDate));
> 
> mesgTitle is string of the order of 20-50 words
> mesgDescription is a string of the order of 500-1000 words.
> 
> I profiled my application and the index writer is able to write 16
> messages per second, which is too low for the order that I want to
> achieve. I have tried various options:
> - increasing mergeFactor and minMergeDocs from 10 to 100 to 1000
> didn't help.
> - Changed indexwriter from FSDirectory to RAMDirectory and later
> synchronizing changes to FSDirectory. Even this didn't help.
> 
> It will be great if someone can give me pointers for increasing the
> efficiency of index writer process. Can someone give me practical
> examples of how much maximum efficiency can be achieved for similar
> process on a single machine?
> 
> Thanks
> 
> Regards
> Sunil
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message