lucene-java-user mailing list archives

From Ivan Vasilev <>
Subject Re: feedback: Indexing speed improvement lucene 2.2->2.3.1
Date Fri, 21 Mar 2008 15:25:04 GMT
Hi Uwe,

Could you tell us which Analyzer you used when you measured such a big indexing
speed improvement? If you use StandardAnalyzer (which uses StandardTokenizer), that
may be the reason. See the second-to-last message in the thread "Indexing
Speed: 2.3 vs 2.2 (real world numbers)". According to its author, Jake
Mannix, this is because StandardTokenizer now uses StandardTokenizerImpl,
which is generated by JFlex instead of JavaCC.
I am asking because I noticed a great speedup in adding documents to the
index in our system. We measure the time for this in debug mode. NOW
But at the same time, the total indexing process in our case has improved
by only about 8%. As our system is very big and complex, I am
wondering whether the whole indexing process really has become so
remarkably faster and our own system causes the slowdown, or whether Lucene
does some optimizations on the index, merges, or something else, and this is
the reason the total indexing process is not comparably faster.
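
One way to narrow this down is to time each phase separately rather than only the document adds. The following is a minimal sketch in plain Java, not Lucene code: the class name PhaseTimer, its methods, and the phase labels are all hypothetical, and the durations in main are placeholders rather than measurements.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Accumulates wall-clock time per named phase. All names here are illustrative. */
class PhaseTimer {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the task and adds its elapsed time to the given phase. */
    void time(String phase, Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            record(phase, System.nanoTime() - start);
        }
    }

    /** Adds an already measured duration (in nanoseconds) to a phase. */
    void record(String phase, long nanos) {
        nanosByPhase.merge(phase, nanos, Long::sum);
    }

    /** Sum of all recorded durations. */
    long totalNanos() {
        long total = 0L;
        for (long nanos : nanosByPhase.values()) {
            total += nanos;
        }
        return total;
    }

    /** Fraction of the total spent in one phase; 0.0 if nothing was recorded for it. */
    double share(String phase) {
        long total = totalNanos();
        Long nanos = nanosByPhase.get(phase);
        return (total == 0L || nanos == null) ? 0.0 : (double) nanos / total;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // Placeholder durations, not measurements from any real system.
        timer.record("addDocument", 200L);
        timer.record("merge/optimize", 300L);
        timer.record("application code", 500L);
        System.out.printf("addDocument share of total: %.2f%n", timer.share("addDocument"));
    }
}
```

In a real run, the calls to IndexWriter.addDocument, to optimize() and close(), and to the surrounding application code would each be wrapped in time(...) under its own label. If the addDocument share turns out to be small, even a large speedup there would move the total only slightly.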

Best Regards,

Uwe Goetzke wrote:
> This week I switched the lucene library version on one customer system.
> The indexing time went down from 46m32s to 16m20s for the complete task
> including optimisation. Great job!
> We index product catalogs from several suppliers; in this case around
> 56,000 product groups and 360,000 products including descriptions were
> indexed.
> Regards
> Uwe
> -----------------------------------------------------------------------
> Healy Hudson GmbH - D-55252 Mainz Kastel
> Geschaftsfuhrer Christian Konhauser - Amtsgericht Wiesbaden HRB 12076
