cocoon-dev mailing list archives

From Jeremy Quinn <jer...@media.demon.co.uk>
Subject Re: luceneINdexTransformer not optimized
Date Tue, 16 Nov 2004 11:04:39 GMT
Dear Nicolas

If you provide a patch and submit it to Bugzilla (then notify me of the
bug number), I would be happy to review it.

regards Jeremy


On 15 Nov 2004, at 23:34, Nicolas Maisonneuve wrote:

> The method that updates a document (the reindexDocument method) is not
> optimized. Its current behavior is:
>
> 1- open the reader if it is not open (in fact it is always closed,
>    because of step 3)
> 2- delete the document
> 3- close the reader
> 4- open the writer
> 5- write the document to the index
> 6- close the writer
>
> (NOTE: with this behavior the merge factor is useless, because the
> method indexes only one document each time the IndexWriter is opened.)
>
> - One optimization in Lucene is to avoid opening and closing the
> IndexReader and IndexWriter many times.
>
> So I propose this simple optimization:
> 1- open the reader if it is not open
> 2- delete the document
> 3- store the Lucene document in a buffer (Stack)
>
> // flush the buffer once it holds max_buffer documents
> if ((buffer.size() % max_buffer) == 0) {
>
>    // switch to write mode
> 4-   close reader
> 5-   open writer
>    for (1 to max_buffer)  {
> 6-      write
>    }
> 7-   close writer
> }
>
>
> With this kind of method:
> 1- with a buffer of 100 documents, you divide the number of mode
> switches (write/read) by 100, and indexing is much faster
> 2- the merge factor becomes really useful, because the IndexWriter
> indexes more than one document each time it is opened
>
>
> I've developed an Index component with 2 implementations:
> 1- indexerDefault, which uses this kind of method
> 2- MultiThreadIndexer, optimized for multiple CPUs
>
> Maybe it would be interesting to integrate these components into the
> Lucene block.
>
> Nicolas Maisonneuve
>
>
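
The buffering scheme proposed in the quoted message can be sketched as
follows. This is only a minimal illustration and not the code of the
Cocoon Lucene block or of Nicolas's component: ToyIndex is a
hypothetical in-memory stand-in for the Lucene IndexReader/IndexWriter
pair that merely counts how often each is opened, and the names
BufferedReindexer, maxBuffer and flush() are assumptions made for the
example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a Lucene index: counts opens instead of doing I/O. */
class ToyIndex {
    final Map<String, String> docs = new HashMap<>();
    int readerOpens = 0;
    int writerOpens = 0;
    private boolean readerOpen = false;
    private boolean writerOpen = false;

    void openReader() {
        if (writerOpen) throw new IllegalStateException("writer still open");
        if (!readerOpen) { readerOpen = true; readerOpens++; }
    }

    void closeReader() { readerOpen = false; }

    void openWriter() {
        if (readerOpen) throw new IllegalStateException("reader still open");
        if (!writerOpen) { writerOpen = true; writerOpens++; }
    }

    void closeWriter() { writerOpen = false; }

    void delete(String id) {
        if (!readerOpen) throw new IllegalStateException("reader not open");
        docs.remove(id);
    }

    void write(String id, String body) {
        if (!writerOpen) throw new IllegalStateException("writer not open");
        docs.put(id, body);
    }
}

/** Deletes immediately, but defers writes until maxBuffer documents are pending. */
class BufferedReindexer {
    private final ToyIndex index;
    private final int maxBuffer;
    private final Deque<String[]> buffer = new ArrayDeque<>();

    BufferedReindexer(ToyIndex index, int maxBuffer) {
        this.index = index;
        this.maxBuffer = maxBuffer;
    }

    void reindexDocument(String id, String body) {
        index.openReader();                    // step 1: no-op if already open
        index.delete(id);                      // step 2: delete the old version
        buffer.push(new String[] {id, body});  // step 3: buffer the new version
        if (buffer.size() >= maxBuffer) {
            flush();
        }
    }

    /** Switches to write mode once and writes every buffered document. */
    void flush() {
        if (buffer.isEmpty()) return;
        index.closeReader();                   // step 4
        index.openWriter();                    // step 5
        while (!buffer.isEmpty()) {
            String[] doc = buffer.pollLast();  // oldest first, so the newest version wins
            index.write(doc[0], doc[1]);       // step 6
        }
        index.closeWriter();                   // step 7
    }
}
```

In this toy model, re-indexing 250 documents with maxBuffer set to 100
opens the reader and the writer 3 times each, against 250 times each
when maxBuffer is 1. The caller has to invoke flush() once at the end,
otherwise the documents still in the buffer have been deleted from the
index but never written back.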
--------------------------------------------------------

                   If email from this address is not signed
                                 IT IS NOT FROM ME

                         Always check the label, folks !!!!!
--------------------------------------------------------

