cocoon-dev mailing list archives

From Nicolas Maisonneuve <n.maisonne...@gmail.com>
Subject Re: luceneINdexTransformer not optimized
Date Tue, 16 Nov 2004 23:03:06 GMT
see http://issues.apache.org/bugzilla/show_bug.cgi?id=32263


On Tue, 16 Nov 2004 11:04:39 +0000, Jeremy Quinn
<jeremy@media.demon.co.uk> wrote:
> Dear Nicolas
> 
> If you were to provide a patch and send it to bugzilla (then notify me
> of the bug #) I would be happy to review it.
> 
> regards Jeremy
> 
> 
> 
> 
> On 15 Nov 2004, at 23:34, Nicolas Maisonneuve wrote:
> 
> > The method that updates a document (the reindexDocument method) is not
> > optimized. Its current behavior is:
> >
> > 1- open the reader if it is not open (in practice it is always closed,
> > because of step 3)
> > 2- delete the document
> > 3- close the reader
> > 4- open the writer
> > 5- write the document to the index
> > 6- close the writer
> >
> > (NOTE: with this behavior, the merge factor is useless, because this
> > method indexes only one document each time the IndexWriter is opened)
> >
> > - One optimization in Lucene is to avoid opening and closing the
> > IndexReader and IndexWriter many times.
> >
> > So I propose this simple optimization:
> > 1- open the reader if it is not open
> > 2- delete the document
> > 3- store the Lucene document in a buffer (Stack)
> >
> > // flush the buffer once it is full
> > if (buffer.size() == max_buffer) {
> >
> >    // switch to write mode
> > 4-   close reader
> > 5-   open writer
> >    for (each document in buffer)  {
> > 6-      write
> >    }
> > 7-   close writer
> >    // empty the buffer so the next batch starts fresh
> > }
> >
> >
> > With this approach:
> > 1- with a buffer of 100 documents, the number of switches between
> > read mode and write mode is divided by 100, and indexing is much faster
> > 2- the merge factor becomes really useful, because the IndexWriter
> > indexes more than one document each time it is opened
> >
> >
> > I have developed an Index component with 2 implementations:
> > 1- indexerDefault, which uses this kind of method
> > 2- MultiThreadIndexer, optimized for multiple CPUs
> >
> > Maybe it would be interesting to integrate these components into the
> > Lucene block.
> >
> > Nicolas Maisonneuve
> >
> >
> --------------------------------------------------------
> 
>                    If email from this address is not signed
>                                  IT IS NOT FROM ME
> 
>                          Always check the label, folks !!!!!
> --------------------------------------------------------
> 
> 
>
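
The buffered update scheme proposed in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch attached to the bug report: `IndexBackend`, `CountingBackend` and `BufferedIndexer` are hypothetical names, and the backend interface is a stand-in for the reader/writer operations rather than the Lucene API. Two details are additions not found in the original proposal: the buffer is keyed by document id, so a document updated twice within one batch is written only once, and an explicit `flush()` is assumed to be called at the end to write any partially filled buffer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the index operations; these are NOT Lucene APIs. */
interface IndexBackend {
    void openReader();
    void deleteDocument(String id);
    void closeReader();
    void openWriter();
    void addDocument(String id, String content);
    void closeWriter();
}

/** Hypothetical backend that only counts calls, to make mode switches visible. */
class CountingBackend implements IndexBackend {
    int readerOpens, writerOpens, deletes, adds;
    public void openReader() { readerOpens++; }
    public void deleteDocument(String id) { deletes++; }
    public void closeReader() { }
    public void openWriter() { writerOpens++; }
    public void addDocument(String id, String content) { adds++; }
    public void closeWriter() { }
}

/** Deletes immediately in read mode, but defers writes until the buffer is full. */
class BufferedIndexer {
    private final IndexBackend backend;
    private final int maxBuffer;
    // Assumption: keyed by id so a repeated update replaces the buffered copy.
    private final Map<String, String> pending = new LinkedHashMap<>();
    private boolean readerOpen = false;

    BufferedIndexer(IndexBackend backend, int maxBuffer) {
        if (maxBuffer < 1) {
            throw new IllegalArgumentException("maxBuffer must be >= 1");
        }
        this.backend = backend;
        this.maxBuffer = maxBuffer;
    }

    void reindexDocument(String id, String content) {
        // Steps 1-3: open the reader if needed, delete the old copy, buffer the new one.
        if (!readerOpen) {
            backend.openReader();
            readerOpen = true;
        }
        backend.deleteDocument(id);
        pending.put(id, content);
        if (pending.size() >= maxBuffer) {
            flush();
        }
    }

    void flush() {
        // Steps 4-7: switch to write mode once and write the whole batch.
        if (readerOpen) {
            backend.closeReader();
            readerOpen = false;
        }
        if (pending.isEmpty()) {
            return;
        }
        backend.openWriter();
        try {
            for (Map.Entry<String, String> e : pending.entrySet()) {
                backend.addDocument(e.getKey(), e.getValue());
            }
        } finally {
            backend.closeWriter();
        }
        pending.clear();
    }
}
```

With 250 distinct documents and a buffer of 100, this sketch opens the writer three times (two full batches plus the final flush) instead of 250 times, which is the reduction in mode switches the proposal is aiming for.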
