lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Merge factor problem,
Date Sat, 10 Feb 2007 15:49:27 GMT
Maybe it's Lucene.Net-specific - you are on java-user mailing list.

Otis

----- Original Message ----
From: Sairaj Sunil <sairaj.sunil@gmail.com>
To: java-user@lucene.apache.org
Sent: Saturday, February 10, 2007 10:26:01 AM
Subject: Re: Merge factor problem,

Hi,
I saw that article and it tells me that increasing the mergeFactor speeds up
the indexing. But the reverse had happened in my case.
To be more specific I had conducted some experiments for 1000 documents. The
time taken is quite large, due to pdf file indexing. I had changed the
indexwriter's parameters.

MergeFactor – default(10)
minMergeDocs – default(10)
Time taken – 690 sec

MergeFactor – 50
minMergeDocs – default(10)
Time taken – 765 sec
MergeFactor – default(10)
minMergeDocs – 100
Time taken – 670 sec

MergeFactor –100
minMergeDocs – 100
Time taken – 738 sec
Increasing the mergeFactor did not speed up, but increasing the minMergeDocs
had improved. I am using Lucene.Net.
Can you explain the behavior. I am confused

On 2/10/07, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
>
> Sairaj, see http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html
>
> Increase your maxBufferedDocs.
>
> Otis
>
> ----- Original Message ----
> From: Sairaj Sunil <sairaj.sunil@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Friday, February 9, 2007 11:14:50 AM
> Subject: Merge factor problem,
>
> Hi all,
> I have increased the merge factor from 10 to 50. I thought the indexing
> performance will be better. But the time taken taken to index is more than
> the time taken for the merge factor of 10. The documentation and some
> articles say that the time taken to index will improve if the merge factor
> is increased.
> I have changed the merge factors to 50, 100, 1000. I have left the
> minMergeDocs to be the default value for all the cases. The time taken to
> index same number of documents increased in a linear fashion, which is
> exactly opposite according to the info I have read.
> Is this the correct behavior. In which cases this behavior happens?
>
> Regards
> --
> Sairaj Sunil
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Sairaj Sunil
II Mtech(CS)
SSSIHL
Prashanthi Nilayam




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message