lucene-java-user mailing list archives

From Karl Wettin <karl.wet...@gmail.com>
Subject Re: Lucene Partition Size
Date Fri, 09 Apr 2010 13:39:02 GMT
It's hard for me to say why this is slow.

Here are a few more questions whose answers might provide further clues:

What were the reasons that led you to partition the index this way?
What does the searcher implementation look like?
What would a typical query sent to that searcher look like?

I would start by loading the full 400 GB as a single index on local disc
and measuring response times there.
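
As a rough sanity check on the figures quoted below, here is a minimal
sketch of the arithmetic. It assumes a round 200 million documents (the
thread says "200+ mln") and takes Integer.MAX_VALUE as the per-index
document limit, which is the only limit I'm aware of. The class and
method names are made up for illustration and are not part of Lucene.

```java
public class PartitionCheck {
    // Assumption from this thread: a single Lucene index can address
    // at most Integer.MAX_VALUE documents.
    static final long MAX_DOCS_PER_INDEX = Integer.MAX_VALUE;

    // Number of partitions needed, rounded up (ceiling division).
    static long partitions(long totalDocs, long docsPerPartition) {
        return (totalDocs + docsPerPartition - 1) / docsPerPartition;
    }

    // True if the whole collection stays under the per-index document limit.
    static boolean fitsInSingleIndex(long totalDocs) {
        return totalDocs <= MAX_DOCS_PER_INDEX;
    }

    public static void main(String[] args) {
        long totalDocs = 200_000_000L;       // "200+ mln", rounded down
        long docsPerPartition = 10_000_000L; // planned partition size
        long totalIndexGb = 400;             // total index size in GB

        long n = partitions(totalDocs, docsPerPartition);
        System.out.println("partitions: " + n);
        System.out.println("index GB per partition: " + totalIndexGb / n);
        System.out.println("fits in one index: " + fitsInSingleIndex(totalDocs));
    }
}
```

By document count alone, the collection is well under the limit, so a
single index on local disc should be possible as a baseline.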


	karl

8 apr 2010 kl. 22.07 skrev Ivan Provalov:

> Karl,
>
> We have not done the same scale local-disk test.  Our network
> parameters are:
>
> -  Network speed:  1 Gbit/s
> -  3 partitions per volume
> -  The volumes are accessed via NFS (version 3) on EMC Celerra devices.
> -  The drives are 300 GB, fiber attached, 10,000 rpm.
>
> Thanks,
>
> Ivan
>
> --- On Thu, 4/8/10, Karl Wettin <karl.wettin@gmail.com> wrote:
>
>> From: Karl Wettin <karl.wettin@gmail.com>
>> Subject: Re: Lucene Partition Size
>> To: java-user@lucene.apache.org
>> Date: Thursday, April 8, 2010, 2:44 PM
>>
>> 8 apr 2010 kl. 20.05 skrev Ivan Provalov:
>>
>>> We are using Lucene for searching of 200+ mln documents
>>> (periodical publications).  Is there any limitation on the size
>>> of the Lucene index (file size, number of docs, etc...)?
>>
>> The only such limitation in Lucene I'm aware of is
>> Integer.MAX_VALUE documents. This might also be true for
>> number of terms.
>>
>>> We are partitioning the indexes at about 10 mln documents per
>>> partition (each partition is on a separate box, some mounted
>>> volumes are shared among three-four partitions).  We have index
>>> size around 10% of the content size.  The total content is around
>>> 4 TB and the index around 400 GB.  This content is planned to be
>>> split into 20 partitions (10 mln docs, 200 GB content size, 20 GB
>>> index size).  We are using a memory mapped index directory
>>> implementation.  Our testing is done with 600 concurrent users.
>>>
>>> We are seeing consistently high response times from the
>>> partitions (4-5 seconds).  Is there a number of documents per
>>> partition limitation in Lucene for this particular scenario?
>>
>> I'm not sure if I got this right but it sounds like your
>> index is mounted over network? Can you tell us some more
>> details about that? What speeds do you see if you put the
>> index on local disc?
>>
>>
>>
>>     karl
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
>
>
>



