lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Hardware Specs Question
Date Thu, 02 Sep 2010 01:37:54 GMT
I was just reading about configuring mass computation grids: hardware
writes on 2 striped disks take 10% longer than writes on a single disk,
because you have to wait for the slower disk to finish. So, single
disks without RAID are faster.

I don't know how much SSD disks cost, but they will certainly cure the
disk i/o problem.

On Tue, Aug 31, 2010 at 1:35 AM, scott chu (朱炎詹) <scott.chu@udngroup.com> wrote:
> In our current lab project, we have already built a Chinese newspaper index
> with 18 million documents. The index size is around 51GB, so I am very
> concerned about the memory issue you guys mentioned.
>
> I also looked up the HathiTrust report on the SolrPerformanceData page:
> http://wiki.apache.org/solr/SolrPerformanceData. They said their main
> bottleneck is disk I/O, even though they have 10 shards spread over 4 servers.
>
> Can you guys give me some helpful suggestions about hardware specs & memory
> configuration for our project?
>
> Thanks in advance.
>
> Scott
>
> ----- Original Message ----- From: "Lance Norskog" <goksron@gmail.com>
> To: <solr-user@lucene.apache.org>
> Sent: Tuesday, August 31, 2010 1:01 PM
> Subject: Re: Hardware Specs Question
>
>
> There are synchronization points, which become chokepoints at some
> number of cores. I don't know where they cause Lucene to top out.
> Lucene apps are generally disk-bound, not CPU-bound, but yours will
> be CPU-bound. There are so many variables that it's really not possible to give
> any numbers.
>
> Lance
>
> On Mon, Aug 30, 2010 at 8:34 PM, Amit Nithian <anithian@gmail.com> wrote:
>>
>> Lance,
>>
>> That makes sense. I have heard about the long GC times on large heaps, but I
>> personally haven't experienced a slowdown, though that doesn't mean anything
>> either :-). Agreed that tuning the Solr caching is the way to go.
>>
>> I haven't followed all the Solr/Lucene changes, but from what I remember
>> there are synchronization points that could be a bottleneck, so adding
>> more cores won't help with this problem. Or am I completely missing something?
>>
>> Thanks again
>> Amit
>>
>> On Mon, Aug 30, 2010 at 8:28 PM, scott chu (朱炎詹)
>> <scott.chu@udngroup.com> wrote:
>>
>>> I am as curious as Amit is. Can you give an example of the garbage
>>> collection problem you mentioned?
>>>
>>> ----- Original Message ----- From: "Lance Norskog" <goksron@gmail.com>
>>> To: <solr-user@lucene.apache.org>
>>> Sent: Tuesday, August 31, 2010 9:14 AM
>>> Subject: Re: Hardware Specs Question
>>>
>>>
>>>
>>>> It generally works best to tune the Solr caches and allocate enough
>>>> RAM to run comfortably. Linux & Windows et al. have their own cache
>>>> of disk blocks. They use very good algorithms for managing this cache.
>>>> Also, they do not make long garbage collection passes.
>>>>
>>>> On Mon, Aug 30, 2010 at 5:48 PM, Amit Nithian <anithian@gmail.com>
>>>> wrote:
>>>>
>>>>> Lance,
>>>>>
>>>>> Thanks for your help. What do you mean by saying that the OS can keep
>>>>> the index in memory better than Solr? Do you mean that you should use
>>>>> another means to keep the index in memory (i.e. ramdisk)? Is there a
>>>>> generally accepted heap size/index size ratio that you follow?
>>>>>
>>>>> Thanks
>>>>> Amit
>>>>>
>>>>> On Mon, Aug 30, 2010 at 5:00 PM, Lance Norskog <goksron@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The price-performance knee for small servers is 32G RAM, 2-6 SATA
>>>>>> disks on a RAID, 8/16 cores. You can buy these servers and half-fill
>>>>>> them, leaving room for expansion.
>>>>>>
>>>>>> I have not done benchmarks about the max # of processors that can be
>>>>>> kept busy during indexing or querying, and the total numbers: QPS,
>>>>>> response time averages & variability, etc.
>>>>>>
>>>>>> If your index file size is 8G, and your Java heap is 8G, you will do
>>>>>> long garbage collection cycles. The operating system is very good at
>>>>>> keeping your index in memory, better than Solr can.
>>>>>>
>>>>>> Lance
>>>>>>
>>>>>> On Mon, Aug 30, 2010 at 4:52 PM, Amit Nithian <anithian@gmail.com>
>>>>>> wrote:
>>>>>> > Hi all,
>>>>>> >
>>>>>> > I am curious to get some opinions on at what point having more CPU
>>>>>> > cores shows diminishing returns in terms of QPS. Our index size is
>>>>>> > about 8GB and we have 16GB of RAM on a quad core 4 x 2.4 GHz AMD
>>>>>> > Opteron 2216. Currently I have the heap set to 8GB.
>>>>>> >
>>>>>> > We are looking to get more servers, both to increase capacity and
>>>>>> > because the warranty is set to expire on our old servers. Before
>>>>>> > asking for a certain spec, I was curious what others run and at what
>>>>>> > point having more cores ceases to matter. We are mainly looking at
>>>>>> > somewhere between 4-12 cores per server.
>>>>>> >
>>>>>> > Thanks!
>>>>>> > Amit
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Lance Norskog
>>>>>> goksron@gmail.com
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Lance Norskog
>>>> goksron@gmail.com
>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
>
>



-- 
Lance Norskog
goksron@gmail.com
