hbase-user mailing list archives

From Imran M Yousuf <imyou...@gmail.com>
Subject Re: About test/production server configuration
Date Tue, 06 Apr 2010 00:24:03 GMT
On Tue, Apr 6, 2010 at 12:02 AM, Patrick Hunt <phunt@apache.org> wrote:
> The ZK servers are sensitive to disk (I/O) latency. I just troubleshot an
> issue last week where a user was seeing 80-second (second!) latencies.
> Turns out they were running ZK server, namenode, tasktracker, and HBase
> region server all on the same box; that box had a single spindle for all
> I/O activity and was at 100% utilization for long periods of time. If
> you want decent ZK API latencies (<100ms) you really need to ensure that
> there's at least a separate spindle available for the ZK transaction logs.
>

Great insight and info! So that means that if ZK has its own spindle,
the rest can share a single spindle, at least in your case. Nice!

Thanks for the info, noting it.
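
A rough way to check whether a candidate disk is suitable for the ZK
transaction log is to time small synchronous writes on it. The sketch
below is only an illustration under stated assumptions: it is not a
ZooKeeper tool, the function name fsync_latencies_ms is made up for
this example, and the write-then-fsync pattern only approximates how a
transaction log is flushed to disk. The <100ms figure is the target
mentioned in Patrick's message.

```python
import os
import sys
import tempfile
import time


def fsync_latencies_ms(directory, samples=20, payload=b"x" * 512):
    """Append small records to a scratch file in `directory`, calling
    fsync after each one, and return the per-write latency in ms.

    Hypothetical helper for illustration; not part of ZooKeeper.
    """
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the write down to the device
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # Directory to probe; defaults to the system temp dir (an assumption).
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    results = fsync_latencies_ms(target)
    print("directory: %s" % target)
    print("max fsync latency: %.2f ms" % max(results))
    print("avg fsync latency: %.2f ms" % (sum(results) / len(results)))
```

Run it against the directory intended for the transaction log, ideally
while the other daemons on the box are under load; a shared, saturated
spindle should show up as large maximum latencies.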

Imran

> Patrick
>
> On 04/05/2010 11:11 AM, Jonathan Gray wrote:
>>
>> Imran,
>>
>> It's impossible to give good advice on cluster size and hardware
>> configuration without some idea of the requirements.
>>
>> How much data?  How will the data be queried?  What kind of load do
>> you expect?  You are going to be doing offline batch/MapReduce,
>> online random access, as well as search all from the same nodes?
>> This can be dangerous.
>>
>> I would strongly recommend against putting Hadoop+HBase on the same
>> nodes as something like Solr, unless you have dedicated disks for
>> each.  Also, don't forget about ZooKeeper which you definitely will
>> need separate nodes/disks for if you will be co-locating so many
>> other things.
>>
>> JG
>>
>>> -----Original Message----- From: Imran M Yousuf
>>> [mailto:imyousuf@gmail.com] Sent: Monday, April 05, 2010 9:52 AM
>>> To: hbase-user@hadoop.apache.org Subject: About test/production
>>> server configuration
>>>
>>> Hi,
>>>
>>> We are a startup who have decided to use HBase purely because we
>>> want to take advantage of HDFS based reliability, redundancy,
>>> MapReduce and BigTable. For that we are thinking of going for a test
>>> environment with 5 servers and a production environment with 10
>>> servers; in both cases the Hadoop cluster will be used for HBase +
>>> MapReduce + Solr Index.
>>>
>>> Firstly, I would like some opinion on whether 10 servers is a good
>>> number for all 3 purposes or not. Secondly what kind of test
>>> environment is currently in use in different organizations.
>>> Thirdly, I would like to learn some server configuration and
>>> purchase price (with purchase location if possible).
>>>
>>> Waiting eagerly for some feedback.
>>>
>>> Thank you,
>>>
>>> -- Imran M Yousuf Entrepreneur & Software Engineer Smart IT
>>> Engineering Dhaka, Bangladesh Email: imran@smartitengineering.com
>>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/ Mobile:
>>> +880-1711402557
>



-- 
Imran M Yousuf
Entrepreneur & Software Engineer
Smart IT Engineering
Dhaka, Bangladesh
Email: imran@smartitengineering.com
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
