hbase-user mailing list archives

From Fernando Padilla <f...@alum.mit.edu>
Subject Re: hbase/zookeeper
Date Fri, 17 Jul 2009 15:21:59 GMT
OK, if you don't mind me stretching this simple conversation a bit more..

Say I use the medium ec2 instance.. that's about 7.5G of ram, so I have
about 6.5 total.

On any one node I would have:

+Map/Reduce Tasks?

What would your gut be for distributing the memory?

Can I run my M/R Tasks all sharing one JVM to share the same memory, or
does each Map or Reduce have its own JVM/Memory requirements?
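
To make the question concrete, here is a rough sketch of the per-node arithmetic. Only the 6.5G total and the 1GB ZooKeeper figure come from this thread; the other daemon names and every other number are placeholder guesses, and the sketch assumes each task gets its own JVM heap, which is exactly the open question above.

```python
# Rough per-node memory budget, in MB.
# Only TOTAL_MB (about 6.5G) and the ZooKeeper figure (1GB) come from the
# thread; all other entries are hypothetical placeholders, not recommendations.

def remaining_for_tasks(total_mb, fixed_mb):
    """Return the memory left for Map/Reduce tasks after the fixed daemons."""
    left = total_mb - sum(fixed_mb.values())
    if left < 0:
        raise ValueError("daemons alone exceed available memory")
    return left


def max_task_slots(left_mb, per_task_heap_mb):
    """Return how many task JVMs fit, ASSUMING one separate heap per task."""
    return left_mb // per_task_heap_mb


TOTAL_MB = 6500  # "about 6.5 total" from the message
FIXED_MB = {
    "zookeeper": 1000,     # the "safe value" of 1GB quoted in the thread
    "regionserver": 3000,  # placeholder guess
    "datanode": 1000,      # placeholder guess
    "tasktracker": 500,    # placeholder guess
}
PER_TASK_HEAP_MB = 500     # placeholder guess

left = remaining_for_tasks(TOTAL_MB, FIXED_MB)
slots = max_task_slots(left, PER_TASK_HEAP_MB)
print("left for tasks (MB):", left, "task slots:", slots)
```

Changing any placeholder shifts the result directly, so the interesting part is which of these numbers people would actually pick.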

I'm thinking between 5 and 10 nodes.  I know that this seems stingy for
what you guys are used to.. but this is my worst case or minimum 
allocation.. if need be I can plan to get more nodes and spread around 
the load (bursting on heavy days, etc).. but I don't want to plan/budget 
for a large number of nodes until we see good ROI, etc etc etc..

On 7/14/09 11:54 PM, Nitay wrote:
> Yes, Ryan's right. While we recommend running ZooKeeper on separate hosts,
> it is really only if you can afford to do so. Otherwise, choose some of your
> region server machines and run ZooKeeper alongside those.
> On Tue, Jul 14, 2009 at 10:34 PM, Ryan Rawson<ryanobjc@gmail.com>  wrote:
>> You can probably host it all on one set of machines.  You'll need the
>> large sized.
>> Let us know how EC2 works, performance might be off due to the
>> virtualization.
>> On Tue, Jul 14, 2009 at 10:32 PM, Fernando Padilla<fern@alum.mit.edu>
>> wrote:
>>> The reason I ask, is that I'm planning on setting up a small HBase cluster
>>> in ec2..
>>> having 3 to 5 instances just for zookeeper, while having only 3 to 5
>>> instances for Hbase.. it sounds lop-sided. :)
>>> Does anyone here have any experience with HBase in EC2?
>>> Ryan Rawson wrote:
>>>> I run my ZK quorum on my regionservers, but I also have 16 GB ram per
>>>> regionserver.  I used to run 1gb, and never had problems. Now with
>>>> hbase managing the quorum I have 5gb ram, and it's probably overkill
>>>> but better safe than sorry.
>>>> On Tue, Jul 14, 2009 at 6:07 PM, Nitay<nitayj@gmail.com>  wrote:
>>>>> Hi Fernando,
>>>>> It is recommended that you run ZooKeeper separate from the Region
>>>>> Servers.
>>>>> On the memory side, our use of ZooKeeper in terms of data stored is
>>>>> minimal currently. However, you definitely don't want it to swap and you
>>>>> want to be able to handle a large number of connections. A safe value
>>>>> would be something like 1GB.
>>>>> -n
>>>>> On Tue, Jul 14, 2009 at 2:58 PM, Fernando Padilla<fern@alum.mit.edu>
>>>>> wrote:
>>>>>> So.. what's the recommendation for zookeeper?
>>>>>> should I run zookeeper nodes on the same region servers?
>>>>>> should I run zookeeper nodes external to the region servers?
>>>>>> how much memory should I give zookeeper, if it's just used for hbase?
