hbase-user mailing list archives

From Bradford Stephens <bradfordsteph...@gmail.com>
Subject Re: Slow Inserts on EC2 Cluster
Date Wed, 01 Sep 2010 17:09:07 GMT
I think it's mostly a matter of cost-efficiency -- HBase *runs* just
fine on EC2, and is built to be in a transient environment. It's just
not always cost-effective because you have to use pricey instances.

As far as my issue -- it didn't seem to be ZK. I like Andrew's point,
I'll knock it up to bigger instances and see what's up.

-B

On Wed, Sep 1, 2010 at 10:04 AM, Jonathan Gray <jgray@facebook.com> wrote:
> While I completely agree with much of what you're saying, and am usually
> one of the first to encourage people to not use virtual machines w/ HBase,
> I know of several successful deployments of HBase on EC2.  In most
> instances there was some pain encountered, but it does work for some.
>
> I've not seen these specific issues you seem to be running into
> (periodically spiking load but no cpu or iowait).
>
> I'm not sure I know what HBase could do to operate better in these
> environments.  I'm not sure I understand exactly what is happening to RS
> and ZooKeeper when EC2 is being weird.  You can't talk to ZK because of a
> networking issue?  Have you dug into the ZK server logs to see what's up?
>
> HBase is a highly available service, we need to do heartbeating of some
> kind, so loss of network connectivity is a killer.
>
> It could also be that ZK is being starved of IO so that it cannot write
> to its transaction log and that is what is slowing it down.
>
> JG
>
>> -----Original Message-----
>> From: Matthew LeMieux [mailto:mdl@mlogiciels.com]
>> Sent: Wednesday, September 01, 2010 7:25 AM
>> To: user@hbase.apache.org
>> Subject: Re: Slow Inserts on EC2 Cluster
>>
>> I'm starting to find that EC2 is not reliable enough to support HBase.
>> I'm running into 2 things that might be related:
>>
>> 1) On idle machines that are apparently doing nothing (reports of <3%
>> CPU utilization, no I/O wait)  the load is reported as being higher
>> than the number of cores.   I don't know if attachments work on the
>> mailing list, but I attached a small image anyway to illustrate this
>> confusing thing.  (I've been using m1.large and m2.xlarge running CDH3)
>>
>> 2) Every once in a while it seems that somebody hits the pause button
>> on one of my instances, and while the CPU utilization stays low, the
>> load value spikes to a high value.  When this happens the region
>> servers decide to close up shop.  It appears to be a problem with
>> contacting zookeeper servers (who happen to stay up and running, but
>> perhaps somewhat unresponsive when Amazon decides to hit the pause
>> button).  I have extended the timeout for contacting zookeeper servers,
>> but these events continue to persist.  One such event happened 8 hours
>> ago, and I still can't get HBase back up and running.
>>
>> I've seen many comments on this list informing users that they are
>> using hardware (or virtual machines) that are simply not big enough,
>> not fast enough, or don't have enough memory.  I'd like to offer an
>> alternative point of view.  Whether or not EC2 will last is uncertain,
>> but cloud computing environments will definitely be around for a long
>> time.  What would it take to make HBase resilient enough to take
>> advantage of those environments?  Based on my experience and comments
>> on this list, it seems "HBase in the cloud" is still a rather painful
>> proposition.
>>
>> -Matthew
>
>
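
For anyone trying to confirm Matthew's first symptom (load above the core
count on a machine that reports almost no CPU use), a minimal Python sketch
of that check might look like the following. The function names and the
busy_threshold parameter are illustrative assumptions, not anything posted in
the thread; the 3.0 default only mirrors the "<3% CPU utilization" figure he
reported, and the CPU-busy percentage has to come from whatever monitoring
tool is already in use.

```python
import os


def load_exceeds_cores(load_avg, cores):
    """Return True when the load average is above the number of cores."""
    return load_avg > cores


def idle_but_loaded(load_avg, cores, cpu_busy_pct, busy_threshold=3.0):
    """Return True for the pattern described in the thread: load higher than
    the core count while CPU utilization stays under busy_threshold percent.

    busy_threshold is a hypothetical default taken from the "<3%" figure
    in the quoted message, not a recommended value.
    """
    return load_exceeds_cores(load_avg, cores) and cpu_busy_pct < busy_threshold


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print("1-minute load:", one_min, "cores:", cores)
    print("load exceeds cores:", load_exceeds_cores(one_min, cores))
```

Logging this once a minute next to the region server and ZooKeeper logs would
at least show whether the load spikes line up with the session expirations.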



-- 
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science
