hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: HBase not scaling well
Date Fri, 29 Oct 2010 15:03:07 GMT

I'd actually take a step back and ask what Hari is trying to do.

It's difficult to figure out what the problem is when the OP says "I've got code that works
in pseudo-distributed mode on an individual machine, but not on an actual cluster."
It would be nice to know the version(s) and configuration. With 3 nodes, are they running ZK on the
same machines as the Region Servers? Are they swapping? 8GB of memory can
disappear quickly...

Lots of questions...
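
On the timing suggestion in the quoted message below, here is a minimal sketch of a per-phase timer that a map task could use to see where the time goes. It is only an illustration: the class name OpTimer, the phase names "parse" and "put", and the stand-in work inside main are all hypothetical and are not taken from Hari's job. In a real mapper the same record() calls would wrap the actual parsing and the actual HBase write.

```java
import java.util.concurrent.TimeUnit;

/** Accumulates elapsed time and call count for one kind of operation. */
public class OpTimer {
    private final String name;
    private long totalNanos;
    private long count;

    public OpTimer(String name) {
        this.name = name;
    }

    /** Record one operation that took the given number of nanoseconds. */
    public void record(long elapsedNanos) {
        totalNanos += elapsedNanos;
        count++;
    }

    public long count() {
        return count;
    }

    /** Average time per operation in milliseconds, or 0 if nothing was recorded. */
    public double averageMillis() {
        return count == 0 ? 0.0 : (totalNanos / 1e6) / count;
    }

    @Override
    public String toString() {
        return String.format("%s: n=%d avg=%.3f ms total=%d ms",
                name, count, averageMillis(), TimeUnit.NANOSECONDS.toMillis(totalNanos));
    }

    public static void main(String[] args) throws InterruptedException {
        OpTimer parse = new OpTimer("parse"); // hypothetical phase: parsing one input line
        OpTimer put = new OpTimer("put");     // hypothetical phase: writing one row

        for (int i = 0; i < 5; i++) {
            long t0 = System.nanoTime();
            String[] fields = "row1,cf:q,value".split(","); // stand-in for real parsing
            long t1 = System.nanoTime();
            Thread.sleep(2);                                // stand-in for the real write
            long t2 = System.nanoTime();

            parse.record(t1 - t0);
            put.record(t2 - t1);
        }

        // In a map task these lines would typically be logged once, at the end of the task.
        System.out.println(parse);
        System.out.println(put);
    }
}
```

Comparing the averages from the single machine and from the cluster should show whether the extra time is in the write path or somewhere else.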

> From: clehene@adobe.com
> To: user@hbase.apache.org
> Date: Fri, 29 Oct 2010 09:05:28 +0100
> Subject: Re: HBase not scaling well
> Hi Hari, 
> Could you do some realtime monitoring (htop, iptraf, iostat) and report the results?
> Also you could add some timers to the map-reduce operations: measure average operation times
> to figure out what's taking so long.
> Cosmin
> On Oct 29, 2010, at 9:55 AM, Hari Shankar wrote:
> > Hi,
> > 
> >     We are currently doing a POC for HBase in our system. We have
> > written a bulk upload job to upload our data from a text file into
> > HBase. We are using a 3-node cluster, one master which also works as
> > slave (running as namenode, jobtracker, HMaster, datanode,
> > tasktracker, HQuorumpeer and  HRegionServer) and 2 slaves (datanode,
> > tasktracker, HQuorumpeer and  HRegionServer running). The problem is
> > that we are getting lower performance from the distributed cluster than
> > what we were getting from the single-node pseudo-distributed setup. The
> > upload is taking about 30  minutes on an individual machine, whereas
> > it is taking 2 hrs on the cluster. We have replication set to 3, so
> > all parts should ideally be available on all nodes, so we doubt if the
> > problem is network latency. scp of files between nodes gives a speed
> > of about 12 MB/s, which I believe should be good enough for this to
> > function. Please correct me if I am wrong here. The nodes are all 4
> > core machines with 8 GB RAM.  We are spawning 4 simultaneous map tasks
> > on each node, and the job does not have any reduce phase. Any help is
> > greatly appreciated.
> > 
> > Thanks,
> > Hari Shankar
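
As a rough check on the network numbers quoted above, the sketch below converts the measured scp rate to megabits per second and counts how many copies of each block have to cross the network. It assumes HDFS's default placement, where the first replica is written on the local datanode when the writer runs on one. The class and method names are hypothetical, and this is arithmetic only, not a measurement of the cluster.

```java
public class LinkEstimate {

    /** Convert megabytes per second to megabits per second. */
    static double toMegabits(double megabytesPerSecond) {
        return megabytesPerSecond * 8.0;
    }

    /**
     * Copies of each block that must cross the network, assuming the first
     * replica is written locally. Replicas cannot exceed the number of nodes.
     */
    static int remoteCopies(int replication, int nodes) {
        int placed = Math.min(replication, nodes);
        return Math.max(placed - 1, 0);
    }

    public static void main(String[] args) {
        double scpRate = 12.0; // MB/s, the scp figure from the message above

        System.out.println("scp rate in Mbit/s: " + toMegabits(scpRate));
        System.out.println("remote copies, 1 node: " + remoteCopies(3, 1));
        System.out.println("remote copies, 3 nodes: " + remoteCopies(3, 3));
    }
}
```

12 MB/s is 96 Mbit/s, which is close to the ceiling of a 100 Mbit/s link. Under the placement assumption above, replication 3 on three nodes sends every written block over the network twice, while the single-node setup sends nothing over the network at all, so link speed may be worth ruling out alongside the swapping and ZK questions.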