hbase-user mailing list archives

From steven zhuang <steven.zhuang.1...@gmail.com>
Subject Re: how can I check the I/O influence HBase to HDFS
Date Wed, 07 Apr 2010 01:15:19 GMT
Hi Jonathan,

On Wed, Apr 7, 2010 at 6:15 AM, Jonathan Gray <jgray@facebook.com> wrote:

> Can you explain more about what information you are trying to find out?
>
> You had an existing HDFS and you want to measure the additional impact
> of adding HBase?  Is that in terms of reads/writes/IOPS or data size?
>
I just want to get the additional I/O data size after adding HBase to
Hadoop.


> If you have a steady-state set of metrics for HDFS w/o HBase, can you not
> just monitor those metrics w/ HBase running and calculate the deltas?
>
Those HBase apps are handled by different people, so it's hard to track
the data I/O volume.

> Also, to what end are you trying to figure this out?  I'm very much
> interested in what courses of action you might take given the different
> information you could find out about HBase's influence on your cluster.
>
I want to convince my leader that more RAM for the regionservers will
lower the I/O rate, since there should be less swapping, but I have to get
the comparison results first.

> JG
>
> > -----Original Message-----
> > From: steven zhuang [mailto:steven.zhuang.1984@gmail.com]
> > Sent: Tuesday, April 06, 2010 8:34 AM
> > To: hbase-user@hadoop.apache.org
> > Subject: how can I check the I/O influence HBase to HDFS
> >
> > Hi there,
> >
> > I need to check the impact HBase has on HDFS.
> >
> > I have a Hadoop cluster with 30+ datanodes, and an HBase cluster built
> > on it, with 18 regionservers residing on 18 datanodes.
> >
> > We have observed that HDFS I/O increases a lot when we run import or
> > query operations on HBase tables, but we don't know how much HBase
> > impacts HDFS, so now I have to dig into this.
> > My idea is as follows:
> >
> >   1. grep the regionserver logs for the file information of the HBase
> >      tables, mainly the store files' names and sizes, and sum up the
> >      sizes.
> >   2. grep the datanodes' logs for the HDFS_READ/HDFS_WRITE entries and
> >      calculate the total I/O bytes.
> >   3. Compute the ratio of HBase I/O to HDFS I/O.
> >
> > My concern is whether the above idea is right. Is there anything
> > missing, or a better way to do this?
> >
> > To make it more convincing, I want the block info for each HTable: not
> > just the blocks under each table's directory, but also those of store
> > files that were later removed by major compaction. Since all I can see
> > in the datanode log is the block ID, any pointer or hint is really
> > appreciated.
>
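
Steps 2 and 3 of the plan quoted above could be sketched roughly as follows. This is only an illustration under assumptions: it assumes the datanode clienttrace lines contain a fragment of the form "bytes: <n>, op: HDFS_READ" or "bytes: <n>, op: HDFS_WRITE", which should be checked against the actual logs of the Hadoop version in use. The function names and the hbase_bytes input (the sum from step 1) are hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: a clienttrace line carries "bytes: <n>, op: HDFS_READ|HDFS_WRITE".
# Verify this pattern against your own datanode logs before trusting the totals.
TRACE_RE = re.compile(r"bytes: (\d+), op: (HDFS_READ|HDFS_WRITE)")


def sum_hdfs_io(lines):
    """Step 2: total bytes per operation across datanode log lines."""
    totals = Counter()
    for line in lines:
        match = TRACE_RE.search(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals


def hbase_share(hbase_bytes, totals):
    """Step 3: ratio of HBase I/O (from step 1) to all HDFS I/O seen."""
    hdfs_bytes = sum(totals.values())
    return hbase_bytes / hdfs_bytes if hdfs_bytes else 0.0
```

sum_hdfs_io accepts any iterable of lines, so an open log file can be passed directly. Note that this counts all HDFS clients together; it does not by itself attribute blocks to HBase tables.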
