hbase-user mailing list archives

From Louis Hust <louis.h...@gmail.com>
Subject Re: How to monitor and control heavy network traffic for region server
Date Thu, 11 Jun 2015 04:25:19 GMT
Hi Vladimir,

Thanks for the reply.

The problem we are facing is this:
some SCAN operations (with a start key, an end key, and a filter) run for
more than an hour and generate heavy network traffic, because *some of the
data is not stored on the local data node and the regions are very large*,
about 100G-500G.
A lot of data is therefore transferred from remote datanodes to the one
region server that handles the SCAN operation.
*But the result returned to the client is small.*

So the problem is that the scan reads many rows but, because of the filter,
returns only a few.

What we want is to monitor SCAN operations that have been running for a
long time, so that we learn about the health of the cluster as soon as
possible.

For now, I have found one way to monitor the process list of a region
server. Visit:

http://regionserverhost:60030/rs-status?format=json&filter=all

This displays the status of all threads, so we can take advantage of this
JSON output.
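
A minimal sketch of how that JSON could be polled for long-running tasks.
It assumes the endpoint returns a JSON list of task objects and that each
object carries a start time in milliseconds under "starttimems" and a text
under "description". Those field names are assumptions, not confirmed in
this thread, so check them against the actual output of your region server.
The helper names fetch_tasks and long_running_tasks are hypothetical.

```python
import json
import time
from urllib.request import urlopen

# Assumed field names; verify against the real rs-status JSON output.
START_KEY = "starttimems"
DESC_KEY = "description"


def fetch_tasks(host, port=60030, timeout=10):
    """Download and parse the task list from one region server."""
    url = "http://%s:%d/rs-status?format=json&filter=all" % (host, port)
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def long_running_tasks(tasks, threshold_seconds, now_ms=None):
    """Return (age_seconds, task) pairs at or above the threshold, oldest first."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    slow = []
    for task in tasks:
        start = task.get(START_KEY)
        # Skip entries without a usable start time.
        if not isinstance(start, (int, float)) or start <= 0:
            continue
        age = (now_ms - start) / 1000.0
        if age >= threshold_seconds:
            slow.append((age, task))
    slow.sort(key=lambda pair: pair[0], reverse=True)
    return slow
```

Calling long_running_tasks(fetch_tasks("regionserverhost"), 600) from a cron
job would list every task older than ten minutes. Whether a task is a SCAN
would still have to be read from its description.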


2015-06-11 11:17 GMT+08:00 Vladimir Rodionov <vladrodionov@gmail.com>:

> Louis,
>
> What do you mean by "monitor the long scan"? If you need to throttle
> network IO during scan, you have to
> do it on a client side. Take a look at
> org.apache.hadoop.hbase.io.hadoopbackport.ThrottledInputStream
> as an example, something similar you will need to implement on top of
> ResultScanner - ThrottledResultScanner.
>
> Good idea for improvement, actually.
>
> -Vlad
>
> On Wed, Jun 10, 2015 at 7:30 PM, Louis Hust <louis.hust@gmail.com> wrote:
>
> > Hi Dave,
> >
> > For now we will not upgrade, so is there some way we can monitor long
> > scans in 0.96?
> >
> > 2015-06-11 2:00 GMT+08:00 Dave Latham <latham@davelink.net>:
> >
> > > I'm not aware of anything in version 0.96 that will limit the scan for
> > > you - you may have to do it in your client yourself.  If you're
> > > willing to upgrade, do check out the throttling available in HBase
> > > 1.1:
> > >
> > > https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature
> > >
> > >
> > >
> > > On Wed, Jun 10, 2015 at 4:15 AM, 娄帅 <louis.hust.ml@gmail.com> wrote:
> > > > any ideas?
> > > >
> > > > 2015-06-10 16:01 GMT+08:00 娄帅 <louis.hust.ml@gmail.com>:
> > > >
> > > >> Hi all,
> > > >>
> > > > >> We are using HBase 0.96 with Hadoop 2.2.0. Recently we found that
> > > > >> some SCAN operations run for more than an hour, which leads to
> > > > >> heavy network traffic, because some of the data is not stored on
> > > > >> the local data node and the regions are very large, about
> > > > >> 100G-500G.
> > > > >>
> > > > >> With heavy network traffic, the region server cannot serve
> > > > >> requests because the NIC is saturated.
> > > > >>
> > > > >> From a DBA's point of view, I want to know whether HBase has some
> > > > >> way to limit large scans or some flow control for SCAN operations,
> > > > >> and how we can monitor long-running SCAN operations.
> > > >>
> > >
> >
>
