hbase-dev mailing list archives

From Enis Söztutar <e...@apache.org>
Subject Re: Canary Test Tool and write sniffing
Date Mon, 06 Feb 2017 21:38:24 GMT
Open an issue?
Enis

On Mon, Feb 6, 2017 at 9:39 AM, Stack <stack@duboce.net> wrote:

> On Sun, Feb 5, 2017 at 2:25 AM, Lars George <lars.george@gmail.com> wrote:
>
> > The next example is wrong too, claiming to show 60 secs, while it
> > shows 600 secs (the default value as well).
> >
> > The question is still, what is a good value for intervals? Anyone here
> > that uses the Canary that would like to chime in?
> >
> >
> I was hanging out with a user where on a mid-sized cluster with Canary
> running with defaults, the regionserver carrying meta was at 100% CPU because
> of all the requests from Canary doing repeated full-table Scans.
>
> 6 seconds is too short. Seems like a typo that should be 60 seconds. It is
> not as though the Canary is going to do anything about it if it finds
> something wrong.
>
> S
>
>
>
>
> > On Sat, Feb 4, 2017 at 5:40 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > Brief search on HBASE-4393 didn't reveal why the interval was shortened.
> > >
> > > If you read the first paragraph of:
> > > http://hbase.apache.org/book.html#_run_canary_test_as_daemon_mode
> > >
> > > possibly the reasoning was that canary would exit upon seeing some error
> > > (the first time).
> > >
> > > BTW There was a mismatch in the description for this command: (5 seconds
> > > vs. 50000 milliseconds)
> > >
> > > ${HBASE_HOME}/bin/hbase canary -daemon -interval 50000 -f false
> > >
> > >
> > > On Sat, Feb 4, 2017 at 8:21 AM, Lars George <lars.george@gmail.com> wrote:
> > >
> > >> Oh right, Ted. An earlier patch attached to the JIRA had 60 secs, the
> > >> last one has 6 secs. Am I reading this right? It hands 6000 into the
> > >> Thread.sleep() call, which takes millisecs. So that makes 6 secs
> > >> between checks, which seems super short, no? I might just be dull here.
> > >>
> > >> On Sat, Feb 4, 2017 at 5:00 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >> > For the default interval, if you were looking at:
> > >> >
> > >> >   private static final long DEFAULT_INTERVAL = 6000;
> > >> >
> > >> > The above was from:
> > >> >
> > >> >     HBASE-4393 Implement a canary monitoring program
> > >> >
> > >> > which was integrated on Tue Apr 24 07:20:16 2012
> > >> >
> > >> > FYI
> > >> >
> > >> > On Sat, Feb 4, 2017 at 4:06 AM, Lars George <lars.george@gmail.com> wrote:
> > >> >
> > >> >> Also, the default interval used to be 60 secs, but is now 6 secs. Does
> > >> >> that make sense? Seems awfully short for a default, assuming you have
> > >> >> many regions or servers.
> > >> >>
> > >> >> On Sat, Feb 4, 2017 at 11:54 AM, Lars George <lars.george@gmail.com> wrote:
> > >> >> > Hi,
> > >> >> >
> > >> >> > Looking at the Canary tool, it tries to ensure that all canary test
> > >> >> > table regions are spread across all region servers. If that is not the
> > >> >> > case, it calls:
> > >> >> >
> > >> >> > if (numberOfCoveredServers < numberOfServers) {
> > >> >> >   admin.balancer();
> > >> >> > }
> > >> >> >
> > >> >> > I doubt this will help with the StochasticLoadBalancer, which is known
> > >> >> > to consider per-table balancing as one of many factors. In practice,
> > >> >> > the SLB will most likely _not_ distribute the canary regions
> > >> >> > sufficiently, leaving gaps in the check. Switching on the per-table
> > >> >> > option is discouraged, so that the SLB can do its thing.
> > >> >> >
> > >> >> > Just pointing it out for vetting.
> > >> >> >
> > >> >> > Lars
> > >> >>
> > >>
> >
>
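
A note on the units discussed in the thread: the interval values quoted above (6000, 50000) are in milliseconds, since they end up in Thread.sleep(). The following is only a minimal sketch of that arithmetic; the class and method names are hypothetical and are not part of the HBase Canary code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of HBase.
public class CanaryIntervalUnits {

    // Converts an interval given in milliseconds (the unit that
    // Thread.sleep() expects) into whole seconds.
    public static long millisToSeconds(long intervalMillis) {
        return TimeUnit.MILLISECONDS.toSeconds(intervalMillis);
    }

    public static void main(String[] args) {
        // 6000 is the DEFAULT_INTERVAL quoted in the thread,
        // 50000 is the value from the documented daemon command,
        // 60000 is the 60-second interval the thread suggests was intended.
        long[] intervals = {6000L, 50000L, 60000L};
        for (long ms : intervals) {
            System.out.println(ms + " ms = " + millisToSeconds(ms) + " s");
        }
    }
}
```

Under this reading, 6000 corresponds to 6 seconds and 50000 to 50 seconds, which is consistent with the mismatches pointed out in the thread.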
