hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Some problems in one accident on my production cluster
Date Thu, 25 Feb 2016 02:05:24 GMT
bq. two regions were in transition

Can you pastebin the related server logs for these two regions so that we
have more clues?

For #2, please see http://hbase.apache.org/book.html#big.cluster.config
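
As a rough sketch of what that section describes: when a cluster has many
regions, the first RegionServer to check in after the Master starts can be
assigned every region. Raising hbase.master.wait.on.regionservers.mintostart
from its default of 1 makes the Master wait for more RegionServers before it
starts assigning. The value 3 below is only an illustrative assumption; pick a
number that suits the size of your cluster.

```xml
<!-- hbase-site.xml on the Master; the value 3 is an assumed example, not a recommendation -->
<property>
  <name>hbase.master.wait.on.regionservers.mintostart</name>
  <value>3</value>
</property>
```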

For #3, please see
http://hbase.apache.org/book.html#_running_multiple_workloads_on_a_single_cluster

On Wed, Feb 24, 2016 at 3:31 PM, Heng Chen <heng.chen.1986@gmail.com> wrote:

> I ran an MR job on my production cluster (0.98.6) that needs to scan one
> table during the map phase.
>
> Because of the heavy load from the job, all my RSs crashed due to OOM.
>
> After I restarted all the RSs, I found a problem.
>
> All regions were reopened on one RS, and the balancer could not run because
> two regions were in transition. The cluster was stuck for a long time
> until I restarted the master.
>
> 1. Why did this happen?
>
> 2. If a cluster has a lot of regions and all RSs crash, how should the
> cluster be restarted? If the RSs are restarted one by one, OOM may happen
> because one RS has to hold all regions, and it will take a long time.
>
> 3. Is it possible to give each table a request quota, so that when one
> table is requested heavily, it has no impact on the other tables in the
> cluster?
>
>
> Thanks
>
