hbase-user mailing list archives

From 冯宏华 <fenghong...@xiaomi.com>
Subject Re: Dropping a very large table
Date Tue, 10 Sep 2013 04:22:07 GMT
There seems to be no very simple way to do this. I am not sure whether closing/unassigning
the regions gradually via a script before dropping would help a little; a rough sketch of
that approach follows.
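Something like this minimal sketch, assuming the 0.92+ client API (HBaseAdmin#getTableRegions
and HBaseAdmin#unassign; on stock 0.90 you would use closeRegion or the shell's close_region
instead). The table name, batch size, and sleep interval are placeholders to tune:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class GradualUnassign {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // "legacy_table" is a placeholder for the table to be dropped.
      List<HRegionInfo> regions =
          admin.getTableRegions(Bytes.toBytes("legacy_table"));
      int closed = 0;
      for (HRegionInfo region : regions) {
        admin.unassign(region.getRegionName(), false); // graceful close
        if (++closed % 100 == 0) {
          Thread.sleep(5000); // throttle so ZK/master can keep up
        }
      }
    } finally {
      admin.close();
    }
  }
}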

The pain derives from the current master assignment design, which relies on ZK to track
assign/split progress/status. When creating/dropping/restarting tables with a very large
number of regions, ZK can be overwhelmed by very heavy znode creation/update/deletion
operations arriving at almost the same time.

I wonder whether this is a kind of abuse of ZK, in that by design ZK is expected to store
a small amount of meta/config data with sparse access, not such a huge number of nodes
(if the region count reaches 20K-100K) under intensive access.

Why not store the assignment progress/status info in another system table, like the META
table, rather than in ZK?
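Purely to illustrate the idea (nothing below is an existing HBase API: the .REGION_STATE.
table, its columns, and its state values are all made up), each assignment transition would
then be an ordinary row update, which the cluster already knows how to scale:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionStateTableSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Hypothetical system table, keyed by encoded region name.
    HTable stateTable = new HTable(conf, ".REGION_STATE.");
    try {
      Put p = new Put(Bytes.toBytes("1588230740")); // encoded region name
      // Hypothetical columns: current state plus the server involved.
      p.add(Bytes.toBytes("info"), Bytes.toBytes("state"),
          Bytes.toBytes("CLOSING"));
      p.add(Bytes.toBytes("info"), Bytes.toBytes("server"),
          Bytes.toBytes("rs1.example.com,60020,1378000000000"));
      stateTable.put(p); // one row write per transition instead of a znode op
    } finally {
      stateTable.close();
    }
  }
}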
________________________________________
From: Michael Webster [michael.webster@bronto.com]
Sent: September 10, 2013 7:36
To: user@hbase.apache.org
Subject: Dropping a very large table

Hello,

I have a very large HBase table running on 0.90, large meaning >20K regions
with a max region size of 1GB. The table is legacy and can be dropped, but
we aren't sure what impact disabling/dropping a table that large will
have on our cluster.

We are using dropAsync and polling HTable#isEnabled instead of the standard
shell disable command, to avoid a timeout during the disable as in
https://issues.apache.org/jira/browse/HBASE-3432.
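Roughly, the pattern is the sketch below. Caveats: disableTableAsync is the name the async
disable took in later client APIs (whatever async call we invoke on our build stands in
for it), and the table name and poll interval are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class AsyncDisableAndDrop {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    String table = "legacy_table"; // placeholder table name
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // Kick off the disable without blocking in the client.
      admin.disableTableAsync(Bytes.toBytes(table));
      // Poll instead of waiting inside disableTable, so the client-side
      // timeout described in HBASE-3432 never fires.
      while (HTable.isTableEnabled(conf, table)) {
        Thread.sleep(10000);
      }
      admin.deleteTable(table);
    } finally {
      admin.close();
    }
  }
}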
Is there any risk of overwhelming ZooKeeper or the master with region-closed
events during the disable, or would it be comparable to what happens during a
cluster restart when the region servers close out their regions? Additionally,
are there any concerns with wiping out that much data in HDFS at once during
the drop?

Thank you in advance,
Michael