hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1964) Add internal status monitoring to RegionServer
Date Tue, 10 Nov 2009 14:29:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775432#action_12775432 ]

Andrew Purtell commented on HBASE-1964:
---------------------------------------

bq. When a hadoop/hbase cluster is under heavy load it will inevitably reach a tipping point
where data is lost or corrupted

We take exception to this statement. One can corrupt an Oracle database by overcommitting
RAM such that the kernel panics in get_free_page (on Linux). 

bq. A graceful method is needed to put the cluster into safe mode until more resources can
be added or the load on the cluster has been reduced. 

There is no substitute for competent monitoring and administration of production systems,
especially ones which try to support terascale or petascale storage and computation over tens
or hundreds of servers. That said, HBase does have opportunities to sense overloading and
take self-preserving actions where it currently does not.

> Add internal status monitoring to RegionServer
> ----------------------------------------------
>
>                 Key: HBASE-1964
>                 URL: https://issues.apache.org/jira/browse/HBASE-1964
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.20.1
>            Reporter: elsif 
>
> When a hadoop/hbase cluster is under heavy load it will inevitably reach a tipping point
> where data is lost or corrupted.  A graceful method is needed to put the cluster into safe
> mode until more resources can be added or the load on the cluster has been reduced.
> St.Ack has suggested the following short-term task: "Meantime, it should be possible to
> have a cron run a script that checks cluster resources from time-to-time -- e.g. how full
> hdfs is, how much each regionserver is carrying -- and when it determines the needle is in
> the red, flip the cluster to be read-only."
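
The decision step of the cron check suggested above might look roughly like the sketch below. It is not part of the issue or of HBase. The threshold values, the names (Thresholds, needle_in_red, check_once) and the three callables that read HDFS usage, read per-regionserver region counts, and flip the cluster to read-only are all hypothetical. How those callables talk to the cluster is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits; both values are assumptions to be tuned per cluster.
    max_hdfs_used_fraction: float = 0.90
    max_regions_per_server: int = 200


def needle_in_red(hdfs_used_fraction: float,
                  regions_per_server: Mapping[str, int],
                  limits: Thresholds = Thresholds()) -> List[str]:
    """Return the reasons the cluster looks overloaded (empty list if healthy)."""
    reasons = []
    if hdfs_used_fraction >= limits.max_hdfs_used_fraction:
        reasons.append(f"hdfs {hdfs_used_fraction:.0%} full")
    for server, count in sorted(regions_per_server.items()):
        if count > limits.max_regions_per_server:
            reasons.append(f"{server} carries {count} regions")
    return reasons


def check_once(read_hdfs_used: Callable[[], float],
               read_region_counts: Callable[[], Mapping[str, int]],
               set_read_only: Callable[[List[str]], None],
               limits: Thresholds = Thresholds()) -> bool:
    """Run one check; call set_read_only(reasons) only when a limit is exceeded.

    All three callables are placeholders supplied by the caller; this sketch
    does not assume any particular Hadoop or HBase command or API.
    """
    reasons = needle_in_red(read_hdfs_used(), read_region_counts(), limits)
    if reasons:
        set_read_only(reasons)
    return bool(reasons)
```

A cron entry would then invoke a small wrapper that supplies real implementations of the three callables and calls check_once once per run. Keeping the decision logic free of cluster calls makes it testable without a cluster.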

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

