hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8338) Latency Resilience; umbrella list of issues that will help us ride over bad disk, bad region, ec2, etc.
Date Tue, 30 Apr 2013 18:10:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645796#comment-13645796 ]

stack commented on HBASE-8338:
------------------------------

HBASE-6295 helps a bunch, but we still need more in this direction: if a region is totally
dead, we'll ultimately block all in/out once the data we accumulate for that region makes us
hit the global limit.  We need a way to exclude bad regions completely so that one never
blocks access to all the other regions.
                
> Latency Resilience; umbrella list of issues that will help us ride over bad disk, bad region, ec2, etc.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8338
>                 URL: https://issues.apache.org/jira/browse/HBASE-8338
>             Project: HBase
>          Issue Type: Umbrella
>          Components: LatencyResilience
>            Reporter: stack
>            Priority: Critical
>
> Chatting w/ Elliott, we started listing out items to fix that would help keep hbase latency
> approximately constant as disks went bad, were saturated by a neighbour (ec2), etc.
> I just made a new LatencyResilience issue category to tag issues that contribute to this
> project.
> I have to go at the moment, but when I get back I'll start to link in existing issues that
> help this project along and I'll file new ones.
> Here is what we chatted about:
> + The multiple WALs effort will help keep write latency roughly constant.
> + Figuring out how to get a new read started over dfsclient if the current replica read is
>   taking too long would help keep reads about constant (maybe could exploit the nkeywal
>   hackery messing w/ replica order).
> + There is an issue where clients can currently pile up on a single region because of
>   the way we do client queues by regionserver.  This needs fixing.
> The above are a few ideas worth further exploration at least.
> The idea is to try to bring down our 95th percentiles and to make us more robust in the
> face of dying disks, etc.  I see this issue rising to the fore now that there has been good
> progress on the MTTR project.
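
The second item in the list above (starting a new read when the current replica read is
taking too long) is often called a hedged read. A minimal sketch follows, assuming a
caller-supplied read_fn(replica) and a hedge_after_s threshold; both names are hypothetical
and this is not the dfsclient API.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def hedged_read(replicas, read_fn, hedge_after_s, pool):
    """Return the first successful result of read_fn over the given replicas.

    Starts with replicas[0]; each time hedge_after_s seconds pass with no
    successful answer, the next replica is tried in parallel.
    """
    pending = set()
    for replica in replicas:
        pending.add(pool.submit(read_fn, replica))
        # Give the reads in flight a bounded amount of time before hedging.
        done, pending = wait(pending, timeout=hedge_after_s,
                             return_when=FIRST_COMPLETED)
        for fut in done:
            if fut.exception() is None:
                return fut.result()
        # Nothing succeeded yet: fall through and start the next replica.

    # Every replica has been started; take whichever succeeds first.
    while pending:
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        for fut in done:
            if fut.exception() is None:
                return fut.result()
    raise IOError("all replica reads failed")
```

The caller passes a shared ThreadPoolExecutor as pool. The slow read is not cancelled
here; it simply finishes in the background, so a real implementation would also need to
bound how many hedged reads are in flight at once.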

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
