hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-6752) On region server failure, serve writes and timeranged reads during the log split
Date Wed, 16 Nov 2016 22:18:59 GMT

     [ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

stack updated HBASE-6752:
    Affects Version/s:     (was: 0.95.2)

> On region server failure, serve writes and timeranged reads during the log split
> --------------------------------------------------------------------------------
>                 Key: HBASE-6752
>                 URL: https://issues.apache.org/jira/browse/HBASE-6752
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 2.0.0
>            Reporter: Nicolas Liochon
> Opening for write on failure would mean:
> - Assign the region to a new regionserver. It marks the region as recovering
>   -- specific exception returned to the client when we cannot server.
>   -- allow them to know where they stand. The exception can include some time information
(failure stated on: ...)
>   -- allow them to go immediately on the right regionserver, instead of retrying or calling
the region holding meta to get the new address
>      => save network calls, lower the load on meta.
> - Do the split as today. Priority is given to region server holding the new regions
>   -- help to share the load balancing code: the split is done by region server considered
as available for new regions
>   -- help locality (the recovered edits are available on the region server) => lower
the network usage
> - When the split is finished, we're done as of today
> - while the split is progressing, the region server can
>  -- serve writes
>    --- that's useful for all application that need to write but not read immediately:
>    --- whatever logs events to analyze them later
>    --- opentsdb is a perfect example.   
>  -- serve reads if they have a compatible time range. For heavily used tables, it could
be an help, because:
>    --- we can expect to have a few minutes of data only (as it's loaded)
>    --- the heaviest queries, often accepts a few -or more- minutes delay. 
> Some "What if":
> 1) the split fails
> => Retry until it works. As today. Just that we serves writes. We need to know (as
today) that the region has not recovered if we fail again.
> 2) the regionserver fails during the split
> => As 1 and as of today/
> 3) the regionserver fails after the split but before the state change to fully available.
> => New assign. More logs to split (the ones already dones and the new ones).
> 4) the assignment fails
> => Retry until it works. As today.

This message was sent by Atlassian JIRA

View raw message