hbase-issues mailing list archives

From "Virag Kothari (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11659) Region state RPC call is not idempotent
Date Mon, 04 Aug 2014 18:34:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085050#comment-14085050 ]

Virag Kothari commented on HBASE-11659:

I don't see any exception on the master when the first OPEN call timed out. There was only a
socket timeout on the region server.
I saw a similar issue during CLOSE, where the master had already moved the region state to OFFLINE
but the call had timed out on the region server. On the next retry, the master complained that
the region was not pending close, but that didn't lead to any inconsistency on the region server,
as the region was already closed.
bq. We need to make sure the region is OPEN on the right server with the right open seq number
to make sure it is a retry.

Do we need to query meta to fetch the seq number?
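
To make the retry check in the quoted suggestion concrete, here is a minimal sketch of a master-side handler that accepts a duplicate OPEN report only when the server and the open seq number both match what was recorded on the first delivery. All class, method, and field names are hypothetical and are not taken from the HBase code base. The sketch also assumes the master keeps the seq number in memory, which is exactly the point the question above leaves open.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of master-side region state tracking.
// None of these names come from the HBase code base.
class IdempotentOpenSketch {
    enum State { PENDING_OPEN, OPEN, FAILED_OPEN }

    // What this sketch assumes the master remembers about one region.
    static final class RegionInfo {
        State state;
        final String server;   // server the region was assigned to
        long openSeqNum = -1L; // seq number from the accepted OPEN report, -1 if none yet

        RegionInfo(State state, String server) {
            this.state = state;
            this.server = server;
        }
    }

    private final Map<String, RegionInfo> regions = new HashMap<>();

    // The master asks a server to open the region.
    void assign(String region, String server) {
        regions.put(region, new RegionInfo(State.PENDING_OPEN, server));
    }

    State stateOf(String region) {
        return regions.get(region).state;
    }

    // Returns true if the report is accepted (first delivery or a matching retry),
    // false if it is rejected.
    boolean reportOpened(String region, String server, long openSeqNum) {
        RegionInfo info = regions.get(region);
        if (info == null) {
            return false;
        }
        if (info.state == State.PENDING_OPEN && info.server.equals(server)) {
            // First delivery: apply the transition and remember the seq number.
            info.state = State.OPEN;
            info.openSeqNum = openSeqNum;
            return true;
        }
        // Already OPEN: treat it as a retry only if the server and seq number match.
        return info.state == State.OPEN
                && info.server.equals(server)
                && info.openSeqNum == openSeqNum;
    }
}
```

With this shape, a second identical report is acknowledged without changing state, while a report from a different server or with a different seq number is still rejected.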

> Region state RPC call is not idempotent
> ---------------------------------------
>                 Key: HBASE-11659
>                 URL: https://issues.apache.org/jira/browse/HBASE-11659
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>         Attachments: HBASE-11659.patch
> Here is the scenario on 0.98 with zk-less assignment:
> The master gets an OPEN RPC call from the region server.
> So, it moves the region state from PENDING_OPEN to OPEN.
> But the call times out on the region server, and the region server retries sending the OPEN
> call. However, the master now throws an exception saying the region is not PENDING_OPEN. So,
> the region server aborts the region on receiving that exception and sends FAILED_OPEN to the
> master. But the master cannot change its state from OPEN to FAILED_OPEN, so eventually the
> master keeps the state as OPEN while the actual region is no longer open on the region server.
> The master should not throw an exception on receiving OPEN RPC calls multiple times.
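
The sequence described above can be replayed with a toy model. This is only a sketch under stated assumptions: `StrictMaster`, `RegionServer`, and `runScenario` are hypothetical names, the RPC timeout is simulated by simply sending the OPEN report twice, and the real assignment code is far more involved.

```java
// Toy replay of the reported scenario; every name here is hypothetical.
class StrictOpenScenario {
    enum State { PENDING_OPEN, OPEN, FAILED_OPEN }

    // A master that accepts a state report only while the region is PENDING_OPEN.
    static final class StrictMaster {
        State state = State.PENDING_OPEN;

        void report(State reported) {
            if (state != State.PENDING_OPEN) {
                throw new IllegalStateException("region is not PENDING_OPEN, it is " + state);
            }
            state = reported;
        }
    }

    static final class RegionServer {
        boolean regionOpen = true; // the region was opened locally before reporting

        void reportOpenWithRetry(StrictMaster master) {
            // First call: applied on the master, but assume the reply is lost to a timeout.
            master.report(State.OPEN);
            try {
                // Retry of the same report after the simulated timeout.
                master.report(State.OPEN);
            } catch (IllegalStateException e) {
                // The server treats the rejection as a failed open and aborts the region.
                regionOpen = false;
                try {
                    master.report(State.FAILED_OPEN);
                } catch (IllegalStateException ignored) {
                    // The master rejects this report too and stays in OPEN.
                }
            }
        }
    }

    // Returns true when the master believes the region is OPEN although the server closed it.
    static boolean runScenario() {
        StrictMaster master = new StrictMaster();
        RegionServer server = new RegionServer();
        server.reportOpenWithRetry(master);
        return master.state == State.OPEN && !server.regionOpen;
    }
}
```

In this model `runScenario()` ends with the two sides disagreeing, which is the inconsistency the report describes.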
