hbase-issues mailing list archives

From "Ashu Pachauri (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-16752) Upgrading from 1.2 to 1.3 can lead to replication failures due to difference in RPC size limit
Date Wed, 12 Oct 2016 08:57:20 GMT

     [ https://issues.apache.org/jira/browse/HBASE-16752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashu Pachauri updated HBASE-16752:
----------------------------------
    Attachment: HBASE-16752.V1.patch

V1:
After discussing offline with [~ghelmling], it seems that both approaches I outlined break
abstraction and are not desirable. After looking a little deeper into the code, we see that
CodedInputStream also supports reading from an InputStream. So, this patch constructs a wrapper
InputStream over the underlying non-blocking channel, used only to read the RequestHeader.
This gives us access to the call ID, which means that the server can successfully return the
exception to the client.
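
For illustration only, here is a minimal sketch of the wrapper idea, not the code in the attached patch. It assumes the bytes already read from the non-blocking channel sit in a ByteBuffer, and that the header is length-delimited with a protobuf-style varint prefix. The names ByteBufferInputStream, HeaderPeek, readVarint32 and readDelimitedHeader are hypothetical. In the real server the wrapped stream would presumably be handed to CodedInputStream rather than decoded by hand.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical adapter: exposes the unread bytes of a ByteBuffer as an InputStream. */
final class ByteBufferInputStream extends InputStream {
    private final ByteBuffer buf;

    ByteBufferInputStream(ByteBuffer buf) {
        // duplicate() shares the content but has its own position,
        // so reading the header here does not disturb the caller's buffer.
        this.buf = buf.duplicate();
    }

    @Override
    public int read() {
        return buf.hasRemaining() ? (buf.get() & 0xFF) : -1;
    }

    @Override
    public int read(byte[] dst, int off, int len) {
        if (len == 0) return 0;
        if (!buf.hasRemaining()) return -1;
        int n = Math.min(len, buf.remaining());
        buf.get(dst, off, n);
        return n;
    }
}

/** Hypothetical helper that reads only the length-delimited header bytes. */
final class HeaderPeek {
    /** Decodes a base-128 varint (the encoding protobuf uses for length prefixes). */
    static int readVarint32(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b < 0) throw new IOException("truncated varint");
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new IOException("malformed varint");
    }

    /** Returns the header bytes that follow the varint length prefix. */
    static byte[] readDelimitedHeader(ByteBuffer received) throws IOException {
        InputStream in = new ByteBufferInputStream(received);
        int len = readVarint32(in);
        if (len < 0) throw new IOException("negative header length");
        byte[] header = new byte[len];
        int off = 0;
        while (off < len) {
            int n = in.read(header, off, len - off);
            if (n < 0) throw new IOException("truncated header");
            off += n;
        }
        return header;
    }
}
```

The point of the sketch is that only the header is consumed: once its bytes are available, the call ID can be parsed from them even if the rest of the request is rejected for exceeding the size limit.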

This also modifies the test case, which was incorrectly testing the size limit (100 bytes is
too low, as the connection header exceeds 100 bytes).

> Upgrading from 1.2 to 1.3 can lead to replication failures due to difference in RPC size limit
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16752
>                 URL: https://issues.apache.org/jira/browse/HBASE-16752
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication, rpc
>    Affects Versions: 2.0.0, 1.3.0
>            Reporter: Ashu Pachauri
>            Assignee: Ashu Pachauri
>         Attachments: HBASE-16752.V1.patch
>
>
> In HBase 1.2, we don't limit the size of a single RPC, but in 1.3 we limit it by default to 256 MB. This means that during upgrade scenarios (or when the source is on 1.2 and the peer is already on 1.3), it's possible to encounter a situation where we try to send an RPC larger than 256 MB, because we never unroll a WALEdit while sending replication traffic.
> RpcServer throws the underlying exception locally, but closes the connection without returning the underlying error to the client, so the client only sees a "Broken pipe" error.
> I am not sure what the proper fix is here (or if one is needed) to make sure this does not happen, but we should return the underlying exception to the RpcClient, because without it the problem can be difficult to diagnose, especially for someone new to HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
