hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18027) HBaseInterClusterReplicationEndpoint should respect RPC size limits when batching edits
Date Sun, 14 May 2017 01:50:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009567#comment-16009567 ]

Andrew Purtell commented on HBASE-18027:

[~lhofhansl] The problems we are facing in production are of the nature "Replication failure
across many superpod hosts with error "Rpc data length XXXXXXXXX exceeded limit YYYYYYYYYY,
set hbase.ipc.max.request.size on server to override this limit(not recommended)"". It's not
clear how we are getting such huge RPCs in the first place. Obviously some WALEdits are very
large; Phoenix is in the mix. I noticed this code in Replicator does not split the worklist
up if the sum is larger than the RPC limit. This is with 0.98 (and patched, too), which is
now dead code. Perhaps we table this change and wait to see whether the same problems occur
with 1.3?
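For reference, the error text above points at the server-side cap, hbase.ipc.max.request.size. It can be raised in hbase-site.xml on the sink cluster, though the message itself flags that as not recommended; the value below is purely illustrative, not a recommendation:

```xml
<!-- hbase-site.xml on the sink cluster's RegionServers.
     268435456 (256 MB) is an illustrative assumption only. -->
<property>
  <name>hbase.ipc.max.request.size</name>
  <value>268435456</value>
</property>
```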

> HBaseInterClusterReplicationEndpoint should respect RPC size limits when batching edits
> ---------------------------------------------------------------------------------------
>                 Key: HBASE-18027
>                 URL: https://issues.apache.org/jira/browse/HBASE-18027
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 2.0.0, 1.4.0, 1.3.1
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>             Fix For: 2.0.0, 1.4.0, 1.3.2
>         Attachments: HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch
> In HBaseInterClusterReplicationEndpoint#replicate we try to replicate in batches. We
> create N lists, where N is the minimum of the configured replicator threads, the number of
> 100-WALEdit batches, and the number of current sinks. Every pending entry in the replication
> context is then placed, ordered by hash of encoded region name, into one of these N lists.
> Each of the N lists is then sent all at once in one replication RPC. We do not check whether
> the sum of the data in each list exceeds the RPC size limit. This code presumes each
> individual edit is reasonably small. Not checking the aggregate size while assembling the
> lists into RPCs is an oversight and can lead to replication failure when that assumption is
> violated.
> We can fix this by generating as many replication RPCs as needed to drain a list,
> keeping each RPC under the limit, instead of assuming the whole list will fit in one.
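The proposed fix amounts to a size-capped batching loop. A minimal sketch follows; the `Edit` class and `batchUnderLimit` method are hypothetical names for illustration, not the code in the attached patches, and a real implementation would use the serialized WAL entry sizes:

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicationBatcher {
    // Stand-in for a WAL entry; only its serialized size matters here.
    static final class Edit {
        final long sizeBytes;
        Edit(long sizeBytes) { this.sizeBytes = sizeBytes; }
    }

    // Drain the worklist into as many batches as needed, keeping each
    // batch's total size at or under maxRpcSize. A single edit larger
    // than the limit still gets its own batch, since it cannot be
    // split further at this level.
    static List<List<Edit>> batchUnderLimit(List<Edit> edits, long maxRpcSize) {
        List<List<Edit>> batches = new ArrayList<>();
        List<Edit> current = new ArrayList<>();
        long currentSize = 0;
        for (Edit e : edits) {
            if (!current.isEmpty() && currentSize + e.sizeBytes > maxRpcSize) {
                batches.add(current);          // flush the full batch
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(e);
            currentSize += e.sizeBytes;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each resulting batch would then be sent as its own replication RPC, instead of one RPC per list.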

This message was sent by Atlassian JIRA
