Date: Thu, 8 Jun 2017 23:49:19 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-18027) Replication should respect RPC size limits when batching edits

    [ https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043654#comment-16043654 ]

Hudson commented on HBASE-18027:
--------------------------------

SUCCESS: Integrated in Jenkins build HBase-1.3-IT #60 (See [https://builds.apache.org/job/HBase-1.3-IT/60/])
HBASE-18027 HBaseInterClusterReplicationEndpoint should respect RPC (apurtell: rev 4227757335c3fe15ef1d7139140c795b414facf2)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HBaseInterClusterReplicationEndpoint.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicator.java


> Replication should respect RPC size limits when batching edits
> --------------------------------------------------------------
>
>                 Key: HBASE-18027
>                 URL: https://issues.apache.org/jira/browse/HBASE-18027
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 2.0.0, 1.4.0, 1.3.1
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>             Fix For: 2.0.0, 1.4.0, 1.3.2
>
>         Attachments: HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch
>
>
> In HBaseInterClusterReplicationEndpoint#replicate we try to replicate in batches. We create N lists. N is the minimum of configured replicator threads, number of 100-waledit batches, or number of current sinks. Every pending entry in the replication context is then placed in order by hash of encoded region name into one of these N lists. Each of the N lists is then sent all at once in one replication RPC. We do not test if the sum of data in each N list will exceed RPC size limits. This code presumes each individual edit is reasonably small. Not checking for aggregate size while assembling the lists into RPCs is an oversight and can lead to replication failure when that assumption is violated.
> We can fix this by generating as many replication RPC calls as we need to drain a list, keeping each RPC under limit, instead of assuming the whole list will fit in one.
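
The fix described in the issue (drain a list with as many RPCs as needed, keeping each under the limit) can be sketched roughly as below. This is only an illustration and not the committed patch: SizeBoundedBatcher, Entry, splitBySize and maxBatchBytes are hypothetical names, and Entry stands in for a WAL entry whose serialized size is already known.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase code: splits one replication list into
// sub-batches so that each sub-batch stays under a byte limit.
class SizeBoundedBatcher {

    // Hypothetical stand-in for a WAL entry; only its size matters here.
    static final class Entry {
        final String id;
        final long sizeBytes;

        Entry(String id, long sizeBytes) {
            this.id = id;
            this.sizeBytes = sizeBytes;
        }
    }

    // Walks the entries in order and starts a new batch whenever adding the
    // next entry would push the current batch past maxBatchBytes. An entry
    // that alone exceeds the limit still gets a batch of its own, so the
    // caller has to decide how to handle that case.
    static List<List<Entry>> splitBySize(List<Entry> entries, long maxBatchBytes) {
        List<List<Entry>> batches = new ArrayList<>();
        List<Entry> current = new ArrayList<>();
        long currentBytes = 0;
        for (Entry e : entries) {
            if (!current.isEmpty() && currentBytes + e.sizeBytes > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(e);
            currentBytes += e.sizeBytes;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each returned sub-batch would then be sent as its own replication RPC. Entry order is preserved within and across sub-batches, which matches the issue's note that entries are placed into the lists in order.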