From: "Himanshu Vashishtha (JIRA)"
To: issues@hbase.apache.org
Date: Fri, 21 Sep 2012 05:28:07 +1100 (NCT)
Message-ID: <707366756.104088.1348165688045.JavaMail.jiratomcat@arcas>
In-Reply-To: <299739931.103723.1348161967879.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HBASE-6847) HBASE-6649 broke replication

    [ https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459831#comment-13459831 ]

Himanshu Vashishtha commented on HBASE-6847:
--------------------------------------------

I wonder about the IndexOutOfBoundsException. Is this hlog part of a failover regionserver?

> HBASE-6649 broke replication
> ----------------------------
>
>                 Key: HBASE-6847
>                 URL: https://issues.apache.org/jira/browse/HBASE-6847
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.96.0, 0.92.3, 0.94.2
>
>         Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch
>
>
> After running with HBASE-6649 and replication enabled I encountered this:
> {noformat}
> 2012-09-17 20:04:08,111 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132
> 2012-09-17 20:04:08,120 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on IOE: hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318, entryStart=78641557, pos=78771200, end=78771200, edit=84
> 2012-09-17 20:04:08,120 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: currentNbOperations:164529 and seenEntries:84 and size: 154068
> 2012-09-17 20:04:08,120 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Replicating 84
> 2012-09-17 20:04:08,146 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for position 78771200 in hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> 2012-09-17 20:04:08,158 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Removing 0 logs in the list: []
> 2012-09-17 20:04:08,158 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Replicated in total: 93234
> 2012-09-17 20:04:08,158 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200
> 2012-09-17 20:04:08,163 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationSource, currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> java.lang.IndexOutOfBoundsException
>         at java.io.DataInputStream.readFully(DataInputStream.java:175)
>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307)
> {noformat}
> There's something weird at the end of the file and it's killing replication. We used to just retry.