X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 0F096D406 for ; Mon, 10 Sep 2012 23:44:07 +0000 (UTC)
Received: (qmail 84980 invoked by uid 500); 10 Sep 2012 23:44:07 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 84915 invoked by uid 500); 10 Sep 2012 23:44:07 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 84854 invoked by uid 99); 10 Sep 2012 23:44:07 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 10 Sep 2012 23:44:07 +0000
Date: Tue, 11 Sep 2012 10:44:07 +1100 (NCT)
From: "Jean-Daniel Cryans (JIRA)"
To: issues@hbase.apache.org
Message-ID: <2067039869.60909.1347320647623.JavaMail.jiratomcat@arcas>
In-Reply-To: <1575774267.8828.1345772142556.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452581#comment-13452581 ]

Jean-Daniel Cryans commented on HBASE-6649:
-------------------------------------------

bq. This is because of multiple calls to reader.next within readAllEntriesToReplicateOrNextFile. If the second call (within the while loop) throws an exception (like EOFException), it basically destroys the work done up until then. Therefore, some rows would never be replicated.
The position in the log is updated in ZK only once the edits have been replicated. Hence, even if you fail on the second or the hundredth edit, the next region server that takes charge of that log will pick up where the previous RS left off (even if that means re-reading some edits).

> [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-6649
>                 URL: https://issues.apache.org/jira/browse/HBASE-6649
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.96.0, 0.92.3, 0.94.2
>
>         Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 6649-trunk.patch, 6649-trunk.patch, 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover [Jenkins].html
>
>
> Have seen it twice in the recent past: http://bit.ly/MPCykB & http://bit.ly/O79Dq7 ..
> Looking briefly at the logs hints at a pattern - in both the failed test instances, there was an RS crash while the test was running.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
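
For illustration, the checkpoint-after-replication argument in the comment above can be sketched roughly as follows. This is not the HBase replication source: the class CheckpointSketch, the method replicateBatch, the failAt parameter, and the in-memory committedPosition field (standing in for the position kept in ZK) are all hypothetical names invented for this example. It only aims to show why a read failure in the middle of a batch should not lose edits, as long as the position is advanced only after the batch has been shipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HBase code: the position is advanced only after shipping.
class CheckpointSketch {
    // Stand-in for the log position that a real system would persist in ZooKeeper.
    long committedPosition = 0;

    // Stand-in for the edits received by the peer cluster.
    final List<String> sink = new ArrayList<>();

    /**
     * Reads up to batchSize edits starting at the committed position and ships them.
     * A read at index failAt simulates an exception such as EOFException: the
     * in-memory batch is dropped and the committed position is left untouched.
     * Pass failAt = -1 for no failure. Returns true if a batch was shipped.
     */
    boolean replicateBatch(List<String> log, int batchSize, long failAt) {
        List<String> batch = new ArrayList<>();
        long pos = committedPosition;
        while (pos < log.size() && batch.size() < batchSize) {
            if (pos == failAt) {
                return false; // partial batch lost, but nothing was checkpointed
            }
            batch.add(log.get((int) pos));
            pos++;
        }
        if (batch.isEmpty()) {
            return false;
        }
        sink.addAll(batch);      // 1. ship the edits to the peer
        committedPosition = pos; // 2. only then record the new position
        return true;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("e0", "e1", "e2", "e3", "e4");
        CheckpointSketch rs = new CheckpointSketch();
        rs.replicateBatch(log, 2, -1); // first batch goes through
        rs.replicateBatch(log, 2, 3);  // fails on the second read of this batch
        // Whoever takes over resumes from the committed position and re-reads e2.
        while (rs.replicateBatch(log, 2, -1)) {
            // keep shipping until the log is drained
        }
        System.out.println(rs.sink + " at position " + rs.committedPosition);
    }
}
```

In this sketch the failed batch costs only a re-read of the edits after the last committed position. If the failure instead happened between shipping and recording the position, the same scheme would re-ship edits rather than drop them.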