Date: Fri, 15 Feb 2013 18:17:14 +0000 (UTC)
From: "Eric Newton (JIRA)"
Reply-To: jira@apache.org
To: notifications@accumulo.apache.org
Subject: [jira] [Commented] (ACCUMULO-1053) continuous ingest detected data loss

    [ https://issues.apache.org/jira/browse/ACCUMULO-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579380#comment-13579380 ]

Eric Newton commented on ACCUMULO-1053:
---------------------------------------

It looks like lease recovery is an async operation: you request it, and sometime later it finishes. From the javadoc:

"Start the lease recovery of a file"
"@return true if the file is already closed"

HDFS unit tests do this:

{noformat}
while (!fs.recoverLease(path)) {
  Thread.sleep(5000);
}
{noformat}

I've updated my workspace with this approach to wait for the file to be closed.
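The polling idea described above can be sketched in a self-contained form. This is only an illustration, not the change made in the Accumulo workspace: the LeaseRecoverer interface, the waitForClose helper, and the attempt limit are hypothetical names and assumptions added here so the loop runs without a Hadoop dependency. The HDFS test loop quoted above has no upper bound; the bound below is an added assumption so that a file that never closes surfaces as an error instead of a hang.

```java
/**
 * Hypothetical stand-in for the single file-system call the loop needs.
 * The real HDFS call can throw IOException; this sketch keeps it unchecked.
 */
interface LeaseRecoverer {
    /** Starts lease recovery; returns true if the file is already closed. */
    boolean recoverLease(String path);
}

final class LeaseRecoveryWait {
    private LeaseRecoveryWait() {}

    /**
     * Calls recoverLease until it reports the file closed, sleeping between
     * calls. Returns the number of calls made. Throws IllegalStateException
     * if the file is still open after maxAttempts calls (an assumption of
     * this sketch; the quoted HDFS test loop retries forever).
     */
    static int waitForClose(LeaseRecoverer fs, String path,
                            long sleepMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (fs.recoverLease(path)) {
                return attempt; // file is closed, last block is committed
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop waiting.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for " + path, e);
                }
            }
        }
        throw new IllegalStateException("lease recovery did not finish for " + path);
    }
}
```

With a real file system, the lambda passed as LeaseRecoverer would wrap the actual recoverLease call and translate its checked exception; the sleep interval of 5000 ms from the quoted loop would be passed as sleepMillis.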
In my initial tests, this retry loop seems to provide the necessary wait for the last block to be committed to the file.

Interestingly, HBase does not do this; instead it has a hard-coded one-second sleep after the recovery (I'm looking at 0.94.4).

> continuous ingest detected data loss
> ------------------------------------
>
>                 Key: ACCUMULO-1053
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1053
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Critical
>             Fix For: 1.5.0
>
>
> Now that we're logging directly to HDFS, we added datanodes to the agitator. That is, we are now killing datanodes during ingest, and now we are losing data.