Date: Mon, 20 Jan 2014 21:01:19 +0000 (UTC)
From: "Bill Havanki (JIRA)"
Reply-To: jira@apache.org
To: notifications@accumulo.apache.org
Subject: [jira] [Created] (ACCUMULO-2227) Concurrent randomwalk fails when namenode dies after bulk import step

Bill Havanki created ACCUMULO-2227:
--------------------------------------

             Summary: Concurrent randomwalk fails when namenode dies after bulk import step
                 Key: ACCUMULO-2227
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2227
             Project: Accumulo
          Issue Type: Bug
          Components: test
    Affects Versions: 1.4.4
            Reporter: Bill Havanki


Running Concurrent randomwalk under HDFS HA, if the active namenode is killed:

{noformat}
20 12:27:51,119 [retry.RetryInvocationHandler] WARN : Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete. Not retrying because the invoked method is not idempotent, and unable to determine whether it was invoked
java.io.IOException: Failed on local exception: java.io.IOException: Response is null.; Host Details : local host is: "slave.domain.com/10.20.200.113"; destination host is: "namenode.domain.com":8020;
...
        at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1487)
        at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:355)
        at org.apache.accumulo.server.test.randomwalk.concurrent.BulkImport.visit(BulkImport.java:140)
...
Caused by: java.io.IOException: Response is null.
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:952)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:847)
{noformat}

This arises from an HDFS path delete call that cleans up from the bulk import. The test should be resilient here (and when the paths are made earlier in the test) so that the test can continue once failover has completed.
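
One possible shape for that resilience, as a minimal sketch only: wrap the filesystem call in a small retry loop so that an IOException raised while the namenode is failing over is retried instead of aborting the test. The class name FailoverRetry, the method retry, and the attempt and sleep parameters are all hypothetical; none of them come from Accumulo or Hadoop, and this is not the fix that was actually applied for this issue.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Accumulo or Hadoop.
final class FailoverRetry {

    private FailoverRetry() {}

    /**
     * Runs the action, retrying when it throws IOException.
     * Gives up and rethrows the last IOException after maxAttempts tries.
     * Any other exception type propagates immediately without a retry.
     */
    static <T> T retry(Callable<T> action, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e;
                // Sleep between attempts to give failover time to complete.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

In the test, the action would presumably be something like `() -> fs.delete(dir, true)`, with `fs` and `dir` standing in for whatever the BulkImport step already uses. Because the warning above says the delete is not idempotent, a retried call may find the path already removed by the first attempt, so the caller would likely want to treat "path no longer exists" as success for this cleanup step.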