Date: Thu, 13 Mar 2014 16:07:44 +0000 (UTC)
From: "Bill Havanki (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Created] (ACCUMULO-2466) Bulk randomwalk fails with bad key

Bill Havanki created ACCUMULO-2466:
--------------------------------------

             Summary: Bulk randomwalk fails with bad key
                 Key: ACCUMULO-2466
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2466
             Project: Accumulo
          Issue Type: Bug
          Components: master, test
    Affects Versions: 1.4.4
            Reporter: Bill Havanki

Running bulk randomwalk against 1.4.5-SNAPSHOT, got this in verification:

{noformat}
Caused by: java.lang.Exception: Bad key at r00000 cf:000 [] 1394658887772 false 1
        at org.apache.accumulo.server.test.randomwalk.bulk.Verify.visit(Verify.java:65)
{noformat}

Possible reasons:
* ACCUMULO-2110, not backported to 1.4 or 1.5
* master agitation

I see in the logs three internal errors from imports that failed due to the masters being restarted. The failure timing is around 5 seconds after the masters restart. Example:

{noformat}
12 14:10:17,580 [bulk.BulkMinusOne] ERROR: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:290)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:258)
        at org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperationsImpl.java:947)
        at org.apache.accumulo.server.test.randomwalk.bulk.BulkPlusOne.bulkLoadLots(BulkPlusOne.java:99)
        at org.apache.accumulo.server.test.randomwalk.bulk.BulkMinusOne.runLater(BulkMinusOne.java:29)
...
Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForTableOperation
{noformat}

Two BulkMinusOne and one BulkPlusOne failed, which may be why the offending row was at value 1.

The {{TableOperationsImpl.waitForTableOperation}} method does not catch {{TApplicationException}}, so the imports fail. I see lots of previous work on this sort of error in ACCUMULO-334 and ACCUMULO-2110.

If anyone has troubleshooting tips I'd be happy to hear them.

--
This message was sent by Atlassian JIRA
(v6.2#6252)