accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bill Havanki (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2466) Bulk randomwalk fails with bad key
Date Thu, 13 Mar 2014 19:09:45 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933906#comment-13933906
] 

Bill Havanki commented on ACCUMULO-2466:
----------------------------------------

How could a shutdown hook be doing something when the master is just coming up?

I doubt I'm capable of doing that metadata trace stuff - I still barely understand any details
about the METADATA table, and never mind decoding WALs. :)

> Bulk randomwalk fails with bad key
> ----------------------------------
>
>                 Key: ACCUMULO-2466
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2466
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master, test
>    Affects Versions: 1.4.4
>            Reporter: Bill Havanki
>              Labels: import, randomwalk, test
>
> Running bulk randomwalk against 1.4.5-SNAPSHOT, got this in verification:
> {noformat}
> Caused by: java.lang.Exception: Bad key at r00000 cf:000 [] 1394658887772 false 1
>         at org.apache.accumulo.server.test.randomwalk.bulk.Verify.visit(Verify.java:65)
> {noformat}
> Possible reasons:
> * ACCUMULO-2110, not backported to 1.4 or 1.5
> * master agitation
> I see in the logs three internal errors from imports that failed due to the masters being
restarted. The failure timing is around 5 seconds after the masters restart. Example:
> {noformat}
> 12 14:10:17,580 [bulk.BulkMinusOne] ERROR: org.apache.accumulo.core.client.AccumuloException:
Intern
> al error processing waitForTableOperation
> org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperation
> sImpl.java:290)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperation
> sImpl.java:258)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperations
> Impl.java:947)
>         at org.apache.accumulo.server.test.randomwalk.bulk.BulkPlusOne.bulkLoadLots(BulkPlusOne.java
> :99)
>         at org.apache.accumulo.server.test.randomwalk.bulk.BulkMinusOne.runLater(BulkMinusOne.java:2
> 9)
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForTableOperation
> {noformat}
> Two BulkMinusOne and one BulkPlusOne failed, which may be why the offending row was at
value 1.
> The {{TableOperationsImpl.waitForTableOperation}} method does not catch {{TApplicationException}},
so the imports fail.
> I see lots of previous work on this sort of error in ACCUMULO-334 and ACCUMULO-2110.
If anyone has troubleshooting tips I'd be happy to hear them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message