accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4012) FATE lock-up
Date Wed, 30 Sep 2015 00:25:04 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936142#comment-14936142 ]

Eric Newton commented on ACCUMULO-4012:
---------------------------------------

The original issue was a race condition on the child node, which was missing because it
was just about to be written.  The error now is a different race condition: the parent
node is being cleaned up when we go to look at it.
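
As a rough illustration of the two races described above, here is a minimal Java sketch.
It is not the patch for ACCUMULO-4012 and does not use the ZooKeeper API: the in-memory
map, the NoNodeException class, the paths, and the method names are all hypothetical
stand-ins. It only shows one common way to cope with this kind of race, which is to treat
a node that vanished (or has not been written yet) as "nothing to read" instead of
failing.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: an in-memory stand-in for a hierarchical node store.
class NodeRaceSketch {
    // path -> data; another thread may add or remove entries at any time
    static final Map<String, String> store = new ConcurrentHashMap<>();

    // Hypothetical error raised when a path does not exist
    static class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("no node: " + path);
        }
    }

    // Returns the data at a path, or throws if the node is absent
    static String getData(String path) throws NoNodeException {
        String data = store.get(path);
        if (data == null) {
            throw new NoNodeException(path);
        }
        return data;
    }

    // Looks at the parent, then the child. Either lookup can lose a race:
    // the parent may be cleaned up, or the child may not be written yet.
    // Both cases are reported as an empty result rather than an error.
    static Optional<String> readChildTolerant(String parent, String child) {
        try {
            getData(parent);
            return Optional.of(getData(parent + "/" + child));
        } catch (NoNodeException e) {
            return Optional.empty();
        }
    }
}
```

A caller would check the returned Optional and retry or skip the entry when it is empty.
Whether retrying or skipping is correct depends on the operation, which this sketch does
not try to decide.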


> FATE lock-up
> ------------
>
>                 Key: ACCUMULO-4012
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4012
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master, tserver
>    Affects Versions: 1.5.3, 1.5.4, 1.6.0, 1.6.2, 1.6.3, 1.7.0
>         Environment: large production cluster
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>             Fix For: 1.6.5, 1.7.1, 1.8.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> On a large production cluster, some periodic data processing hangs on FATE transactions.
> The basic operation is to bulk load the results of a map-reduce job into a temporary table,
> which is then later deleted. Increasing the number of FATE threads has not improved the situation.
> The details are not clear, and unfortunately this system is not online, so I cannot reproduce
> the logs easily, but they would be huge anyhow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
