accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-4012) FATE lock-up
Date Wed, 30 Sep 2015 00:25:04 GMT


Eric Newton commented on ACCUMULO-4012:

The original issue was a race condition on the child node, which was missing because it
was just about to be written.  The error, by contrast, is a race condition on the parent
node, which is being cleaned up when we go to look at it.
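
To make the parent-node race concrete, here is a minimal sketch, not the actual Accumulo patch. It assumes a hypothetical in-memory store (`FakeStore`) standing in for a ZooKeeper-like tree, and a hypothetical helper (`list_children_tolerant`) that treats a parent removed between "decide to look" and "look" as having no children rather than as a failure. All names here are illustrative assumptions.

```python
class NoNodeError(KeyError):
    """Raised when a path does not exist (stand-in for a 'no node' error)."""


class FakeStore:
    """Tiny in-memory tree of paths; hypothetical, for illustration only."""

    def __init__(self):
        # Maps each existing path to the set of its child names.
        self._children = {"/": set()}

    def create(self, path):
        parent, _, name = path.rpartition("/")
        parent = parent or "/"
        if parent not in self._children:
            raise NoNodeError(parent)
        self._children[parent].add(name)
        self._children.setdefault(path, set())

    def delete(self, path):
        if path not in self._children:
            raise NoNodeError(path)
        if self._children[path]:
            raise ValueError("node has children: " + path)
        parent, _, name = path.rpartition("/")
        self._children[parent or "/"].discard(name)
        del self._children[path]

    def get_children(self, path):
        if path not in self._children:
            raise NoNodeError(path)
        return sorted(self._children[path])


def list_children_tolerant(store, path):
    """Return the children of path, or [] if the parent vanished before the read."""
    try:
        return store.get_children(path)
    except NoNodeError:
        # Assumption: a parent that was cleaned up concurrently simply has
        # nothing left to inspect, so report "no children" instead of failing.
        return []
```

A reader using the tolerant helper keeps working whether the cleanup happens before or after the lookup; a reader calling `get_children` directly fails if the cleanup wins the race.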

> FATE lock-up
> ------------
>                 Key: ACCUMULO-4012
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master, tserver
>    Affects Versions: 1.5.3, 1.5.4, 1.6.0, 1.6.2, 1.6.3, 1.7.0
>         Environment: large production cluster
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>             Fix For: 1.6.5, 1.7.1, 1.8.0
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> On a large production cluster, some periodic data processing hangs on FATE transactions.
> The basic operation is to bulk load the results of a map-reduce job into a temporary table,
> which is then later deleted. Increasing the number of FATE threads has not improved the situation.
>
> The details are not clear, and unfortunately this system is not online, so I cannot reproduce
> the logs easily, but they would be huge anyhow.

This message was sent by Atlassian JIRA
