accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2645) tablet stuck unloading
Date Wed, 09 Apr 2014 16:04:15 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964315#comment-13964315 ]

Eric Newton commented on ACCUMULO-2645:
---------------------------------------

The unloader attempts to interrupt the scans.  The current theory is that this interrupt is
caught by the HDFS library, which indirectly causes the request to the NameNode (NN) to hang
forever.  I am writing a test for this theory.
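
To illustrate the kind of failure this theory describes, here is a minimal sketch in plain Java. It is not the HDFS or Accumulo code, and it is not the test mentioned above. The class name SwallowedInterruptDemo, the method stopsAfterInterrupt, and the latch standing in for a reply that never arrives are all hypothetical. It only shows the general pattern: when a blocking call catches an interrupt and retries, the interrupting thread has no effect and the blocked thread stays stuck.

```java
import java.util.concurrent.CountDownLatch;

public class SwallowedInterruptDemo {

    /**
     * Starts a worker that blocks, interrupts it once, and reports whether
     * the worker terminated within one second.
     *
     * @param swallow if true, the worker catches the interrupt and retries
     *                (the suspected behavior); if false, it gives up and exits
     */
    static boolean stopsAfterInterrupt(boolean swallow) {
        // Hypothetical stand-in for a remote reply that never arrives.
        CountDownLatch neverReleased = new CountDownLatch(1);
        CountDownLatch started = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            started.countDown();
            while (true) {
                try {
                    neverReleased.await();
                    return;
                } catch (InterruptedException e) {
                    if (!swallow) {
                        // Cooperative handling: restore the flag and stop.
                        Thread.currentThread().interrupt();
                        return;
                    }
                    // Swallowed: the interrupt flag is now clear, so the
                    // loop goes back to blocking as if nothing happened.
                }
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive for a stuck worker
        worker.start();

        try {
            started.await();
            worker.interrupt(); // plays the role of the unloader in this sketch
            worker.join(1000);  // wait up to one second for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        boolean stopped = !worker.isAlive();
        neverReleased.countDown(); // cleanup: release the worker if still blocked
        return stopped;
    }

    public static void main(String[] args) {
        System.out.println("swallowing worker stopped: " + stopsAfterInterrupt(true));
        System.out.println("cooperative worker stopped: " + stopsAfterInterrupt(false));
    }
}
```

In this sketch the swallowing worker is still alive after the interrupt, while the cooperative worker exits. Whether the HDFS client actually behaves this way is exactly what the test for the theory would need to establish.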



> tablet stuck unloading
> ----------------------
>
>                 Key: ACCUMULO-2645
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2645
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.4.4
>         Environment: very large production cluster, CDH3u5
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>              Labels: newbie
>             Fix For: 1.7.0
>
>
>  * master failed to balance
>  * custom balancer refused to balance while migrations were in place
>  * tablet server was not unloading the tablet
>  * tablet server was otherwise serving tablets, providing status
>  * memory dump determined that there were 21K UnloadTabletHandler objects
>  * jstack showed UnloadTabletHandler in Tablet.completeClose, line 2674
>  * the last print of the debug message "completeClose(safeState=true, completeClose=true)" occurred 9 days ago
>  * there was a query that had been running for 9 days



--
This message was sent by Atlassian JIRA
(v6.2#6252)
