accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3273) majc holding up tablet unloads?
Date Mon, 03 Nov 2014 16:21:33 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194684#comment-14194684 ]

Josh Elser commented on ACCUMULO-3273:
--------------------------------------

bq. I don't know what the behavior of the datanode is when it runs into one of these read-only,
possibly broken drives.

I know that datanodes are at least somewhat aware of "volume" failures (a failure of one of
the paths configured for their data), but I'm not familiar with exactly how a failure is
defined. I'm not sure what your configuration looks like, but you could verify that {{dfs.datanode.failed.volumes.tolerated}}
in hdfs-site.xml is set to 0 (the default, under which any volume failure causes the datanode
to shut down).

I assume that the unload code is just interrupting the MajC thread? Is it possible that, while
we're blocked in a call to HDFS, the HDFS client is eating our interrupt?
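
To illustrate what "eating our interrupt" would look like, here is a minimal Java sketch. It is not Accumulo or HDFS code: the class and method names are hypothetical, and {{Thread.sleep}} stands in for whatever blocking call the compaction thread might be sitting in. It only shows the general mechanism, assuming the lower layer catches {{InterruptedException}} without restoring the flag.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch; none of these names come from Accumulo or HDFS.
class SwallowedInterruptSketch {

    // Stand-in for a blocking call in a lower layer that catches
    // InterruptedException and does not restore the interrupt flag.
    static void swallowingCall() {
        try {
            Thread.sleep(10_000);
        } catch (InterruptedException e) {
            // The flag was cleared when the exception was thrown;
            // nothing restores it, so the caller never learns of it.
        }
    }

    // Same blocking call, but it re-asserts the interrupt for its caller.
    static void restoringCall() {
        try {
            Thread.sleep(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs the call on a worker thread, interrupts the worker, and reports
    // whether the worker still saw its interrupt flag after the call returned.
    static boolean workerSeesInterrupt(Runnable blockingCall) {
        CountDownLatch started = new CountDownLatch(1);
        boolean[] seen = new boolean[1];
        Thread worker = new Thread(() -> {
            started.countDown();
            blockingCall.run();
            seen[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        try {
            started.await();     // make sure the worker is running
            worker.interrupt();  // analogous to the unload asking MajC to stop
            worker.join();       // join gives us a safe read of seen[0]
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller interrupted while waiting", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("swallowing call, worker saw interrupt: "
            + workerSeesInterrupt(SwallowedInterruptSketch::swallowingCall));
        System.out.println("restoring call, worker saw interrupt: "
            + workerSeesInterrupt(SwallowedInterruptSketch::restoringCall));
    }
}
```

With the swallowing variant the worker reports {{false}}, so a loop that polls {{isInterrupted()}} between blocking calls would keep running. With the restoring variant it reports {{true}}. Whether anything like this happens in the real code path is exactly the open question.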

> majc holding up tablet unloads?
> -------------------------------
>
>                 Key: ACCUMULO-3273
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3273
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.6.0, 1.6.1
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Minor
>             Fix For: 1.6.2, 1.7.0
>
>
> While testing ACCUMULO-3263 on a large cluster, the table being randomized would not
> go offline.  Each of these servers was performing a major compaction of the tablets to be
> offlined.  I thought that taking a tablet offline would abort the majc.  Need an IT to verify
> this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
