hadoop-hdfs-issues mailing list archives

From "Weiwei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12328) Ozone: Purge metadata of deleted blocks after max retry times
Date Tue, 12 Sep 2017 02:28:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162393#comment-16162393 ]

Weiwei Yang commented on HDFS-12328:
------------------------------------

Hi [~yuanbo]

Thanks for updating the description. I noticed that you proposed to introduce a new SCM
sub-command, "-txid", but I am not in favor of this. The reason is that TXs are internal
notions; we don't need to expose them to the end user. When a block cannot be deleted after
the max number of retries, we consider that block *corrupted*, so at the user level I think
we need a *block*-level command in SCM. Some initial thoughts

{code}
// list all corrupted block IDs
hdfs scm -block -list --corrupted

// get as much detail about this block as possible, including where the data
// is located, so an admin can log on to the datanode to debug why the
// deletion failed
hdfs scm -block -info xxx

// delete a certain block
hdfs scm -block -delete xxx

// delete all corrupted blocks
// this will require extra confirmation from the user at the keyboard
hdfs scm -block -delete --corrupted
{code}
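
For reference, a minimal sketch of how the "-list --corrupted" case could identify candidates,
based on HDFS-12283 marking undeletable transactions with a retry count of -1. All names here
are hypothetical (the map is just a stand-in for the real deleted-block log in the SCM
metadata store):

{code}
// Hypothetical sketch: listing "corrupted" blocks, i.e. entries whose
// deletion retry count was set to -1 (per HDFS-12283) after max retries.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CorruptedBlockLister {
  // Sentinel value written when a block exceeds the max delete retries.
  private static final int CORRUPTED_MARKER = -1;

  /**
   * @param retryCountByTxid deleted-block log entries, txid -> retry count
   *                         (stand-in for the real SCM metadata store)
   * @return txids of entries considered corrupted
   */
  public static List<Long> listCorrupted(Map<Long, Integer> retryCountByTxid) {
    List<Long> corrupted = new ArrayList<>();
    for (Map.Entry<Long, Integer> e : retryCountByTxid.entrySet()) {
      if (e.getValue() == CORRUPTED_MARKER) {
        corrupted.add(e.getKey());
      }
    }
    return corrupted;
  }
}
{code}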

I have set the priority to Major because I don't think this is a super important feature
that must be addressed now (let's get this done as a post-merge task). At present, we have
an alternative: leverage SQLCli to dump the DB info for debugging. Also, as [~linyiqun]
commented, it might be good to start by exposing corrupted blocks in SCM's JMX metrics,
which is a smaller task and can help us understand how big the problem is here.
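
For the JMX idea, a minimal sketch using Hadoop's metrics2 framework; the class and metric
names are hypothetical, not an existing SCM class:

{code}
// Hypothetical sketch: exposing a corrupted-block count over JMX via
// Hadoop's metrics2 framework. Names here are illustrative only.
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

@Metrics(about = "SCM block deletion metrics", context = "dfs")
public class SCMBlockDeletionMetrics {

  // Gauge for blocks whose deletion exceeded the max retry count.
  @Metric("Number of blocks marked corrupted after max delete retries")
  private MutableGaugeLong corruptedBlockCount;

  public static SCMBlockDeletionMetrics create() {
    // Registering the source makes the metric visible through JMX.
    return DefaultMetricsSystem.instance().register(
        "SCMBlockDeletionMetrics", "SCM block deletion metrics",
        new SCMBlockDeletionMetrics());
  }

  public void incrCorruptedBlockCount() {
    corruptedBlockCount.incr();
  }
}
{code}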

Thanks

> Ozone: Purge metadata of deleted blocks after max retry times
> -------------------------------------------------------------
>
>                 Key: HDFS-12328
>                 URL: https://issues.apache.org/jira/browse/HDFS-12328
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Yuanbo Liu
>            Assignee: Yuanbo Liu
>              Labels: OzonePostMerge
>
> In HDFS-12283, we set the value of count to -1 if blocks cannot be deleted after the max
> number of retries. We need to provide APIs for admins to purge the "-1" metadata manually.
> Implement these commands:
> list the txids
> {code}
> hdfs scm -txid list -count <number> -retry <number>
> {code}
> delete the txid
> {code}
> hdfs scm -txid delete -id <txid>
> {code}


