hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11711) DN should not delete the block On "Too many open files" Exception
Date Wed, 07 Jun 2017 18:38:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041353#comment-16041353 ]

Wei-Chiu Chuang edited comment on HDFS-11711 at 6/7/17 6:37 PM:
----------------------------------------------------------------

[~brahmareddy] Sorry, I didn't make myself clear.
To begin with, this behavior was caused by HDFS-8492, which throws FileNotFoundException("BlockId " + blockId + " is not valid.").
I was just thinking that the "Too many open files" error message is produced inside the Java library, so there is no guarantee it is consistent across operating systems, Java versions, or JVM/JDK implementations.

IMHO, the more compatible approach would be to check whether the FileNotFoundException message contains "BlockId " + blockId + " is not valid.", and delete the block only when that's the case.

Edit: HDFS-3100 throws FileNotFoundException("Meta-data not found for " + block) when the block's checksum meta file is not found, so that message should be checked as well.

Or, it should just throw a new type of exception in these two cases.
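For example, a dedicated exception type could look roughly like the sketch below (the class name is made up; nothing like it exists in the code today). Catching that type, rather than matching message strings, would keep the deletion decision independent of how the underlying JVM phrases I/O errors.

{code:java}
import java.io.FileNotFoundException;

// Hypothetical exception type (the name is only a suggestion): thrown solely
// when the replica is known to be invalid or missing, so callers can delete
// the block on this type alone and never on a generic FileNotFoundException.
public class InvalidBlockFileNotFoundException extends FileNotFoundException {

  private final long blockId;

  public InvalidBlockFileNotFoundException(long blockId) {
    super("BlockId " + blockId + " is not valid.");
    this.blockId = blockId;
  }

  public long getBlockId() {
    return blockId;
  }
}
{code}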



> DN should not delete the block On "Too many open files" Exception
> -----------------------------------------------------------------
>
>                 Key: HDFS-11711
>                 URL: https://issues.apache.org/jira/browse/HDFS-11711
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>            Priority: Critical
>             Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
>         Attachments: HDFS-11711-002.patch, HDFS-11711-003.patch, HDFS-11711-004.patch, HDFS-11711-branch-2-002.patch, HDFS-11711-branch-2-003.patch, HDFS-11711.patch
>
>
>  *Seen the following scenario in one of our customer environments:* 
> * While the job client was writing {{"job.xml"}}, there were pipeline failures and the block was written to only one DN.
> * When the mapper read {{"job.xml"}}, the DN hit {{"Too many open files"}} (the system limit was exceeded) and the block got deleted. Hence the mapper failed to read it and the job failed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

