hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11711) DN should not delete the block On "Too many open files" Exception
Date Wed, 07 Jun 2017 18:11:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041353#comment-16041353 ]

Wei-Chiu Chuang commented on HDFS-11711:

[~brahmareddy] sorry, I didn't make myself clear.
To begin with, this behavior was caused by HDFS-8492, which throws
FileNotFoundException("BlockId " + blockId + " is not valid.").
My concern is that the "Too many open files" error is thrown from within the Java library,
so there's no guarantee that its message text is the same across operating systems, Java
versions, or JVM/JDK implementations.

IMHO, the more compatible approach would be to check whether the FNFE message is
"BlockId " + blockId + " is not valid.", and delete the block only when that's the case.
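
A minimal sketch of that check, not the actual patch. The class and method names (BlockDeletionCheck, invalidBlockMessage, shouldDeleteBlock) are hypothetical, and the "Too many open files" message and file path in main are only illustrative, since the real text depends on the OS and JDK:

```java
import java.io.FileNotFoundException;

class BlockDeletionCheck {
    /** Message format quoted above from HDFS-8492. */
    static String invalidBlockMessage(long blockId) {
        return "BlockId " + blockId + " is not valid.";
    }

    /**
     * Hypothetical helper: returns true only when the FNFE carries the
     * invalid-block message for this block id. Any other FNFE, including
     * one caused by file descriptor exhaustion, returns false.
     */
    static boolean shouldDeleteBlock(FileNotFoundException e, long blockId) {
        String message = e.getMessage();
        return message != null && message.contains(invalidBlockMessage(blockId));
    }

    public static void main(String[] args) {
        long blockId = 1073741825L; // arbitrary example id

        FileNotFoundException invalid =
            new FileNotFoundException(invalidBlockMessage(blockId));
        // Illustrative message and path only; not taken from a real DN log.
        FileNotFoundException tooManyFiles =
            new FileNotFoundException("/data/blk_1073741825 (Too many open files)");

        System.out.println(shouldDeleteBlock(invalid, blockId));      // delete
        System.out.println(shouldDeleteBlock(tooManyFiles, blockId)); // keep
    }
}
```

Matching on the message that HDFS itself generates keeps the check independent of whatever text the platform uses for the "Too many open files" case.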

> DN should not delete the block On "Too many open files" Exception
> -----------------------------------------------------------------
>                 Key: HDFS-11711
>                 URL: https://issues.apache.org/jira/browse/HDFS-11711
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>            Priority: Critical
>             Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>         Attachments: HDFS-11711-002.patch, HDFS-11711-003.patch, HDFS-11711-004.patch, HDFS-11711-branch-2-002.patch, HDFS-11711-branch-2-003.patch, HDFS-11711.patch
>  *Seen the following scenario in one of our customer environments*
> * While the jobclient was writing {{"job.xml"}}, there were pipeline failures and the file was written to only one DN.
> * When the mapper read {{"job.xml"}}, the DN got {{"Too many open files"}} (as the system exceeded its limit) and the block got deleted. Hence the mapper failed to read the file and the job failed.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
