hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11379) DFSInputStream may infinite loop requesting block locations
Date Fri, 27 Jan 2017 20:35:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843457#comment-15843457 ]

Daryn Sharp commented on HDFS-11379:
------------------------------------

Found when Hive jobs collided.  Tasks opened ORC files and other tasks replaced them, so
when the original tasks attempted to read the footer (outside the initial fetch range) the
stream went into an infinite loop requesting locations.  The issue was difficult to isolate
because by default the stream prefetches locations for 10 blocks, so it only manifested for
multi-GB files.
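
The failure mode can be illustrated with a toy Java model.  This is a sketch only, not the
DFSInputStream code: the class name, both method names, the 128 MiB block size, and the cap
on requests are assumptions made for illustration.  The only detail taken from the report is
that locations for 10 blocks are fetched when the stream is opened.  Under the assumed block
size that window covers the first 1.25 GiB, so a read beyond it on a file that has since
shrunk never finds its block and keeps asking.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a client that caches block locations at open time. Not Hadoop code. */
class StaleLocationSketch {
    static final long BLOCK_SIZE = 128L * 1024 * 1024; // assumed block size
    static final long PREFETCH_BLOCKS = 10;            // "10 blocks of locations" per the report

    /** Hypothetical stand-in for the NameNode: start offsets of the blocks
     *  overlapping [offset, offset + length), clipped to the file's current length. */
    static List<Long> fetchLocations(long actualFileLength, long offset, long length) {
        List<Long> blockStarts = new ArrayList<>();
        long end = Math.min(actualFileLength, offset + length);
        for (long start = (offset / BLOCK_SIZE) * BLOCK_SIZE; start < end; start += BLOCK_SIZE) {
            blockStarts.add(start);
        }
        return blockStarts;
    }

    /** Counts the location requests needed to find the block holding {@code offset}.
     *  Returns -1 after {@code maxRequests} fruitless requests; a client without such
     *  a cap would spin in this loop forever. */
    static int requestsUntilLocated(long cachedLength, long actualLength,
                                    long offset, int maxRequests) {
        if (offset >= cachedLength) {
            throw new IllegalArgumentException("offset is past the cached end of file");
        }
        // At open time the file still had cachedLength bytes.
        List<Long> cached = fetchLocations(cachedLength, 0, PREFETCH_BLOCKS * BLOCK_SIZE);
        long wanted = (offset / BLOCK_SIZE) * BLOCK_SIZE;
        int requests = 0;
        while (!cached.contains(wanted)) {
            if (requests == maxRequests) {
                return -1; // the range no longer exists, so no request can satisfy it
            }
            // By now the file has actualLength bytes, which may be fewer than cachedLength.
            cached.addAll(fetchLocations(actualLength, offset, PREFETCH_BLOCKS * BLOCK_SIZE));
            requests++;
        }
        return requests;
    }

    public static void main(String[] args) {
        long twentyBlocks = 20 * BLOCK_SIZE;
        // Intact file, read past the prefetched window: one extra request.
        System.out.println(requestsUntilLocated(twentyBlocks, twentyBlocks, 15 * BLOCK_SIZE, 1000));
        // Truncated file, read inside the prefetched window: served from the stale cache.
        System.out.println(requestsUntilLocated(twentyBlocks, BLOCK_SIZE, 5 * BLOCK_SIZE, 1000));
        // Truncated file, read past the window: every request comes back empty.
        System.out.println(requestsUntilLocated(twentyBlocks, BLOCK_SIZE, 15 * BLOCK_SIZE, 1000));
    }
}
```

In this model the second case shows why small files hide the problem: a read inside the
prefetched window never triggers a new location request, so the stale length goes unnoticed.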

> DFSInputStream may infinite loop requesting block locations
> -----------------------------------------------------------
>
>                 Key: HDFS-11379
>                 URL: https://issues.apache.org/jira/browse/HDFS-11379
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>
> DFSInputStream creation caches the file size and an initial range of block locations.  If
> the file is truncated (or replaced) and the client attempts to read outside the initial
> range, the client goes into a tight infinite loop requesting locations for the nonexistent
> range.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

