hive-dev mailing list archives

From "Shuaishuai Nie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4773) Templeton intermittently fail to commit output to file system
Date Wed, 04 Sep 2013 01:36:51 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13757365#comment-13757365
] 

Shuaishuai Nie commented on HIVE-4773:
--------------------------------------

The problem does not seem to be exclusive to ASV. According to "Hadoop: The Definitive Guide",
3rd edition, p. 75, "HDFS trades off some POSIX requirements for performance, so some operations
may behave differently than you expect them to." and "any content written to the file is not guaranteed
to be visible, even if the stream is flushed".
Not sure whether this will break YARN if it does container reuse. A safer approach is to use
"FSDataOutputStream" instead of "PrintWriter", since it implements sync(), which ensures that
data written up to that point in the file is visible to readers in HDFS.
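
As a rough local-filesystem analogy of the point above (this is not the Templeton code and
not the attached patch), the JDK-only Java sketch below shows that text written through a
"PrintWriter" stays in its buffer, invisible to a reader, until it is explicitly flushed, and
that "PrintWriter" reports write failures only through checkError(). The class and method
names are hypothetical. On HDFS the equivalent step would be a call on the
"FSDataOutputStream" itself: sync() in the Hadoop 1.x API, or hflush()/hsync() in later releases.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: local-file analogy only, no Hadoop dependency.
public class FlushVisibilityDemo {

    /**
     * Writes text through a buffered PrintWriter and returns the file size
     * seen by a reader before and after an explicit flush + sync.
     */
    static long[] visibleBytesBeforeAndAfterFlush(Path file, String text) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile());
             PrintWriter writer = new PrintWriter(
                     new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8)))) {

            writer.print(text);
            // Short text is still sitting in the writer's buffer at this point.
            long before = Files.size(file);

            // Push buffered characters down to the underlying stream.
            writer.flush();
            // PrintWriter swallows IOExceptions; checkError() is the only way to see them.
            if (writer.checkError()) {
                throw new IOException("PrintWriter reported a write error");
            }
            // Ask the OS to persist the bytes (local analogue of sync()/hsync() on HDFS).
            out.getFD().sync();

            long after = Files.size(file);
            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("flush-demo", ".txt");
        try {
            long[] sizes = visibleBytesBeforeAndAfterFlush(tmp, "job stdout line\n");
            System.out.println("visible before flush: " + sizes[0] + " bytes");
            System.out.println("visible after flush:  " + sizes[1] + " bytes");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

If the process exits between the print() and the flush(), the buffered bytes are lost, which
matches the empty stdout/stderr symptom described in this issue.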
                
> Templeton intermittently fail to commit output to file system
> -------------------------------------------------------------
>
>                 Key: HIVE-4773
>                 URL: https://issues.apache.org/jira/browse/HIVE-4773
>             Project: Hive
>          Issue Type: Bug
>          Components: WebHCat
>            Reporter: Shuaishuai Nie
>            Assignee: Shuaishuai Nie
>         Attachments: HIVE-4773.1.patch
>
>
> With ASV as a default FS, we saw instances where output is not fully flushed to storage
before the Templeton controller process exits. This results in stdout and stderr being empty
even though the job completed successfully.
