hadoop-hdfs-dev mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-24) FSDataOutputStream should flush last partial CRC chunk
Date Thu, 17 Jul 2014 21:18:06 GMT

     [ https://issues.apache.org/jira/browse/HDFS-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-24.

    Resolution: Fixed

I'd be greatly surprised if this wasn't fixed by now.

> FSDataOutputStream should flush last partial CRC chunk
> ------------------------------------------------------
>                 Key: HDFS-24
>                 URL: https://issues.apache.org/jira/browse/HDFS-24
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: dhruba borthakur
> The FSDataOutputStream.flush() API is supposed to flush all data to the underlying stream.
> However, for LocalFileSystem, the flush API does not flush the last partial CRC chunk.
> One solution is described in HADOOP-2657: We should change FSOutputStream to implement
> Seekable, and have the default implementation of seek() throw an IOException, then use this
> in ChecksumFileSystem to rewind and overwrite the checksum. Then folks will only fail if they
> attempt to write more data after they've flushed on a ChecksumFileSystem that doesn't support
> seek. I don't think we will have any filesystems that both extend ChecksumFileSystem and can't
> support seek. Only LocalFileSystem currently extends ChecksumFileSystem, and it does support
> seek. So flush() shouldn't ever fail for existing FileSystems, but seek() will fail for most
> output streams (probably all except local).
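
The rewind-and-overwrite idea quoted above can be sketched roughly as follows. This is only an illustration under assumptions, not Hadoop code: ChecksumSink, SeekableMemorySink, AppendOnlySink and ChecksummedStream are hypothetical names, the sinks are in-memory stand-ins for the data and checksum files, and a 4-byte CRC32 per chunk is assumed. flush() writes a CRC for the partial chunk; the next write seeks back so the final CRC overwrites it, and a sink that keeps the default seek() fails at that point.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical sketch; none of these types are Hadoop classes. */
public class PartialChunkFlushSketch {

    /** Checksum sink whose default seek() throws, as the proposal suggests. */
    interface ChecksumSink {
        void writeInt(int v) throws IOException;
        long getPos();
        default void seek(long pos) throws IOException {
            throw new IOException("seek not supported");
        }
    }

    /** In-memory sink that supports seek, standing in for a local checksum file. */
    static class SeekableMemorySink implements ChecksumSink {
        private final ByteBuffer buf = ByteBuffer.allocate(1024);
        public void writeInt(int v) { buf.putInt(v); }
        public long getPos() { return buf.position(); }
        @Override public void seek(long pos) { buf.position((int) pos); }
        int intAt(int offset) { return buf.getInt(offset); }
    }

    /** Sink that keeps the default seek(), so rewinding fails. */
    static class AppendOnlySink implements ChecksumSink {
        private long pos = 0;
        public void writeInt(int v) { pos += 4; }
        public long getPos() { return pos; }
    }

    /** Writes data plus one CRC32 per chunk of bytesPerChecksum bytes. */
    static class ChecksummedStream {
        private final int bytesPerChecksum;
        private final ChecksumSink sums;
        private final ByteArrayOutputStream data = new ByteArrayOutputStream();
        private final CRC32 crc = new CRC32();
        private int chunkLen = 0;      // bytes in the current, unfinished chunk
        private long partialPos = -1;  // sink offset of a flushed partial CRC, or -1

        ChecksummedStream(int bytesPerChecksum, ChecksumSink sums) {
            this.bytesPerChecksum = bytesPerChecksum;
            this.sums = sums;
        }

        void write(byte[] b) throws IOException {
            if (b.length == 0) {
                return;
            }
            if (partialPos >= 0) {
                // Rewind so the next CRC overwrites the flushed partial one.
                sums.seek(partialPos); // throws on sinks without seek support
                partialPos = -1;
            }
            for (byte x : b) {
                data.write(x);
                crc.update(x);
                if (++chunkLen == bytesPerChecksum) {
                    sums.writeInt((int) crc.getValue());
                    crc.reset();
                    chunkLen = 0;
                }
            }
        }

        void flush() throws IOException {
            // Emit a CRC for the partial chunk once; repeated flushes are no-ops.
            if (chunkLen > 0 && partialPos < 0) {
                partialPos = sums.getPos();
                sums.writeInt((int) crc.getValue());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        SeekableMemorySink sums = new SeekableMemorySink();
        ChecksummedStream out = new ChecksummedStream(4, sums);
        out.write(new byte[] {1, 2, 3, 4, 5, 6});
        out.flush();                  // partial CRC for bytes 5..6 is written
        out.write(new byte[] {7, 8}); // rewinds and overwrites that CRC
        System.out.println("checksum bytes written: " + sums.getPos());
    }
}
```

In this sketch flush() itself never seeks, so it cannot fail for lack of seek support; only a write that follows a flush of a partial chunk does, which matches the failure mode described in the quoted text.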

