hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11644) DFSStripedOutputStream should not implement Syncable
Date Wed, 12 Apr 2017 22:07:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15966723#comment-15966723 ]

Andrew Wang commented on HDFS-11644:
------------------------------------

Looking at this, it's not pretty. FileSystem returns an FSDataOutputStream, which implements
Syncable. Its implementation either does a real hflush, or just calls flush. See:

{code:title=FSDataOutputStream}
  @Override  // Syncable
  public void hflush() throws IOException {
    if (wrappedStream instanceof Syncable) {
      ((Syncable)wrappedStream).hflush();
    } else {
      wrappedStream.flush();
    }
  }
{code}

I don't understand how users can figure out if they're getting a real hflush. FSDataOutputStream
itself implements Syncable, so an {{instanceof}} check on it always succeeds and tells you nothing
about the wrapped stream. There's currently no public way of querying the wrapped stream either.
I think it was a mistake to add {{Syncable}} to FSDataOutputStream; we should have forced users
to check the stream with {{instanceof}} and cast it.
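
To make the problem concrete, here is a minimal sketch using stand-in classes rather than the real Hadoop types. {{Syncable}}, {{WrapperStream}}, {{DurableStream}} and {{looksSyncable}} are all hypothetical names defined only for this illustration; {{WrapperStream#hflush}} copies the delegation logic from the snippet above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class HflushProbe {
  /** Stand-in for Hadoop's Syncable; only hflush is modelled here. */
  interface Syncable {
    void hflush() throws IOException;
  }

  /** Stand-in for FSDataOutputStream, with the same delegation as the snippet above. */
  static class WrapperStream extends OutputStream implements Syncable {
    private final OutputStream wrappedStream;

    WrapperStream(OutputStream wrappedStream) {
      this.wrappedStream = wrappedStream;
    }

    @Override
    public void write(int b) throws IOException {
      wrappedStream.write(b);
    }

    @Override
    public void hflush() throws IOException {
      if (wrappedStream instanceof Syncable) {
        ((Syncable) wrappedStream).hflush();
      } else {
        wrappedStream.flush();
      }
    }
  }

  /** Hypothetical inner stream that really supports hflush and counts the calls. */
  static class DurableStream extends ByteArrayOutputStream implements Syncable {
    int hflushCalls = 0;

    @Override
    public void hflush() {
      hflushCalls++;
    }
  }

  /** The only check available to a caller who holds just the wrapper. */
  static boolean looksSyncable(OutputStream out) {
    return out instanceof Syncable;
  }

  public static void main(String[] args) throws IOException {
    DurableStream durable = new DurableStream();
    WrapperStream real = new WrapperStream(durable);
    WrapperStream fallback = new WrapperStream(new ByteArrayOutputStream());

    real.hflush();
    fallback.hflush();

    // Both wrappers pass the instanceof check, yet only one reached a real hflush.
    System.out.println("real looks syncable:     " + looksSyncable(real));
    System.out.println("fallback looks syncable: " + looksSyncable(fallback));
    System.out.println("real hflush calls:       " + durable.hflushCalls);
  }
}
```

From the caller's side the two wrappers are indistinguishable, which is the gap described above.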

I don't like changing DFSStripedOutputStream#hflush to simply call flush, since then HDFS
users who turn on EC will silently stop getting real hflush/hsync. The current behavior of
throwing an exception is safer.
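
A rough sketch of the two options, again with stand-in classes. The class names and the {{tryHflush}} helper are hypothetical, and {{UnsupportedOperationException}} is only an assumed example of the runtime exception; the real DFSStripedOutputStream may throw something else.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class StripedHflushChoice {
  /** Stand-in for Hadoop's Syncable; only hflush is modelled here. */
  interface Syncable {
    void hflush() throws IOException;
  }

  /** Models the current behavior: the call is rejected loudly. */
  static class ThrowingStripedStream extends ByteArrayOutputStream implements Syncable {
    @Override
    public void hflush() {
      // Assumed exception type, chosen for illustration only.
      throw new UnsupportedOperationException("hflush is not supported on this stream");
    }
  }

  /** Models the proposed change: hflush quietly degrades to a plain flush. */
  static class SilentStripedStream extends ByteArrayOutputStream implements Syncable {
    @Override
    public void hflush() throws IOException {
      flush();
    }
  }

  /** Returns true if hflush returned normally, false if the stream rejected it. */
  static boolean tryHflush(Syncable stream) throws IOException {
    try {
      stream.hflush();
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    // The caller of the throwing stream learns that hflush did not happen.
    System.out.println("throwing stream reported success: " + tryHflush(new ThrowingStripedStream()));
    // The caller of the silent stream is told everything is fine, with no hflush guarantee behind it.
    System.out.println("silent stream reported success:   " + tryHflush(new SilentStripedStream()));
  }
}
```

With the silent variant the caller has no signal at all that the hflush guarantee is gone, which is why the exception looks like the safer default.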

[~stevel@apache.org], any thoughts on this? I notice that output streams aren't covered by
the FileSystem spec. This also relates to discussions about querying which features are supported
by a FS.

> DFSStripedOutputStream should not implement Syncable
> ----------------------------------------------------
>
>                 Key: HDFS-11644
>                 URL: https://issues.apache.org/jira/browse/HDFS-11644
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Andrew Wang
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-must-do
>
> FSDataOutputStream#hsync checks if a stream implements Syncable, and if so, calls hsync. Otherwise, it just calls flush. This is used, for instance, by YARN's FileSystemTimelineWriter.
> DFSStripedOutputStream extends DFSOutputStream, which implements Syncable. However, DFSStripedOS throws a runtime exception when the Syncable methods are called.
> We should refactor the inheritance structure so DFSStripedOS does not implement Syncable.




