reef-dev mailing list archives

From Rogan Carr <rogan.c...@gmail.com>
Subject Re: [REEF-1892] HDFS File Copy only uses local HDFS
Date Sun, 24 Sep 2017 14:30:54 GMT
And here is the PR for REEF-1827: https://github.com/apache/reef/pull/1331

On Sat, Sep 23, 2017 at 10:57 PM, Rogan Carr <rogan.carr@gmail.com> wrote:

> Hi All,
>
> I have opened an issue, REEF-1892 [1], because file IO to WASB is broken
> in REEF 0.17.x.
>
> In REEF-1827 [2], the URIs used to specify remote and local files were
> changed to use their `AbsolutePath` property [3].
>
> This means that a file specified as "hdfs:///my/file" becomes "/my/file",
> and the `dfs` command assumes the hdfs:// scheme.
>
> This is fine if you are using vanilla HDFS, but Azure Blob Storage uses a
> special prefix, `wasb://`, instead of `hdfs://`. Because AbsolutePath strips
> the `wasb://` prefix, the Copy() function attempts to download the file from
> the local HDFS rather than from WASB.
>
> Best,
> Rogan
>
> [1] https://issues.apache.org/jira/browse/REEF-1892
>
> [2] https://issues.apache.org/jira/browse/REEF-1827
>
> [3] The code in question
> public void Copy(Uri sourceUri, Uri destinationUri)
> {
> -    _commandRunner.Run("dfs -cp " + sourceUri + " " + destinationUri);
> +    _commandRunner.Run("dfs -cp " + sourceUri.AbsolutePath + " " + destinationUri.AbsolutePath);
> }
>
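
To illustrate the behavior described in the quoted message, here is a minimal sketch of how `System.Uri.AbsolutePath` drops the scheme and authority of a `wasb://` URI. The container and account names are placeholders, and `BuildCopyArguments` is a hypothetical helper written for this example, not REEF code; it only shows one possible way to keep the scheme by passing the full URI.

```csharp
using System;

public static class UriSchemeSketch
{
    // Hypothetical helper, not part of REEF: builds the `dfs -cp` argument
    // string from the full URIs so that the scheme and authority survive.
    public static string BuildCopyArguments(Uri sourceUri, Uri destinationUri)
    {
        return "dfs -cp " + sourceUri.AbsoluteUri + " " + destinationUri.AbsoluteUri;
    }

    public static void Main()
    {
        // Placeholder container and account names (assumptions for illustration).
        var source = new Uri("wasb://container@account.blob.core.windows.net/my/file");
        var destination = new Uri("file:///tmp/file");

        Console.WriteLine(source.Scheme);        // wasb
        Console.WriteLine(source.AbsolutePath);  // /my/file (scheme and authority are gone)

        // Full URIs keep the wasb:// prefix in the command line.
        Console.WriteLine(BuildCopyArguments(source, destination));
    }
}
```

Whether the full URI is the right argument in every deployment is an open question; the sketch only shows that the scheme is lost with `AbsolutePath` and preserved with `AbsoluteUri`.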
