beam-commits mailing list archives

From "Stephen Sisk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-2069) Remove ResourceId.getCurrentDirectory()?
Date Mon, 24 Apr 2017 23:41:04 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982123#comment-15982123
] 

Stephen Sisk commented on BEAM-2069:
------------------------------------

It may be worth moving the directory check onto the FileSystem implementation. It's not
clear that a wrapper around a string (which is what ResourceId is) is ever going to be
able to answer this question, and that is how Hadoop implements it (org.apache.hadoop...FileSystem
has an isDirectory() method).

It was pointed out to me that I could inject Hadoop's FileSystem into the ResourceId and use
isDirectory there, so there is likely a solution for Hadoop.
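
To make the idea concrete, here is a rough sketch of a resource id that delegates the
directory question to an injected file system. It deliberately does not use the real Beam or
Hadoop classes: DirectoryAwareFileSystem, InMemoryFileSystem and SketchResourceId are all
hypothetical names made up for illustration, and the sketch assumes absolute, /-separated paths.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a file system that can answer "is this a directory?". */
interface DirectoryAwareFileSystem {
    boolean isDirectory(String path);
}

/** In-memory file system used only for this illustration (not a real Hadoop class). */
final class InMemoryFileSystem implements DirectoryAwareFileSystem {
    private final Set<String> directories = new HashSet<>();

    /** Registers a directory; the trailing slash is dropped before storing. */
    void mkdir(String path) {
        directories.add(stripTrailingSlash(path));
    }

    @Override
    public boolean isDirectory(String path) {
        return directories.contains(stripTrailingSlash(path));
    }

    /** Drops one trailing slash, except for the root path "/". */
    static String stripTrailingSlash(String path) {
        return (path.length() > 1 && path.endsWith("/"))
            ? path.substring(0, path.length() - 1)
            : path;
    }
}

/** Hypothetical resource id: a wrapper around a string plus an injected file system. */
final class SketchResourceId {
    private final String path;
    private final DirectoryAwareFileSystem fs;

    SketchResourceId(String path, DirectoryAwareFileSystem fs) {
        // After stripping, the string alone no longer says whether it names a directory.
        this.path = InMemoryFileSystem.stripTrailingSlash(path);
        this.fs = fs;
    }

    /** Returns the path itself if it is a directory, otherwise its parent directory. */
    String getCurrentDirectory() {
        if (fs.isDirectory(path)) {
            return path;
        }
        int slash = path.lastIndexOf('/');
        // Assumes absolute paths: anything directly under the root has parent "/".
        return slash <= 0 ? "/" : path.substring(0, slash);
    }
}

/** Small driver showing two suffix-less paths that only the file system can tell apart. */
final class DirectoryCheckSketch {
    public static void main(String[] args) {
        InMemoryFileSystem fs = new InMemoryFileSystem();
        fs.mkdir("/data/logs/");
        System.out.println(new SketchResourceId("/data/logs", fs).getCurrentDirectory());
        System.out.println(new SketchResourceId("/data/report", fs).getCurrentDirectory());
    }
}
```

In this sketch "/data/logs" and "/data/report" look the same as strings, and only the injected
file system can tell them apart. The cost is that every such call may need a file system
lookup.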

> Remove ResourceId.getCurrentDirectory()?
> ----------------------------------------
>
>                 Key: BEAM-2069
>                 URL: https://issues.apache.org/jira/browse/BEAM-2069
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>    Affects Versions: First stable release
>            Reporter: Stephen Sisk
>            Assignee: Davor Bonaci
>              Labels: backward-incompatible
>
> Beam ResourceId currently has a getCurrentDirectory method that returns the current resource
id if it's a directory, or the parent directory if it's a file.
> To implement this you need to know whether a particular path is a directory.
> I'm trying to implement the Hadoop ResourceId implementation, and it's not clear if it's
possible. Hadoop's Paths do not end in a / if they are a directory (trailing slashes are stripped),
nor do Hadoop paths tell you if something is a directory, so it's not possible to determine if a
given path is a file that does not have a suffix, or a directory.
> It's not clear to me that all file systems can determine whether a path is a directory
and thus I don't believe it can be implemented reliably.
> The only usages of getCurrentDirectory that I could find are in tests so it's not clear
we actually need this.
> I propose that we remove this method.
> cc [~davor]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
