beam-commits mailing list archives

From "Stephen Sisk (JIRA)" <>
Subject [jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem
Date Thu, 20 Apr 2017 15:57:04 GMT


Stephen Sisk commented on BEAM-2005:

I don't want to derail this conversation, but I did have a couple of other concerns. Beam's
FileSystem has a copy() command, but I can't find a good analog in Hadoop's FileSystem, which
shows lots of copy to/from local files but no "copy between these two arbitrary paths".
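
If no direct analog turns up, one possible fallback is to build copy() out of the open and
create operations that a Hadoop-style file system does offer, streaming bytes from one path
to the other. The following is only a minimal sketch of that idea using the Java standard
library: StreamFs, MemFs and CopySketch are hypothetical names invented for illustration and
are not Beam or Hadoop APIs. Hadoop also ships a FileUtil.copy helper that may already cover
this case, so it is worth checking before hand-rolling anything.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class CopySketch {
    /** Hypothetical stand-in for the open/create operations of a Hadoop-style file system. */
    interface StreamFs {
        InputStream open(String path) throws IOException;
        OutputStream create(String path) throws IOException;
    }

    /** Copies src to dst by streaming bytes; the two paths may live on different file systems. */
    static long copy(StreamFs srcFs, String src, StreamFs dstFs, String dst) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = srcFs.open(src); OutputStream out = dstFs.create(dst)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    /** In-memory file system used only to exercise the sketch (an assumption, not a real backend). */
    static class MemFs implements StreamFs {
        final Map<String, byte[]> files = new HashMap<>();

        @Override
        public InputStream open(String path) throws IOException {
            byte[] data = files.get(path);
            if (data == null) {
                throw new FileNotFoundException(path);
            }
            return new ByteArrayInputStream(data);
        }

        @Override
        public OutputStream create(String path) {
            // The file becomes visible once the stream is closed.
            return new ByteArrayOutputStream() {
                @Override
                public void close() throws IOException {
                    super.close();
                    files.put(path, toByteArray());
                }
            };
        }
    }

    public static void main(String[] args) throws IOException {
        MemFs source = new MemFs();
        MemFs target = new MemFs();
        source.files.put("/in.txt", "hello".getBytes(StandardCharsets.UTF_8));
        long copied = copy(source, "/in.txt", target, "/out.txt");
        System.out.println(copied + " bytes copied");
    }
}
```

A streaming copy like this moves every byte through the client, so a backend with a native
server-side copy would likely want to override it.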

I also believe that since Beam FileSystem objects are configured via PipelineOptions, we need
to pass a Hadoop Configuration through PipelineOptions. I think that's very solvable, but
it does seem semi-complicated.
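
One way this could work, sketched under assumptions: a Hadoop Configuration is essentially a
set of string key/value pairs, so it could be flattened into a plain map that is easy to
carry in an options object, then rebuilt on the other side. The sketch below uses
java.util.Properties purely as a stand-in for Configuration so that it runs with the
standard library alone; ConfigOptionsSketch, toOptionMap and fromOptionMap are hypothetical
names, not part of Beam or Hadoop.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class ConfigOptionsSketch {
    /** Flattens a configuration-like object into a plain, easily serialized map. */
    static Map<String, String> toOptionMap(Properties conf) {
        Map<String, String> flat = new TreeMap<>();
        for (String key : conf.stringPropertyNames()) {
            flat.put(key, conf.getProperty(key));
        }
        return flat;
    }

    /** Rebuilds the configuration-like object from the flattened map. */
    static Properties fromOptionMap(Map<String, String> flat) {
        Properties conf = new Properties();
        for (Map.Entry<String, String> entry : flat.entrySet()) {
            conf.setProperty(entry.getKey(), entry.getValue());
        }
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();  // stand-in for a Hadoop Configuration
        conf.setProperty("fs.defaultFS", "hdfs://namenode:8020");  // example value only
        Map<String, String> flat = toOptionMap(conf);
        Properties rebuilt = fromOptionMap(flat);
        System.out.println(rebuilt.getProperty("fs.defaultFS"));
    }
}
```

Whether a flat map is enough depends on how the real Configuration is loaded (for example
from resource files), which this sketch does not attempt to model.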

I'm going to open subtasks so we can discuss each concern in its own thread.

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> -----------------------------------------------------------
>                 Key: BEAM-2005
>                 URL:
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-extensions
>            Reporter: Stephen Sisk
>            Assignee: Stephen Sisk
>             Fix For: First stable release
> Beam's FileSystem creates an abstraction for reading from files in many different places.
> We should add a Hadoop FileSystem implementation - that would enable us to read from any
> file system that implements FileSystem (including HDFS, Azure, S3, etc.).
> I'm investigating this now.

