beam-commits mailing list archives

From "Manoj Mathai (JIRA)" <>
Subject [jira] [Commented] (BEAM-2429) Conflicting filesystems with used of HadoopFileSystem
Date Tue, 11 Jul 2017 15:14:00 GMT


Manoj Mathai commented on BEAM-2429:

Hi François

Thanks a lot for your reply

I tried that. I am able to create the pipeline with the options set.

But when I refer to an HDFS file in my program, as below:
PCollection<String> lines = dataflowPipeline.apply(TextIO.read().from("hdfs://<<host:port>>/tmp/tmp.txt"));

I am getting an error like:
org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalArgumentException: Error matching the pattern or glob hdfs://<<host:port>>/tmp/tmp.txt: status ERROR

I think I am missing something here.

Do you have a code sample (or a link) that reads an HDFS file?

Thanks and Regards
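
One thing that may be worth checking, offered as an assumption and not a confirmed diagnosis, is whether the Hadoop configuration passed through `--hdfsConfiguration` names the cluster, for example via Hadoop's `fs.defaultFS` property. The sketch below only builds the argument string in the same shape as the snippet in the quoted issue; the class `HdfsConfigArg` and the address `namenode.example.com:8020` are hypothetical placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: renders Hadoop properties as a --hdfsConfiguration argument. */
class HdfsConfigArg {

    /** Escapes backslashes and double quotes so the text is safe inside a JSON string. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a one-element JSON list of properties, the shape used in the quoted issue. */
    static String build(Map<String, String> props) {
        StringJoiner entries = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : props.entrySet()) {
            entries.add("\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"");
        }
        return "--hdfsConfiguration=[" + entries + "]";
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        // Assumption: placeholder NameNode address; replace with the real host and port.
        props.put("fs.defaultFS", "hdfs://namenode.example.com:8020");
        props.put("dfs.client.use.datanode.hostname", "true");
        System.out.println(build(props));
    }
}
```

The resulting string could be placed in the `args` array handed to `PipelineOptionsFactory.fromArgs(args)`, as in the snippet in the quoted issue. Whether this resolves the `status ERROR` above is not confirmed here.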

> Conflicting filesystems with used of HadoopFileSystem
> -----------------------------------------------------
>                 Key: BEAM-2429
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>    Affects Versions: 2.0.0
>            Reporter: François Wagner
>            Assignee: Flavio Fiszman
>             Fix For: 2.0.0
> I'm facing an issue when trying to use HadoopFileSystem in my pipeline. It looks like HadoopFileSystem
> is registering itself under the `file` scheme (,
> hence the following exception is thrown when trying to register HadoopFileSystem.
> java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: [,]
> 	at
> What is the correct way to handle `hdfs` URLs out of the box with TextIO & AvroIO?
> {code:java}
>     String[] args = new String[]{
>         "--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": \"true\"}]"};
>     HadoopFileSystemOptions options = PipelineOptionsFactory
>         .fromArgs(args)
>         .withValidation()
>         .as(HadoopFileSystemOptions.class);
>     Pipeline pipeline = Pipeline.create(options); 
> {code}
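
To illustrate the kind of check that produces the "conflicting filesystems" exception quoted above, here is a minimal sketch of a registry keyed by URI scheme that rejects a scheme claimed by more than one filesystem. It is not Beam's actual implementation; the class `SchemeRegistry` and the names `LocalFs` and `HadoopFs` are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical scheme-keyed registry; a sketch, not Beam's FileSystems class. */
class SchemeRegistry {
    // Maps each scheme (e.g. "file", "hdfs") to the filesystems that claim it.
    private final Map<String, Set<String>> byScheme = new TreeMap<>();

    /** Records that the named filesystem wants to handle the given scheme. */
    void register(String scheme, String fileSystemName) {
        byScheme.computeIfAbsent(scheme, k -> new TreeSet<>()).add(fileSystemName);
    }

    /** Throws if any scheme is claimed by more than one filesystem. */
    void verify() {
        for (Map.Entry<String, Set<String>> e : byScheme.entrySet()) {
            if (e.getValue().size() > 1) {
                throw new IllegalStateException(
                    "Scheme: [" + e.getKey() + "] has conflicting filesystems: " + e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        SchemeRegistry registry = new SchemeRegistry();
        registry.register("file", "LocalFs");   // hypothetical local filesystem
        registry.register("file", "HadoopFs");  // hypothetical second claimant of "file"
        try {
            registry.verify();
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Under this model, two registrations for `file` are enough to trigger the failure, which matches the situation the reporter describes: a second filesystem claiming a scheme that is already taken.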

