Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Tue, 26 Apr 2016 17:16:13 +0000 (UTC)
From: "Thomas Hille (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12962408.1461621721000.36887.1461690973056@Atlassian.JIRA>
In-Reply-To: <JIRA.12962408.1461621721000@Atlassian.JIRA>
References: <JIRA.12962408.1461621721000@Atlassian.JIRA>
 <JIRA.12962408.1461621721598@arcas>
Subject: [jira] [Commented] (HDFS-10327) Open files in WEBHDFS which are
 stored in folders by Spark/Mapreduce
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-10327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258488#comment-15258488 ] 

Thomas Hille commented on HDFS-10327:
-------------------------------------

Hi guys,
It looks like splitting the output into part files is a MapReduce feature rather than something Spark-specific (see https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html and look at the output of $ bin/hadoop dfs -cat /usr/joe/wordcount/output/part-00000).
So maybe it's still something for you guys?

> Open files in WEBHDFS which are stored in folders by Spark/Mapreduce
> --------------------------------------------------------------------
>
>                 Key: HDFS-10327
>                 URL: https://issues.apache.org/jira/browse/HDFS-10327
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: webhdfs
>            Reporter: Thomas Hille
>              Labels: features
>
> When Spark saves a file in HDFS, it creates a directory that contains the file split into many part files. When you read it programmatically with Spark, you can read this directory as if it were a normal file.
> If you try to read this directory-style file through WebHDFS, it returns
> {"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"Path is not a file: [...]
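
For reference, a client-side workaround could look roughly like the sketch below: list the directory with the WebHDFS LISTSTATUS operation, then fetch each part file with OPEN and concatenate the results. This is only a sketch under assumptions. The helper names (part_file_names, webhdfs_url, read_directory_as_file) are made up for illustration, the NameNode address is a placeholder, no authentication parameters are sent, and it assumes the part files are named with a "part-" prefix and that their lexicographic order is the intended order.

```python
import json
import urllib.parse
import urllib.request


def part_file_names(liststatus_json):
    """Return the sorted part-file names from a parsed LISTSTATUS response."""
    statuses = liststatus_json["FileStatuses"]["FileStatus"]
    return sorted(
        s["pathSuffix"]
        for s in statuses
        # Skip subdirectories and marker files such as _SUCCESS.
        if s["type"] == "FILE" and s["pathSuffix"].startswith("part-")
    )


def webhdfs_url(base, path, op):
    """Build a WebHDFS REST URL; `base` is a placeholder such as http://namenode:50070."""
    return f"{base}/webhdfs/v1{urllib.parse.quote(path)}?op={op}"


def read_directory_as_file(base, dir_path):
    """Concatenate all part files under dir_path (assumes an unsecured cluster)."""
    with urllib.request.urlopen(webhdfs_url(base, dir_path, "LISTSTATUS")) as resp:
        listing = json.load(resp)
    chunks = []
    for name in part_file_names(listing):
        # OPEN redirects to a DataNode; urllib follows the redirect for GET.
        url = webhdfs_url(base, f"{dir_path}/{name}", "OPEN")
        with urllib.request.urlopen(url) as resp:
            chunks.append(resp.read())
    return b"".join(chunks)
```

Calling read_directory_as_file("http://namenode:50070", "/usr/joe/wordcount/output") would then return the bytes of all part files in order, assuming that host and port match your cluster. It costs one extra round trip per part file, which is why doing this on the server side would still be preferable.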


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)