spark-dev mailing list archives

From jay vyas <jayunit100.apa...@gmail.com>
Subject Re: Access to hdfs FileSystem through Spark
Date Fri, 10 Apr 2015 02:43:40 GMT
Whoa! Sorry about the typos in my earlier message; I was trying to refactor
the email and it sent. Anyway, you get the idea :)

Basically, the Spark context reads Hadoop settings the same way YARN
does, so if you are already on a Hadoop cluster, it should work quite
naturally without needing to explicitly set anything at all.
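
For the original question (checking a file's size before opening it), a minimal sketch might look like the following. It assumes a spark-shell session where `sc` is already defined, reuses the URI from the question, and assumes `sc.binaryFiles` is the method meant by `sc.binaryFile`. The size threshold `maxBytes` is a hypothetical value added only for illustration.

```scala
import org.apache.hadoop.fs.Path

// Assumes a live SparkContext named `sc` (as in spark-shell).
// The URI is the one from the original question.
val path = new Path("hdfs://mynetwork.com:9000/myfile")

// Reuse the Hadoop configuration Spark has already loaded,
// rather than building a fresh `new Configuration()`.
val fs = path.getFileSystem(sc.hadoopConfiguration)

// getFileStatus throws FileNotFoundException if the path does not exist.
val sizeInBytes: Long = fs.getFileStatus(path).getLen

// Hypothetical threshold, for illustration only.
val maxBytes = 64L * 1024 * 1024

if (sizeInBytes <= maxBytes) {
  // binaryFiles yields (file name, PortableDataStream) pairs.
  val data = sc.binaryFiles(path.toString)
  println(s"Opened $path ($sizeInBytes bytes), ${data.count()} file(s)")
} else {
  println(s"Skipping $path: $sizeInBytes bytes exceeds $maxBytes")
}
```

Because the FileSystem is resolved from the path and `sc.hadoopConfiguration`, the same snippet should also work with a bare path such as "/myfile" when the cluster's default file system is already HDFS.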

On Thu, Apr 9, 2015 at 10:41 PM, jay vyas <jayunit100.apache@gmail.com>
wrote:

> If you are already on a Hadoop cluster, it should just work out of the box.
> Spark is smart enough to work on Hadoop file systems: it reads the Hadoop
> conf on an existing, normal HCFS cluster, and sc.textFile will just use
> whatever your default Hadoop fs URI is.
>
> In general, this is quite easy to test. Once Spark is set up properly,
> it should naturally load text files from HCFS using the Spark context.
>
> 1) First, put a file into your HCFS file system.
>
> hadoop fs -put /etc/passwd /tmp/passwd
>
> 2) Then just confirm spark sees it...
>
> val lines = sc.textFile("/tmp/passwd")
>
> lines.collect().foreach(println) will print this out for you.
>
> As a test of this, you can just use ASF Bigtop's Spark Vagrant recipes:
> we don't do anything special, and I found HDFS integration "just worked",
> since by default we deploy with Hadoop configuration for HDFS.
>
>
>
> On Thu, Apr 9, 2015 at 8:38 PM, Sean Owen <sowen@cloudera.com> wrote:
>
>> What you have there is how to do it although you want to use
>> sc.hadoopConfiguration IIRC.
>> On Apr 9, 2015 8:26 PM, "Ulanov, Alexander" <alexander.ulanov@hp.com>
>> wrote:
>>
>> > Hi,
>> >
>> > Is there a way to access hdfs FileSystem through Spark? For example, I
>> > need to check the file size before opening it with
>> > sc.binaryFile("hdfs://mynetwork.com:9000/myfile"). Can I do it without
>> > creating hadoop FileSystem by myself ?
>> >
>> > val fs = FileSystem.get(new URI("hdfs://mynetwork.com:9000"),
>> >   new Configuration())
>> >
>> > Best regards, Alexander
>> >
>>
>
>
>
> --
> jay vyas
>



-- 
jay vyas
