hadoop-mapreduce-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: Running YARN on top of legacy HDFS (i.e. 0.20)
Date Fri, 09 Dec 2011 23:15:23 GMT
I assume you have security switched off.

What issues are you running into?

On Dec 8, 2011, at 1:30 PM, Avery Ching wrote:

> I was able to convert FileContext to FileSystem and related methods
> fairly straightforwardly, but am running into issues dealing with
> security incompatibilities (i.e. UserGroupInformation, etc.).  Yuck.
> 
> Avery
> 
> On 12/6/11 3:50 PM, Arun C Murthy wrote:
>> Avery,
>> 
>> If you could take a look at what it would take, I'd be grateful. I'm
>> hoping it isn't very much effort.
>> 
>> thanks,
>> Arun
>> 
>> On Dec 6, 2011, at 10:05 AM, Avery Ching wrote:
>> 
>>> I think it would be nice if YARN could work on existing older HDFS
>>> instances; a lot of folks will be slow to upgrade HDFS with all their
>>> important data on it.  I could also go that route I guess.
>>> 
>>> Avery
>>> 
>>> On 12/6/11 8:51 AM, Arun C Murthy wrote:
>>>> Avery,
>>>> 
>>>>  They aren't 'api changes'. HDFS just has a new set of apis in
>>>> hadoop-0.23 (aka FileContext apis). Both the old (FileSystem apis)
>>>> and new are supported in hadoop-0.23.
>>>> 
>>>>  We have used the new HDFS apis in YARN in some places.
>>>> 
>>>> hth,
>>>> Arun
>>>> 
>>>> On Dec 5, 2011, at 10:59 PM, Avery Ching wrote:
>>>> 
>>>>> Thank you for the response, that's what I thought as well =).  I
>>>>> spent the day trying to port the required 0.23 APIs to 0.20 HDFS.
>>>>> There have been a lot of API changes!
>>>>> 
>>>>> Avery
>>>>> 
>>>>> On 12/5/11 9:14 PM, Mahadev Konar wrote:
>>>>>> Avery,
>>>>>>  Currently we have only tested 0.23 MRv2 with 0.23 HDFS. I might be
>>>>>> wrong, but looking at the HDFS APIs it doesn't look like it would
>>>>>> be a lot of work to get it working with the 0.20 APIs. We had been
>>>>>> using the FileContext APIs initially but have transitioned back to
>>>>>> the old APIs.
>>>>>> 
>>>>>> Hope that helps.
>>>>>> 
>>>>>> mahadev
>>>>>> 
>>>>>> On Mon, Dec 5, 2011 at 4:01 PM, Avery Ching<aching@apache.org> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I've been playing with 0.23.0, really nice stuff!  I was able to
>>>>>>> set up a small test cluster (40 nodes) and launch the example jobs.
>>>>>>> I was also able to recompile old Hadoop programs with the new jars
>>>>>>> and start up those programs as well.  My question is the following:
>>>>>>> 
>>>>>>> We have an HDFS instance based on 0.20 that I would like to hook up
>>>>>>> to YARN.  This appears to be a bit of work.  Launching the jobs
>>>>>>> gives me the following error:
>>>>>>> 
>>>>>>> 2011-12-05 15:48:05,023 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) - Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
>>>>>>> 2011-12-05 15:48:05,040 INFO  mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:<init>(95)) - Connecting to ResourceManager at {removed}.{xxx}/{removed}:50177
>>>>>>> 2011-12-05 15:48:05,041 INFO  ipc.HadoopYarnRPC (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
>>>>>>> 2011-12-05 15:48:05,121 INFO  mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:<init>(99)) - Connected to ResourceManager at {removed}.{xxx}/{removed}:50177
>>>>>>> 2011-12-05 15:48:05,133 INFO  mapreduce.Cluster (Cluster.java:initialize(116)) - Failed to use org.apache.hadoop.mapred.YarnClientProtocolProvider due to error: java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs
>>>>>>> Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
>>>>>>>    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:123)
>>>>>>>    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:85)
>>>>>>>    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:78)
>>>>>>>    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1129)
>>>>>>>    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1125)
>>>>>>>    at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>    at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>>>    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
>>>>>>>    at org.apache.hadoop.mapreduce.Job.connect(Job.java:1124)
>>>>>>>    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1153)
>>>>>>>    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
>>>>>>>    at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:560)
>>>>>>>    at org.apache.giraph.benchmark.PageRankBenchmark.run(PageRankBenchmark.java:193)
>>>>>>>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
>>>>>>>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
>>>>>>>    at org.apache.giraph.benchmark.PageRankBenchmark.main(PageRankBenchmark.java:201)
>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>    at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>    at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
>>>>>>> 
>>>>>>> After doing a little digging it appears that
>>>>>>> YarnClientProtocolProvider creates a YARNRunner that uses
>>>>>>> org.apache.hadoop.fs.Hdfs, a class that is not available in older
>>>>>>> versions of HDFS.
>>>>>>> 
>>>>>>> What versions of HDFS are currently supported and what HDFS
>>>>>>> versions are planned for support?  It would be great to be able to
>>>>>>> run YARN on legacy HDFS installations.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Avery
> 
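
The failure reported at the bottom of the thread is a ClassNotFoundException for org.apache.hadoop.fs.Hdfs, which the 0.20 jars do not provide. A minimal sketch, assuming nothing beyond plain Java reflection, of how one might check which filesystem API classes are visible on a client's classpath before submitting a job. The class HdfsApiProbe and its methods are hypothetical helpers written for illustration and are not part of Hadoop; the three probed class names are the ones discussed in the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports which classes can be
// located on the current classpath.
final class HdfsApiProbe {
    private HdfsApiProbe() {}

    /** Returns true if the named class can be located without initializing it. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, HdfsApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Absent, or present but unloadable (e.g. a missing dependency).
            return false;
        }
    }

    /** Returns the subset of the given class names that cannot be located. */
    static List<String> missingClasses(String... classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            if (!isClassAvailable(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingClasses(
                "org.apache.hadoop.fs.FileSystem",   // old API
                "org.apache.hadoop.fs.FileContext",  // new API in 0.23
                "org.apache.hadoop.fs.Hdfs");        // class named in the error above
        if (missing.isEmpty()) {
            System.out.println("All probed filesystem classes are present.");
        } else {
            System.out.println("Missing from classpath: " + missing);
        }
    }
}
```

Run with the same classpath the job client uses. If org.apache.hadoop.fs.Hdfs is reported missing, the client is presumably picking up older HDFS jars, which would match the error in the original report.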

