hadoop-mapreduce-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: Running YARN on top of legacy HDFS (i.e. 0.20)
Date Tue, 06 Dec 2011 16:51:34 GMT
Avery, 

 They aren't 'API changes'. HDFS just has a new set of APIs in hadoop-0.23 (the FileContext
APIs). Both the old (FileSystem) and new (FileContext) APIs are supported in hadoop-0.23.

 We have used the new HDFS APIs in YARN in some places.

hth,
Arun

On Dec 5, 2011, at 10:59 PM, Avery Ching wrote:

> Thank you for the response, that's what I thought as well =).  I spent the day trying
> to port the required 0.23 APIs to 0.20 HDFS.  There have been a lot of API changes!
> 
> Avery
> 
> On 12/5/11 9:14 PM, Mahadev Konar wrote:
>> Avery,
>>  Currently we have only tested 0.23 MRv2 with 0.23 HDFS. I might be
>> wrong, but looking at the HDFS APIs it doesn't look like it would
>> be a lot of work to get it working with the 0.20 APIs. We had been
>> using the FileContext APIs initially but have transitioned back to the
>> old APIs.
>> 
>> Hope that helps.
>> 
>> mahadev
>> 
>> On Mon, Dec 5, 2011 at 4:01 PM, Avery Ching <aching@apache.org> wrote:
>>> Hi,
>>> 
>>> I've been playing with 0.23.0, really nice stuff!  I was able to set up a
>>> small test cluster (40 nodes) and launch the example jobs.  I was also able
>>> to recompile old Hadoop programs with the new jars and start up those
>>> programs as well.  My question is the following:
>>> 
>>> We have an HDFS instance based on 0.20 that I would like to hook up to YARN.
>>>  This appears to be a bit of work.  Launching the jobs gives me the
>>> following error:
>>> 
>>> 2011-12-05 15:48:05,023 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) -
>>> Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
>>> 2011-12-05 15:48:05,040 INFO  mapred.ResourceMgrDelegate
>>> (ResourceMgrDelegate.java:<init>(95)) - Connecting to ResourceManager at
>>> {removed}.{xxx}/{removed}:50177
>>> 2011-12-05 15:48:05,041 INFO  ipc.HadoopYarnRPC
>>> (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy
>>> for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
>>> 2011-12-05 15:48:05,121 INFO  mapred.ResourceMgrDelegate
>>> (ResourceMgrDelegate.java:<init>(99)) - Connected to ResourceManager at
>>> {removed}.{xxx}/{removed}:50177
>>> 2011-12-05 15:48:05,133 INFO  mapreduce.Cluster
>>> (Cluster.java:initialize(116)) - Failed to use
>>> org.apache.hadoop.mapred.YarnClientProtocolProvider due to error:
>>> java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs
>>> Exception in thread "main" java.io.IOException: Cannot initialize Cluster.
>>> Please check your configuration for mapreduce.framework.name and the
>>> correspond server addresses.
>>>    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:123)
>>>    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:85)
>>>    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:78)
>>>    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1129)
>>>    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:1125)
>>>    at java.security.AccessController.doPrivileged(Native Method)
>>>    at javax.security.auth.Subject.doAs(Subject.java:396)
>>>    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
>>>    at org.apache.hadoop.mapreduce.Job.connect(Job.java:1124)
>>>    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1153)
>>>    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
>>>    at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:560)
>>>    at org.apache.giraph.benchmark.PageRankBenchmark.run(PageRankBenchmark.java:193)
>>>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
>>>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
>>>    at org.apache.giraph.benchmark.PageRankBenchmark.main(PageRankBenchmark.java:201)
>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>    at java.lang.reflect.Method.invoke(Method.java:597)
>>>    at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
>>> 
>>> After doing a little digging, it appears that YarnClientProtocolProvider
>>> creates a YARNRunner that uses org.apache.hadoop.fs.Hdfs, a class that is
>>> not available in older versions of HDFS.
>>> 
>>> What versions of HDFS are currently supported and what HDFS versions are
>>> planned for support?  It would be great to be able to run YARN on legacy
>>> HDFS installations.
>>> 
>>> Thanks,
>>> 
>>> Avery
> 
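
The ClassNotFoundException in the quoted trace can be checked directly by asking the class loader whether org.apache.hadoop.fs.Hdfs is visible on the job client's classpath, alongside the FileSystem and FileContext entry points discussed in this thread. The following is only a sketch: the class name HdfsClassProbe and the helper isClassAvailable are hypothetical and not part of Hadoop, and it assumes FileSystem and FileContext live in the org.apache.hadoop.fs package like the Hdfs class named in the log.

```java
public class HdfsClassProbe {

    /**
     * Returns true if the named class can be located by this class's loader.
     * Hypothetical helper for illustration; not a Hadoop API.
     */
    public static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, HdfsClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies cannot be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.hadoop.fs.FileSystem",  // old API
            "org.apache.hadoop.fs.FileContext", // new API in 0.23
            "org.apache.hadoop.fs.Hdfs"         // class reported missing in the trace
        };
        for (String name : names) {
            String status = isClassAvailable(name) ? "found" : "MISSING";
            System.out.println(name + " -> " + status);
        }
    }
}
```

Run it with the same classpath the job client uses. If org.apache.hadoop.fs.Hdfs is reported as missing, the client is likely picking up 0.20 HDFS jars rather than the 0.23 ones.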

