hadoop-yarn-dev mailing list archives

From Stephen Watt <sw...@redhat.com>
Subject 3rd Party Hadoop FileSystems failing with UnsupportedFileSystemException
Date Fri, 21 Jun 2013 01:08:18 GMT
Hi Folks

I'm working on the Hadoop FileSystem validation workstream (https://wiki.apache.org/hadoop/HCFS/Progress)
over at Hadoop Common. To do that, we're building a library of Hadoop FileSystem tests that
will run against FileSystems configured within Hadoop 2.0. I have YARN working on HDFS and
LocalFS; next, I'm trying to get YARN running on top of GlusterFS using the GlusterFS Hadoop
FileSystem plugin. The plugin works just fine on Hadoop 1.x. 

When I start the JobHistoryServer it fails with an UnsupportedFileSystemException (full stack
trace below). I did a bit of googling and ran into Karthik over at the QFS community (https://groups.google.com/forum/#!topic/qfs-devel/KF3AAFheNq8)
who had the same issue and has also been unsuccessful at getting this working. I've provided
my core-site file below. The glusterfs plugin jar is copied into share/hadoop/common/lib/,
share/hadoop/mapreduce/lib and share/hadoop/yarn/lib so I don't think this is a classpath
issue. Perhaps the exception is a result of misconfiguration somewhere? 

-- Core Site --

<configuration>

 <property>
  <name>fs.defaultFS</name>
  <value>glusterfs://amb-1:9000</value>
 </property>

 <property>
  <name>fs.default.name</name>
  <value>glusterfs://amb-1:9000</value>
 </property>

 <property>
  <name>fs.glusterfs.server</name>
  <value>amb-1</value>
 </property>

 <property>
  <name>fs.glusterfs.impl</name>
  <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
 </property>

</configuration>
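
[Editorial note: the stack trace below fails inside FileContext, not FileSystem. In Hadoop 2, FileContext resolves a scheme through the `fs.AbstractFileSystem.<scheme>.impl` key, while `fs.glusterfs.impl` only registers the older FileSystem API, which is why the plugin works on 1.x but the JobHistoryServer fails here. If the plugin ships (or you write) an AbstractFileSystem binding, the missing entry would look roughly like this; the class name below is a placeholder, not something the 1.x plugin is known to provide:]

```xml
 <property>
  <name>fs.AbstractFileSystem.glusterfs.impl</name>
  <!-- Placeholder class name: requires an AbstractFileSystem
       implementation (e.g. a DelegateToFileSystem subclass) -->
  <value>org.apache.hadoop.fs.glusterfs.GlusterFs</value>
 </property>
```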


-- Stack Trace -- 

STARTUP_MSG:   build = git://pico-2-centos-6-3--01.hortonworks.com/home/jenkins/workspace/BIGTOP-BigWheelAplha-2-HDP-RPM-SYNC-REPO/label/centos6-3/build/hadoop/rpm/BUILD/hadoop-2.0.3.22-alpha-src/hadoop-common-project/hadoop-common
-r bdb84648f423eb2b7af5cb97c7192193a5a57956; compiled by 'jenkins' on Fri Mar 15 02:03:54
PDT 2013
STARTUP_MSG:   java = 1.6.0_43
************************************************************/
2013-06-08 05:46:23,796 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: JobHistory Init
2013-06-08 05:46:24,015 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:root (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem
for scheme: glusterfs
2013-06-08 05:46:24,015 FATAL org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting
JobHistoryServer
org.apache.hadoop.yarn.YarnException: Error creating done directory: [null]
	at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:424)
	at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:87)
	at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
	at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:87)
	at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:145)
Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for
scheme: glusterfs
	at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:146)
	at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:234)
	at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:342)
	at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:339)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
	at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:339)
	at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:453)
	at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:475)
	at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:417)
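
[Editorial note: a minimal AbstractFileSystem binding can usually be written as a DelegateToFileSystem subclass that wraps the plugin's existing FileSystem implementation. This is only a sketch against the Hadoop 2 API, not code the plugin is known to ship: `GlusterFs` is a hypothetical class name, and it assumes `GlusterFileSystem` has a no-arg constructor. AbstractFileSystem.createFileSystem instantiates the class reflectively via a (URI, Configuration) constructor, which is what the subclass provides:]

```java
package org.apache.hadoop.fs.glusterfs;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.DelegateToFileSystem;

/**
 * Hypothetical AbstractFileSystem binding for the glusterfs:// scheme,
 * so that FileContext-based code (e.g. the JobHistoryServer) can resolve
 * fs.AbstractFileSystem.glusterfs.impl. DelegateToFileSystem adapts an
 * existing FileSystem implementation to the AbstractFileSystem API.
 */
public class GlusterFs extends DelegateToFileSystem {
  GlusterFs(URI theUri, Configuration conf)
      throws IOException, URISyntaxException {
    // "glusterfs" is the supported scheme; the final flag controls
    // whether an authority (host:port) is required in the URI.
    super(theUri, new GlusterFileSystem(), conf, "glusterfs", false);
  }
}
```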


----- Original Message -----
From: "Vinod Kumar Vavilapalli" <vinodkv@hortonworks.com>
To: yarn-dev@hadoop.apache.org
Sent: Thursday, June 20, 2013 6:32:42 PM
Subject: Re: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem


Please let us know your final results. Interesting to see YARN+MR directly working on local-file-system.

Thanks,
+Vinod

On Jun 20, 2013, at 2:27 PM, Stephen Watt wrote:

> I resolved this. The issue was that I was using relative paths (e.g. "teragen 1000 data/in-dir")
as the params for TeraGen and TeraSort. When I changed them to absolute paths (e.g. "teragen
1000 /data/in-dir"), it works.
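
[Editorial note: the failure mode can be illustrated without Hadoop at all. On the local filesystem a relative path is resolved against whatever working directory the resolving process happens to have, and every YARN container runs in its own private appcache directory, so a relative path that the client wrote to is not where a container looks. A JDK-only sketch; all paths below are made up for illustration:]

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RelativePathDemo {

    // Loosely mimics how a local filesystem resolves a path: relative
    // paths are resolved against the process's working directory.
    public static Path qualify(String workingDir, String path) {
        Path p = Paths.get(path);
        return p.isAbsolute() ? p : Paths.get(workingDir).resolve(p);
    }

    public static void main(String[] args) {
        // The client submits the job from its own working directory...
        System.out.println("client sees:    "
            + qualify("/home/user", "data/out-dir/_partition.lst"));
        // ...but each YARN container runs in a private appcache directory,
        // so the same relative path points somewhere else entirely.
        System.out.println("container sees: "
            + qualify("/opt/hadoop/nm-local-dir/usercache/root/appcache/app_1/container_1",
                      "data/out-dir/_partition.lst"));
        // An absolute path resolves identically everywhere.
        System.out.println("absolute:       " + qualify("/anywhere", "/data/in-dir"));
    }
}
```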
> 
> ----- Original Message -----
> From: "Stephen Watt" <swatt@redhat.com>
> To: yarn-dev@hadoop.apache.org
> Sent: Thursday, June 20, 2013 12:25:17 PM
> Subject: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
> 
> Hi Folks
> 
> I'm running into FileNotFoundExceptions when using Pseudo Distributed Single Node
YARN on the Local FileSystem. I'd greatly appreciate any insights/solutions.
> 
> To level set, I'm using RHEL 6.2 and I've successfully set up single node pseudo-distributed
YARN on HDFS 2.0 using the HDP 2.0.2 Alpha Release (tarball extracted to /opt). All the processes
were started and the jobs submitted as root. I ran some smoke tests with TeraGen and TeraSort
and they worked great.
> 
> The next step was to keep YARN in pseudo-distributed mode, stop HDFS, and change the
Hadoop FileSystem from HDFS to the Local FileSystem. I stopped all the daemons, changed the
core-site.xml to use the Local FileSystem as shown below, and then restarted the resourcemanager,
nodemanager and historyserver. Still running as root, everything started just fine. I ran
TeraGen (params: 1000 data/in-dir) and it worked fine. I then ran TeraSort (params: data/in-dir
data/out-dir) and the job failed with a FileNotFoundException. I've provided my core-site
and mapred-site below.
> 
> -- core-site.xml --
> 
> <configuration>
> 
> <property>
>   <name>fs.default.name</name>
>    <value>file:///</value>
> </property>
> 
> </configuration>
> 
> -- mapred-site.xml --
> 
> <configuration>
> 
>   <property>
>      <name>mapreduce.framework.name</name>
>      <value>yarn</value>
>   </property>
> 
> </configuration>
> 
> -- Stack Trace Exception -- 
> 
> 2013-06-18 23:06:40,876 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver:
Resolved yarn-1 to /default-rack
> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator:
Assigned container container_1371596024885_0003_01_000002 to attempt_1371596024885_0003_m_000000_0
> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver:
Resolved yarn-1 to /default-rack
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator:
Assigned container container_1371596024885_0003_01_000003 to attempt_1371596024885_0003_m_000001_0
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator:
Recalculating schedule, headroom=4096
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator:
Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator:
After Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0
CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:2
> 2013-06-18 23:06:40,896 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
The job-jar file on the remote FS is file:///tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.jar
> 2013-06-18 23:06:40,901 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.xml
> 2013-06-18 23:06:40,902 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher:
Error in dispatcher thread
> org.apache.hadoop.yarn.YarnException: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst
does not exist
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:723)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:771)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1352)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1310)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1018)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:142)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1116)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1108)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
> 	at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst
does not exist
> 	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
> 	at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:697)
> 	at org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:144)
> 	at org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:417)
> 	at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:365)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:686)
> 	... 14 more
> 2013-06-18 23:06:40,906 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher:
Exiting, bbye..
