hadoop-yarn-dev mailing list archives

From: Harsh J <ha...@cloudera.com>
Subject: Re: 3rd Party Hadoop FileSystems failing with UnsupportedFileSystemException
Date: Fri, 21 Jun 2013 05:47:14 GMT
YARN uses the FileContext APIs in its code, which requires your FS
implementation to also provide one (i.e. a class inheriting
AbstractFileSystem).
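
Concretely, the common pattern is a thin shim class extending
DelegateToFileSystem (which adapts an existing FileSystem so that
FileContext-based code can use it), plus a registration key in
core-site.xml. A minimal sketch follows; the class name GlusterFs is
hypothetical, and it assumes the plugin's GlusterFileSystem has a
no-arg constructor:

  package org.apache.hadoop.fs.glusterfs;

  import java.io.IOException;
  import java.net.URI;
  import java.net.URISyntaxException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.DelegateToFileSystem;

  /** Hypothetical FileContext binding for the glusterfs:// scheme. */
  public class GlusterFs extends DelegateToFileSystem {
    // AbstractFileSystem instantiates bindings reflectively through a
    // (URI, Configuration) constructor.
    public GlusterFs(final URI theUri, final Configuration conf)
        throws IOException, URISyntaxException {
      // Delegate every AbstractFileSystem call to the existing
      // FileSystem-based plugin. The last argument says whether an
      // authority (host:port) is mandatory in the URI; flip it to
      // true if the plugin requires one.
      super(theUri, new GlusterFileSystem(), conf, "glusterfs", false);
    }
  }

FileContext resolves schemes through the fs.AbstractFileSystem.<scheme>.impl
configuration key (which is exactly what "No AbstractFileSystem for scheme:
glusterfs" is complaining about), so core-site.xml also needs:

  <property>
    <name>fs.AbstractFileSystem.glusterfs.impl</name>
    <value>org.apache.hadoop.fs.glusterfs.GlusterFs</value>
  </property>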

On Fri, Jun 21, 2013 at 6:38 AM, Stephen Watt <swatt@redhat.com> wrote:
> Hi Folks
>
> I'm working on the Hadoop FileSystem validation workstream
> (https://wiki.apache.org/hadoop/HCFS/Progress) over at Hadoop Common. To do
> that, we're building a library of Hadoop FileSystem tests that will run
> against FileSystems configured within Hadoop 2.0. I have YARN working on
> HDFS and LocalFS; next, I'm trying to get YARN running on top of GlusterFS
> using the GlusterFS Hadoop FileSystem plugin. The plugin works just fine on
> Hadoop 1.x.
>
> When I start the JobHistoryServer it fails with an
> UnsupportedFileSystemException (full stack trace below). I did a bit of
> googling and ran into Karthik over at the QFS community
> (https://groups.google.com/forum/#!topic/qfs-devel/KF3AAFheNq8), who had the
> same issue and has also been unsuccessful at getting this working. I've
> provided my core-site file below. The glusterfs plugin jar is copied into
> share/hadoop/common/lib/, share/hadoop/mapreduce/lib and
> share/hadoop/yarn/lib, so I don't think this is a classpath issue. Perhaps
> the exception is a result of misconfiguration somewhere?
>
> -- Core Site --
>
> <configuration>
>
>  <property>
>   <name>fs.defaultFS</name>
>   <value>glusterfs://amb-1:9000</value>
>  </property>
>
>  <property>
>   <name>fs.default.name</name>
>   <value>glusterfs://amb-1:9000</value>
>  </property>
>
>  <property>
>   <name>fs.glusterfs.server</name>
>   <value>amb-1</value>
>  </property>
>
>  <property>
>   <name>fs.glusterfs.impl</name>
>   <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
>  </property>
>
> </configuration>
>
>
> -- Stack Trace --
>
> STARTUP_MSG:   build = git://pico-2-centos-6-3--01.hortonworks.com/home/jenkins/workspace/BIGTOP-BigWheelAplha-2-HDP-RPM-SYNC-REPO/label/centos6-3/build/hadoop/rpm/BUILD/hadoop-2.0.3.22-alpha-src/hadoop-common-project/hadoop-common -r bdb84648f423eb2b7af5cb97c7192193a5a57956; compiled by 'jenkins' on Fri Mar 15 02:03:54 PDT 2013
> STARTUP_MSG:   java = 1.6.0_43
> ************************************************************/
> 2013-06-08 05:46:23,796 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: JobHistory Init
> 2013-06-08 05:46:24,015 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
> 2013-06-08 05:46:24,015 FATAL org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting JobHistoryServer
> org.apache.hadoop.yarn.YarnException: Error creating done directory: [null]
>         at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:424)
>         at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:87)
>         at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
>         at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:87)
>         at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:145)
> Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: glusterfs
>         at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:146)
>         at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:234)
>         at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:342)
>         at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:339)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>         at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:339)
>         at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:453)
>         at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:475)
>         at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.init(HistoryFileManager.java:417)
>
>
> ----- Original Message -----
> From: "Vinod Kumar Vavilapalli" <vinodkv@hortonworks.com>
> To: yarn-dev@hadoop.apache.org
> Sent: Thursday, June 20, 2013 6:32:42 PM
> Subject: Re: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
>
>
> Please let us know your final results. Interesting to see YARN+MR directly working on local-file-system.
>
> Thanks,
> +Vinod
>
> On Jun 20, 2013, at 2:27 PM, Stephen Watt wrote:
>
>> I resolved this. The issue is that I was using relative paths (i.e.
>> "teragen 1000 data/in-dir") as the params for TeraGen and TeraSort. When I
>> changed them to absolute paths (i.e. "teragen 1000 /data/in-dir"), it
>> worked.
>>
>> ----- Original Message -----
>> From: "Stephen Watt" <swatt@redhat.com>
>> To: yarn-dev@hadoop.apache.org
>> Sent: Thursday, June 20, 2013 12:25:17 PM
>> Subject: FileNotFoundExceptions with Pseudo Distributed YARN MR using the Local FileSystem
>>
>> Hi Folks
>>
>> I'm running into FileNotFoundExceptions when running Pseudo Distributed
>> Single Node YARN on the Local FileSystem. I'd greatly appreciate any
>> insights/solutions.
>>
>> To level set, I'm using RHEL 6.2 and I've successfully set up a single
>> node pseudo-distributed YARN on HDFS 2.0 using the HDP 2.0.2 Alpha Release
>> (tarball extract to /opt). All the processes were started and the jobs
>> submitted as root. I ran some smoke tests with TeraGen and TeraSort and
>> they worked great.
>>
>> The next step was to leave YARN in pseudo-distributed mode, stop HDFS,
>> and change the Hadoop FileSystem from HDFS to the Local FileSystem. I
>> stopped all the daemons, changed the core-site.xml to use the Local
>> FileSystem as shown below, and then restarted the resourcemanager,
>> nodemanager and historyserver (restart commands sketched after the
>> configs). Still running as root, everything started just fine. I ran
>> TeraGen (params: 1000 data/in-dir) and it worked fine. I then ran TeraSort
>> (params: data/in-dir data/out-dir) and the job failed with a
>> FileNotFoundException. I've provided my core-site and mapred-site below.
>>
>> -- core-site.xml --
>>
>> <configuration>
>>
>> <property>
>>   <name>fs.default.name</name>
>>    <value>file:///</value>
>> </property>
>>
>> </configuration>
>>
>> -- mapred-site.xml --
>>
>> <configuration>
>>
>>   <property>
>>      <name>mapreduce.framework.name</name>
>>      <value>yarn</value>
>>   </property>
>>
>> </configuration>
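>>
>> For reference, a sketch of the restart sequence, assuming the tarball's
>> stock daemon scripts were used (paths relative to the install dir):
>>
>>   sbin/yarn-daemon.sh start resourcemanager
>>   sbin/yarn-daemon.sh start nodemanager
>>   sbin/mr-jobhistory-daemon.sh start historyserver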
>>
>> -- Stack Trace Exception --
>>
>> 2013-06-18 23:06:40,876 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
>> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000002 to attempt_1371596024885_0003_m_000000_0
>> 2013-06-18 23:06:40,881 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved yarn-1 to /default-rack
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1371596024885_0003_01_000003 to attempt_1371596024885_0003_m_000001_0
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=4096
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
>> 2013-06-18 23:06:40,882 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:2
>> 2013-06-18 23:06:40,896 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is file:///tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.jar
>> 2013-06-18 23:06:40,901 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/root/.staging/job_1371596024885_0003/job.xml
>> 2013-06-18 23:06:40,902 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
>> org.apache.hadoop.yarn.YarnException: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:723)
>>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:771)
>>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1352)
>>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1310)
>>       at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
>>       at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>>       at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>>       at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1018)
>>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:142)
>>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1116)
>>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1108)
>>       at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
>>       at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
>>       at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.FileNotFoundException: File file:/opt/hadoop-2.0.3.22-alpha-hdp/nm-local-dir/usercache/root/appcache/application_1371596024885_0003/container_1371596024885_0003_01_000001/data/out-dir/_partition.lst#_partition.lst does not exist
>>       at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
>>       at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:697)
>>       at org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:144)
>>       at org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:417)
>>       at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:365)
>>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:686)
>>       ... 14 more
>> 2013-06-18 23:06:40,906 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..



-- 
Harsh J
