giraph-user mailing list archives

From Martin Junghanns <martin.jungha...@gmx.net>
Subject Re: Problem Giraph + Hadoop + HBase
Date Mon, 23 Feb 2015 10:27:35 GMT
I found the solution. For the GiraphJob it is necessary to

1) add HBase libs to HADOOP_CLASSPATH on all machines
2) pass the HBase libs as a comma-separated list via the -libjars
parameter when running the driver

I still don't know why this is not necessary for regular MapReduce jobs,
but I like this solution as I don't need to build a fat jar anymore.
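
A minimal sketch of assembling both values from one directory of HBase jars. The class name HBaseLibJars, its method names, and the default path /usr/lib/hbase/lib are hypothetical and only serve as illustration; adjust them to your installation.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class HBaseLibJars {

    /** Collects the absolute paths of all *.jar files in libDir, sorted for a stable result. */
    public static List<String> listJars(File libDir) {
        List<String> jars = new ArrayList<>();
        File[] files = libDir.listFiles(); // null if libDir is missing or not a directory
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && f.getName().endsWith(".jar")) {
                    jars.add(f.getAbsolutePath());
                }
            }
        }
        Collections.sort(jars);
        return jars;
    }

    /** The -libjars parameter expects a comma-separated list. */
    public static String toLibJars(List<String> jars) {
        return String.join(",", jars);
    }

    /** HADOOP_CLASSPATH uses the platform path separator (':' on Linux). */
    public static String toClasspath(List<String> jars) {
        return String.join(File.pathSeparator, jars);
    }

    public static void main(String[] args) {
        // Hypothetical default location; pass the real HBase lib directory as the first argument.
        File libDir = new File(args.length > 0 ? args[0] : "/usr/lib/hbase/lib");
        List<String> jars = listJars(libDir);
        System.out.println("export HADOOP_CLASSPATH=" + toClasspath(jars));
        System.out.println("-libjars " + toLibJars(jars));
    }
}
```

Keep in mind that -libjars is a generic Hadoop option, so it only takes effect if the driver parses generic options, for example by being started through ToolRunner.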

Cheers,
Martin

On 21.02.2015 13:20, Martin Junghanns wrote:
> Hi all,
> 
> this might be a rather specific question, and I don't know whether the
> problem is related to Giraph, Hadoop, or HBase, but maybe someone has
> an idea.
> 
> I am running an application on a cluster using:
> 
> Hadoop 2.5.1
> Giraph 1.1.0-hadoop2
> HBase 0.98.10.1-hadoop2
> 
> Giraph jobs run fine when I start them via the GiraphRunner using
> text-based input formats. My application is a fat jar containing the
> Giraph libs, but not the HBase libs (scope: provided). The HBase libs
> are in the HADOOP_CLASSPATH, and MapReduce jobs using HBase as data
> source / sink run fine.
> 
> The problem occurs when I start a GiraphJob from my Driver program. The
> driver does the following:
> 1) Bulk Load text data into HBase via MapReduce
> 2) Run a Giraph algorithm using HBase as data source (using
> TableInputFormat)
> 
> The *driver runs fine in a unit test* using the MiniCluster.
> 
> When I start the driver on a cluster, 1) runs successfully, but after
> the GiraphJob is submitted, I get:
> 
> 2015-02-21 12:50:38,954 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
> 2015-02-21 12:50:39,018 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/mapreduce/TableInputFormat
>     at org.myapp.io.HBaseVertexInputFormat.<clinit>(HBaseVertexInputFormat.java:48)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:274)
>     at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
>     at org.apache.giraph.conf.ClassConfOption.get(ClassConfOption.java:128)
>     at org.apache.giraph.conf.GiraphClasses.<init>(GiraphClasses.java:180)
>     at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.<init>(ImmutableClassesGiraphConfiguration.java:138)
>     at org.apache.giraph.bsp.BspOutputFormat.getOutputCommitter(BspOutputFormat.java:62)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:376)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1485)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1482)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1415)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableInputFormat
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     ... 20 more
> 2015-02-21 12:50:39,021 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1
> 
> HBaseVertexInputFormat.java:48: protected static final TableInputFormat BASE_FORMAT = new TableInputFormat();
> 
> The class *org/apache/hadoop/hbase/mapreduce/TableInputFormat* is
> contained in *hbase-server-0.98.10.1-hadoop2.jar*, which is in the
> HADOOP_CLASSPATH and, according to the nodemanager logs, gets
> downloaded from staging when the application runs.
> 
> The GiraphJob is initialized in the driver the following way:
> 
> //...
> conf.set(TableInputFormat.INPUT_TABLE, MY_TABLE);
> conf.set(TableOutputFormat.OUTPUT_TABLE, MY_TABLE);
> 
> GiraphJob job = new GiraphJob(conf, JOB_NAME);
> GiraphConfiguration giraphConf = job.getConfiguration();
> giraphConf.setComputationClass(MyComputation.class);
> giraphConf.setVertexInputFormatClass(MyHBaseVertexInputFormat.class);
> giraphConf.setVertexOutputFormatClass(MyHBaseVertexOutputFormat.class);
> giraphConf.setWorkerConfiguration(workerCount, workerCount, 100f);
> 
> job.run(verbose);
> //...
> 
> FYI, the *driver ran fine on a Hadoop 1.2.1 cluster with the HBase and
> Giraph libs (hadoop1) packaged in my jar*.
> But since packaging them should not really be necessary (at least for
> HBase), there seems to be a problem loading the jars in the GiraphJob.
> 
> I hope you guys have some ideas.
> 
> Thanks in advance.
> 
> Cheers,
> Martin
> 
