hadoop-common-user mailing list archives

From Jamal B <jm151...@gmail.com>
Subject Re: Externally submitted MapReduce Job Fails at Startup Help Please...
Date Wed, 02 Nov 2011 04:51:17 GMT
So I finally figured out what was going on.  To make a long story short, my
jar's lib folder contained transitive dependencies pulled in by dependencies I
had left in my pom.xml (Spring, slf4j, etc.)... a typical copy-and-paste
problem on my part.
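
For reference, the fix on the build side was to stop those transitive
dependencies from being bundled into the jar's lib folder in the first place.
A sketch of the kind of pom.xml exclusion involved (the artifact names and
version here are illustrative, not my exact pom):

```xml
<dependency>
  <groupId>org.springframework</groupId>
  <artifactId>spring-context</artifactId>
  <version>3.0.6.RELEASE</version>
  <exclusions>
    <!-- Keep slf4j out of the job jar's lib folder so it cannot
         conflict with the version Hadoop already ships with. -->
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```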

I found this by giving up on remote submission and, as previously suggested,
first trying the command line to at least see if my simple job would run.  It
turned out I had a conflicting slf4j jar causing my submission to fail with a
NoSuchMethod exception.  A couple of searches later, I came across this email:

http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201101.mbox/%3C4D3E3C87.7090108@gmail.com%3E

I replaced the version of slf4j in hadoop, restarted my test cluster, and
things worked like a charm (both using the command line, & remote
submission).

Learned a lot :), and thanks for all the help.
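
For anyone hitting the same thing: a quick way to see which jar a class is
actually being loaded from (and so spot a duplicate on the classpath) is to ask
its ProtectionDomain.  A small sketch; the class name to check is just an
argument, e.g. pass "org.slf4j.Logger" for the conflict I was chasing:

```java
// Sketch: print where a class was actually loaded from, to help spot
// duplicate/conflicting jars on the classpath.
public class WhichJar {
    static String locationOf(Class<?> c) {
        java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (e.g. java.lang.String)
        // report no CodeSource at all.
        return (src == null) ? "(bootstrap classpath)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass the suspect class name, e.g. "org.slf4j.Logger".
        String name = (args.length > 0) ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

Running this inside the driver (and comparing with what is in
$HADOOP_HOME/lib on the cluster) makes a version conflict obvious.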

On Sat, Oct 29, 2011 at 4:00 PM, Steve Lewis <lordjoe2000@gmail.com> wrote:

> Did you build a jar file for your job, and did you put mysql-connector.jar
> in its lib directory?
> I have had this work for me.
>
> On Fri, Oct 28, 2011 at 12:56 PM, Jamal x <jm15119b@gmail.com> wrote:
>
> > Thanks for the response.
> >
> > I need to submit this job programmatically, instead of using the command
> > line.  Shouldn't the DistributedCache class methods handle the classpath
> > setup for the job?  If not, is there some other setup missing from my
> > driver class?
> >
> > I also looked into Sqoop, but wanted to get this working for a particular
> > case which I think isn't a good fit for it, but I may be wrong.  Plus, I
> > wanted to use this use case to get more experience with creating and
> > running jobs remotely.
> >
> > Thanks
> > On Oct 28, 2011 1:38 PM, "Brock Noland" <brock@cloudera.com> wrote:
> >
> > > Hi,
> > >
> > > I always find that using the -libjars command line option is the
> > > easiest way to push jars to the cluster.
> > >
> > > Also, you may want to checkout Apache Sqoop:
> > > http://incubator.apache.org/projects/sqoop.html
> > >
> > > Brock
> > >
> > > On Fri, Oct 28, 2011 at 12:17 PM, Jamal x <jm15119b@gmail.com> wrote:
> > > > Hi,
> > > >
> > > > I wrote a small test program to perform a simple database extraction of
> > > > information from a simple table on a remote cluster.  However, it fails to
> > > > execute successfully when I run it from Eclipse, with the following
> > > > exception:
> > > >
> > > > 12:36:08,993  WARN main mapred.JobClient:659 - Use GenericOptionsParser for
> > > > parsing the arguments. Applications should implement Tool for the same.
> > > > 12:36:09,567  WARN main mapred.JobClient:776 - No job jar file set. User
> > > > classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> > > > java.lang.RuntimeException: Error in configuring object
> > > >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> > > >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> > > >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > > >    at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:575)
> > > >    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
> > > >    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
> > > >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
> > > >    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> > > >    at java.security.AccessController.doPrivileged(Native Method)
> > > >    at javax.security.auth.Subject.doAs(Subject.java:396)
> > > >    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> > > >    at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > > > Caused by: java.lang.reflect.InvocationTargetException
> > > >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >    at java.lang.reflect.Method.invoke(Method.java:597)
> > > >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> > > >    ... 11 more
> > > > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException:
> > > > com.mysql.jdbc.Driver
> > > >    at org.apache.hadoop.mapred.lib.db.DBInputFormat.configure(DBInputFormat.java:271)
> > > >    ... 16 more
> > > > Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
> > > >    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> > > >    at java.security.AccessController.doPrivileged(Native Method)
> > > >    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> > > >    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> > > >    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> > > >    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> > > >    at java.lang.Class.forName0(Native Method)
> > > >    at java.lang.Class.forName(Class.java:169)
> > > >    at org.apache.hadoop.mapred.lib.db.DBConfiguration.getConnection(DBConfiguration.java:123)
> > > >    at org.apache.hadoop.mapred.lib.db.DBInputFormat.configure(DBInputFormat.java:266)
> > > >    ... 16 more
> > > >
> > > >
> > > > I do have the mysql-connector jar under the $HADOOP_HOME/lib folder
> on
> > > all
> > > > servers in the cluster, and even tried using the
> > > > DistributedCache.addArchiveToClassPath method, with no success.  Can
> > > someone
> > > > please help me figure out what is going on here?
> > > >
> > > > Here is my simple main, which performs the remote submission of the job:
> > > > public int run(String[] arg0) throws Exception {
> > > >
> > > >        System.out.println("Setting up job configuration....");
> > > >        Configuration conf = new Configuration();
> > > >        conf.set("mapred.job.tracker", "jobtracker.hostname:8021");
> > > >        conf.set("fs.default.name", "hdfs://namenode.hostname:9000");
> > > >        conf.set("keep.failed.task.files", "true");
> > > >        conf.set("mapred.child.java.opts", "-Xmx1024m");
> > > >
> > > >        FileSystem fs = FileSystem.get(conf);
> > > >        fs.delete(new Path("/myfolder/dump_output/"), true);
> > > >        fs.mkdirs(new Path("/myfolder/libs/"));
> > > >
> > > >        fs.copyFromLocalFile(
> > > >                new Path("C:/Users/me/.m2/repository/org/mylib/0.1-SNAPSHOT/myproject-0.1-SNAPSHOT-hadoop.jar"),
> > > >                new Path("/myfolder/libs/myproject-0.1-SNAPSHOT-hadoop.jar"));
> > > >
> > > >        fs.copyFromLocalFile(
> > > >                new Path("C:/Users/me/.m2/repository/mysql/mysql-connector-java/5.1.17/mysql-connector-java-5.1.17.jar"),
> > > >                new Path("/myfolder/libs/mysql-connector-java-5.1.17.jar"));
> > > >
> > > >        DistributedCache.addArchiveToClassPath(new Path(
> > > >                "/myfolder/libs/myproject-0.1-SNAPSHOT-hadoop.jar"), conf, fs);
> > > >
> > > >        DistributedCache.addArchiveToClassPath(new Path(
> > > >                "/myfolder/libs/mysql-connector-java-5.1.17.jar"), conf, fs);
> > > >
> > > >        JobConf job = new JobConf(conf);
> > > >
> > > >        job.setJobName("Exporting Job");
> > > >        job.setJarByClass(MyMapper.class);
> > > >        job.setMapperClass(MyMapper.class);
> > > >        Class claz = Class.forName("com.mysql.jdbc.Driver");
> > > >        if (claz == null) {
> > > >            throw new RuntimeException("wow...");
> > > >        }
> > > >
> > > >        Configuration.dumpConfiguration(conf, new PrintWriter(System.out));
> > > >
> > > >        DBConfiguration.configureDB(job,
> > > >                "com.mysql.jdbc.Driver",
> > > >                "jdbc:mysql://mydbserver:3306/test?autoReconnect=true",
> > > >                "user", "password");
> > > >
> > > >        String[] fields = { "employee_id", "name" };
> > > >        DBInputFormat.setInput(job, MyRecord.class, "employees", null,
> > > >                "employee_id", fields);
> > > >
> > > >        FileOutputFormat.setOutputPath(job, new Path("/myfolder/dump_output/"));
> > > >
> > > >        System.out.println("Submitting job....");
> > > >
> > > >        JobClient.runJob(job);
> > > >
> > > >        System.out.println("job info: " + job.getNumMapTasks());
> > > >
> > > >        return 0;
> > > >    }
> > > >
> > > >    public static void main(String[] args) throws Exception {
> > > >        int exitCode = ToolRunner.run(new SimpleDriver(), args);
> > > >        System.out.println("Completed.");
> > > >        System.exit(exitCode);
> > > >    }
> > > >
> > > >
> > > > I'm using the hadoop-core version 0.20.205.0 Maven dependency to build and
> > > > run my program via Eclipse.  The myproject-0.1-SNAPSHOT-hadoop.jar jar has
> > > > my classes and its dependencies included under the /lib folder.
> > > >
> > > > Any help would be greatly appreciated.
> > > >
> > > > Thanks
> > > >
> > >
> >
>
>
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
>
