hadoop-general mailing list archives

From Shrijeet Paliwal <shrij...@rocketfuel.com>
Subject Re: Specifying the InputFormat class that exists in a JAR on the hdfs
Date Wed, 13 Oct 2010 23:21:39 GMT
How about adding it to HADOOP_CLASSPATH, if it is not there already?
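
As far as I understand it, the error is thrown at job.setInputFormatClass(), which runs in the client JVM started by "hadoop jar", while the DistributedCache classpath calls only affect the task JVMs on the cluster. A minimal sketch for checking what the client JVM can actually see is below. The class name ClasspathCheck and its isLoadable helper are made up for illustration; they are not part of Hadoop or Cassandra.

```java
// Hypothetical diagnostic, not part of Hadoop or Cassandra.
class ClasspathCheck {

    /** Returns true if the named class can be located by this JVM's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or one of its own dependencies is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class from the stack trace; pass another name as the first argument
        String name = args.length > 0
                ? args[0]
                : "org.apache.cassandra.hadoop.ColumnFamilyInputFormat";
        System.out.println(name + (isLoadable(name) ? " is visible" : " is NOT visible")
                + " to this JVM");
        // Print the effective classpath so a missing JAR is easy to spot
        System.out.println("java.class.path=" + System.getProperty("java.class.path"));
    }
}
```

Running this through the same launcher as the job, once with and once without the Cassandra JAR in HADOOP_CLASSPATH, should show whether the client side is the part that is missing the class.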

On Wed, Oct 13, 2010 at 4:15 PM, Michael Moores <mmoores@real.com> wrote:

> fyi- I also tried the archive version--
>
> calling DistributedCache.addArchiveToClassPath(path, configuration);
>
> On Oct 13, 2010, at 4:12 PM, Michael Moores wrote:
>
> > I have specified my InputFormat to be the cassandra
> ColumnFamilyInputFormat, and also
> > added the cassandra JAR to my classpath via a call to
> DistributedCache.addFileToClassPath().
> > The JAR exists on the HDFS.
> > When I run my jar I get java.lang.NoClassDefFoundError:
> org/apache/cassandra/hadoop/ColumnFamilyInputFormat at the line that
> > makes the job.setInputFormatClass() call.
> >
> > I execute the job with "hadoop jar <myjar>".
> >
> > Will I need to put my cassandra JAR on each machine and add it to the JVM
> startup options???
> >
> > Here is a code snippet:
> >
> > public class MyStats extends Configured implements Tool {
> > ...
> >   public static void main(String[] args) throws Exception {
> >        // Let ToolRunner handle generic command-line options
> >        Configuration configuration = new Configuration();
> >        Path path = new Path("/user/hadoop/profilestats/cassandra-0.7.0-beta2.jar");
> >        log.info("main: adding jars...");
> >        DistributedCache.addFileToClassPath(path, configuration);
> >
> >        // Propagate the tool's exit status instead of always exiting with 0
> >        System.exit(ToolRunner.run(configuration, new MyStats(), args));
> >    }
> >
> >   public int run(String[] args) throws Exception {
> >      Job job = new Job(getConf(), "myjob");
> >      job.setInputFormatClass(org.apache.cassandra.hadoop.ColumnFamilyInputFormat.class);
> >      ..
> >      return job.waitForCompletion(true) ? 0 : 1;
> >   }
> >
> >
> > FILE LISTING from HDFS:
> >
> > [hadoop@kv-app02 ~]$ hadoop dfs -lsr
> > 10/10/13 14:57:47 INFO security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
> cacheTimeout=300000
> > 10/10/13 14:57:48 WARN conf.Configuration: mapred.task.id is deprecated.
> Instead, use mapreduce.task.attempt.id
> > drwxr-xr-x   - hadoop supergroup          0 2010-10-13 14:34
> /user/hadoop/profilestats
> > -rw-r--r--   3 hadoop supergroup    1841467 2010-10-13 14:34
> /user/hadoop/profilestats/cassandra-0.7.0-beta2.jar
>
>
