From: Shrijeet Paliwal
To: general@hadoop.apache.org
Reply-To: general@hadoop.apache.org
Subject: Re: Specifying the InputFormat class that exists in a JAR on the hdfs
Date: Wed, 13 Oct 2010 16:21:39 -0700
In-Reply-To: <2EB7C48D-7640-4B76-8135-9120E39031A6@real.com>
References: <2EB7C48D-7640-4B76-8135-9120E39031A6@real.com>
Mailing-List: contact general-help@hadoop.apache.org; run by ezmlm

How about adding it to HADOOP_CLASSPATH, if it is not there already?

On Wed, Oct 13, 2010 at 4:15 PM, Michael Moores wrote:

> fyi- I also tried the archive version--
>
> calling DistributedCache.addArchiveToClassPath(path, configuration);
>
> On Oct 13, 2010, at 4:12 PM, Michael Moores wrote:
>
> > I have specified my InputFormat to be the cassandra ColumnFamilyInputFormat, and also
> > added the cassandra JAR to my classpath via a call to DistributedCache.addFileToClassPath().
> > The JAR exists on the HDFS.
> > When I run my jar I get java.lang.NoClassDefFoundError:
> > org/apache/cassandra/hadoop/ColumnFamilyInputFormat at the line that
> > makes the job.setInputFormatClass() call.
> >
> > I execute the job with "hadoop jar ".
> >
> > Will I need to put my cassandra JAR on each machine and add it to the JVM startup options???
> >
> > Here is a code snippet:
> >
> > public class MyStats extends Configured implements Tool {
> >   ...
> >   public static void main(String[] args) throws Exception {
> >     // Let ToolRunner handle generic command-line options
> >     Configuration configuration = new Configuration();
> >     Path path = new Path("/user/hadoop/profilestats/cassandra-0.7.0-beta2.jar");
> >     log.info("main: adding jars...");
> >     DistributedCache.addFileToClassPath(path, configuration);
> >
> >     ToolRunner.run(configuration, new MyStats(), args);
> >     System.exit(0);
> >   }
> >
> >   public int run(String[] args) throws Exception {
> >     Job job = new Job(getConf(), "myjob");
> >     job.setInputFormatClass(org.apache.cassandra.hadoop.ColumnFamilyInputFormat.class);
> >     ..
> >     job.waitForCompletion(true);
> >   }
> >
> >
> > FILE LISTING from HDFS:
> >
> > [hadoop@kv-app02 ~]$ hadoop dfs -lsr
> > 10/10/13 14:57:47 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
> > 10/10/13 14:57:48 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
> > drwxr-xr-x   - hadoop supergroup          0 2010-10-13 14:34 /user/hadoop/profilestats
> > -rw-r--r--   3 hadoop supergroup    1841467 2010-10-13 14:34 /user/hadoop/profilestats/cassandra-0.7.0-beta2.jar
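
A quick way to test the HADOOP_CLASSPATH suggestion is to check whether the class is visible to the client JVM at all. The sketch below is only an illustration under a stated assumption: that the NoClassDefFoundError is raised on the client side, because job.setInputFormatClass() runs in the driver before any task starts, while DistributedCache classpath entries are meant for the task JVMs. The class name ClasspathCheck and its method isVisible are hypothetical helpers, not part of Hadoop or Cassandra.

```java
// Hypothetical helper (not part of Hadoop): reports whether a class can be
// located by the class loader of the JVM it runs in.
final class ClasspathCheck {

    /** Returns true if the named class can be found by this JVM's class loader. */
    static boolean isVisible(String className) {
        try {
            // initialize=false: only locate and load the class,
            // do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError for missing dependencies.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace from the thread.
        String name = args.length > 0
                ? args[0]
                : "org.apache.cassandra.hadoop.ColumnFamilyInputFormat";
        String verdict = isVisible(name) ? "is visible" : "is NOT visible";
        System.out.println(name + " " + verdict + " to this JVM");
    }
}
```

If this reports the class as not visible when started with the same classpath as the job driver, the jar is missing on the client side, and adding the local copy of the jar to HADOOP_CLASSPATH before running the job is the thing to try.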