From: Michael Moores
To: user@cassandra.apache.org
Date: Thu, 14 Oct 2010 13:19:32 -0700
Subject: Re: 0.7.0-beta2 and Hadoop

I SOLVED the problem.
It was my misunderstanding of how the Cassandra connection is being used for calling getSlices().

On Oct 14, 2010, at 10:06 AM, Michael Moores wrote:

OK, I moved back to Hadoop 0.20.2 and the WordCount example is doing better.
But I am still seeing a problem that may be due to my lack of experience with Hadoop.
I am running "hadoop jar..." on my JobTracker/NameNode machine, which is not running Cassandra.
I have DataNode/TaskTracker running on all Cassandra nodes, with my ConfigHelper set up to talk to Cassandra on localhost.
When I run the job, I see it can't connect (I renamed the main class to "ProfileStats"):

[hadoop@kv-app01 test]$ hadoop jar hadoop-cassandra-0.0.1-SNAPSHOT.jar com.real.uds.hadoop.ProfileStats xyz -libjars ./cassandra-0.7.0-beta2.jar ./libthrift-r959516.jar
10/10/14 09:57:57 INFO hadoop.ProfileStats: main: adding jars...
10/10/14 09:57:58 INFO hadoop.ProfileStats: output reducer type: filesystem
10/10/14 09:57:58 INFO hadoop.ProfileStats: main: adding jars AGAIN...
Exception in thread "main" java.io.IOException: unable to connect to server
        at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.createConnection(ColumnFamilyInputFormat.java:205)
..
Caused by: java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)

Should I expect my job to be executed on the TaskTracker nodes?
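
The trace above is reported in thread "main" of the JVM started by "hadoop jar", so the refused connection is most likely attempted on the submitting machine while the input splits are being computed, before any task runs on a TaskTracker. A minimal sketch of a reachability check that could be run on that machine follows. CassandraPreflight and canConnect are hypothetical names, not part of Cassandra or Hadoop; "localhost" mirrors the address described in the post, and 9160 is assumed because it is Cassandra's default Thrift RPC port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper for illustration; not part of Cassandra or Hadoop. */
public class CassandraPreflight {

    /** Returns true if a TCP connection to host:port opens within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumptions: "localhost" is the address handed to ConfigHelper in the post,
        // and 9160 is the Thrift RPC port (Cassandra's default).
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

Run on the JobTracker/NameNode machine with no arguments, this would be expected to report that localhost is not reachable, since that machine is not running Cassandra; passing the host name of a Cassandra node as the first argument checks that node instead.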


On Oct 13, 2010, at 5:39 PM, Michael Moores wrote:
What version of Hadoop should I be using with Cassandra 0.7.0-beta2?
I am using the latest version, 0.21.0.

Just running a modified version of the WordCount example:
https://svn.apache.org/repos/asf/cassandra/trunk/contrib/word_count/src/

I get a linkage error thrown from the getSplits method.

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
        at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:88)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:401)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:418)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:338)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:960)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:976)
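
The error message says that code compiled against a JobContext class found a JobContext interface at run time. That matches the Hadoop versions involved: JobContext is a class in the 0.20 line and an interface in 0.21. A minimal sketch for checking which form is on the classpath follows. JobContextKind and kindOf are hypothetical names, not part of Hadoop or Cassandra, and the program only gives a useful answer when run with the same classpath as the job.

```java
/** Hypothetical diagnostic for illustration; not part of Hadoop or Cassandra. */
public class JobContextKind {

    /** Returns "interface", "class", or "not found" for the given type name. */
    public static String kindOf(String className) {
        try {
            // initialize=false: inspect the type without running its static initializers.
            Class<?> type = Class.forName(className, false, JobContextKind.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.hadoop.mapreduce.JobContext";
        // "interface" here would be consistent with the IncompatibleClassChangeError above.
        System.out.println(name + " -> " + kindOf(name));
    }
}
```

If the Hadoop jars are not on the classpath, the program prints "not found" rather than failing, so an unexpected result usually points at the classpath and not at Hadoop itself.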

