incubator-cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Cassandra 1.20 with Cloudera Hadoop (CDH4) Compatibility Issue
Date Sat, 16 Feb 2013 14:23:24 GMT
Here is the deal.

http://wiki.apache.org/hadoop/Defining%20Hadoop

INAPPROPRIATE: Automotive Joe's Crankshaft: 100% compatible with Hadoop

Bad, because "100% compatible" is a meaningless statement. Even Apache
releases have regressions; cases where versions are incompatible *even
when the Java interfaces don't change*. A statement about
compatibility ought to be qualified "Certified by Joe's brother Bob
as 100% compatible with Apache Hadoop(TM)". In the US, the marketing
team may be able to get away with the "100% compatible" claim, but in
some EU countries, sticking that statement up on your web site is a claim
that residents can demand the vendor justify or take down.

So as a result, if you are running something that is NOT Apache Hadoop
(CDH, DSE, or whatever), it is NOT compatible with Hadoop, or with the
others, by definition.

Anyway, I have been using Hadoop for years, and its biggest problem is
that it has never become happy with its own codebase. Old API, new
API, JobTracker, YARN: all these things change, and there is really no
upgrade/downgrade path because there are so many branches. Open
source products move swiftly, and end users are normally left holding
the bag, figuring out how to do it sanely. With Cassandra +
Hadoop it is "double trouble".

All that being said, I think it is unrealistic to count on vendors 100%
to solve your problems. If something throws you an exception like:

org.apache.cassandra.hadoop.ConfigHelper.setRpcPort(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)V

Guess what? It is time to get out your compiler.
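
Before recompiling, it can help to see what is actually on your classpath.
The descriptor in that exception is JVM notation for
void setRpcPort(Configuration, String), so the job was built against a
ConfigHelper that has that overload and is running against one that does
not. Below is a minimal sketch, assuming you run it with the same Hadoop
and Cassandra jars your job uses. The class name ClasspathProbe and its
helpers are made up for illustration; it only uses standard Java reflection
calls and nothing from Hadoop or Cassandra directly.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop or Cassandra.
public class ClasspathProbe {

    /** Describes a type as "interface", "abstract class", or "class". */
    public static String kindOf(Class<?> type) {
        if (type.isInterface()) {
            return "interface";
        }
        if (Modifier.isAbstract(type.getModifiers())) {
            return "abstract class";
        }
        return "class";
    }

    /** Returns the signatures of all public methods with the given name. */
    public static List<String> overloads(Class<?> type, String methodName) {
        List<String> found = new ArrayList<String>();
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                found.add(m.toString());
            }
        }
        return found;
    }

    /** Builds a short report for one class and, optionally, one method name. */
    public static String report(String className, String methodName) {
        try {
            // initialize=false: inspect the class without running static initializers.
            Class<?> type = Class.forName(
                    className, false, ClasspathProbe.class.getClassLoader());
            StringBuilder sb = new StringBuilder();
            sb.append(className).append(" is a(n) ").append(kindOf(type));
            if (methodName != null) {
                for (String signature : overloads(type, methodName)) {
                    sb.append(System.lineSeparator()).append("  ").append(signature);
                }
            }
            return sb.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose dependencies are missing.
            return className + ": not loadable (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class vs. interface explains the IncompatibleClassChangeError.
        System.out.println(report("org.apache.hadoop.mapreduce.JobContext", null));
        // The listed overloads show which setRpcPort signatures really exist.
        System.out.println(report("org.apache.cassandra.hadoop.ConfigHelper", "setRpcPort"));
    }
}
```

If JobContext is reported as an interface while your Cassandra jar was built
expecting a class, or the setRpcPort overload from the exception is not
listed, then the jars were built against a different version than the one
you are running, and rebuilding against your own platform is the likely way
out.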

On Sat, Feb 16, 2013 at 3:39 AM, Yang Song <xfilter@gmail.com> wrote:
> Thanks Michael. I attached the reply I got back from CDH4 user group from
> Harsh. Hope to share the experience.
> "
> In CDH4, the MR1 and MR2 APIs are both fully compatible (such that
> moving to YARN in future would require no recompilation from MR1
> produced jars). You can consider it "2.0" API in binary form, and not
> 0.20 exactly (i.e. its not backwards compatible with CDH3).
>
> Cassandra is distributing binaries built on MR1 (Apache Hadoop 1,
> CDH3, etc.), which wouldn't work on your CDH4 platform. You will have
> to recompile against the proper platform to get binary-compatible
> jars/etc.."
>
> Interesting. Has anyone had issues with CDH4 and the newly released C*
> 1.21?
>
> Thanks
>
> 2013/2/15 Michael Kjellman <mkjellman@barracuda.com>
>>
>> Sorry. I meant to say even though there *wasn't* a major change between
>> 1.0.x and 0.22. The big change was 0.20 to 0.22. Sorry for the confusion.
>>
>> On Feb 15, 2013, at 9:53 PM, "Michael Kjellman" <mkjellman@barracuda.com>
>> wrote:
>>
>> There were pretty big changes in Hadoop between 0.20 and 0.22 (which is
>> now known as 1.0.x) even though there were major change between 0.22 and
>> 1.0.x. Cloudera hadn't yet upgraded to 0.22 which uses the new map reduce
>> framework instead of the old mapred API. I don't see the C* project back
>> porting their code at this time and if anything Cloudera should update their
>> release!!
>>
>> On Feb 15, 2013, at 9:48 PM, "Yang Song" <xfilter@gmail.com> wrote:
>>
>> It is interesting though. I am using CDH4, which contains hadoop 0.20, and
>> I am using Cassandra 1.20.
>> The previously mentioned errors still occur. Any suggestions? Thanks.
>>
>> 2013/2/15 Michael Kjellman <mkjellman@barracuda.com>
>>>
>>> That bug is kinda wrong though. 1.0.x is current for like a year now and
>>> C* works great with it :)
>>>
>>> On Feb 15, 2013, at 7:38 PM, "Dave Brosius" <dbrosius@mebigfatguy.com>
>>> wrote:
>>>
>>> see https://issues.apache.org/jira/browse/CASSANDRA-5201
>>>
>>>
>>> On 02/15/2013 10:05 PM, Yang Song wrote:
>>>
>>> Hi,
>>>
>>> Does anyone use CDH4's Hadoop with Cassandra? The goal is
>>> simply to read/write to Cassandra from Hadoop directly using
>>> ColumnFamilyInput(Output)Format, but there seems to be a compatibility
>>> issue. There are two Java exceptions:
>>>
>>> 1. java.lang.IncompatibleClassChangeError: Found interface
>>> org.apache.hadoop.mapreduce.JobContext, but class was expected
>>> This shows up when I run the hadoop jar file to read directly from Cassandra.
>>> It seems that Hadoop changed JobContext from a
>>> class to an interface. Has anyone had a similar issue?
>>> Does it mean the Hadoop version in CDH4 is old?
>>>
>>> 2. Another error is java.lang.NoSuchMethodError:
>>> org.apache.cassandra.hadoop.ConfigHelper.setRpcPort(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)V
>>> This shows up when the jar file sets the rpc port for a remote Cassandra
>>> cluster.
>>>
>>> Does anyone have similar experience? Any comments are welcome. Thanks!
>>>
>>>
>>
>
