cassandra-user mailing list archives

From Jeremy Hanna <>
Subject Re: ColumnFamilyRecordWriter
Date Mon, 28 Feb 2011 17:37:21 GMT
One thing that could be done is to abstract CFRW further so that it's easier to extend: all of the core functionality relating to Cassandra would live in an abstract class, and a subclass would only have to supply the serialization mechanism.  The Avro-based writer would then extend that with the Avro-specific pieces, and people could write their own CFRW extension with whatever serialization they chose.  Anyway, that seems reasonable, but would take some work - if you'd like to look at that, I could help as I have time.
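As a rough sketch of that shape (all names here are illustrative, not Cassandra's actual CFRW API): the abstract base class owns the Cassandra-facing work, and each serialization format supplies only the mutation-encoding step.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical base class: the real thing would hold connection pooling,
// batching, and ring awareness.  This sketch just records what gets queued.
abstract class AbstractColumnFamilyRecordWriter<M> {
    private final StringBuilder queued = new StringBuilder();

    // The only piece a subclass must provide: turn its mutation type
    // into the bytes that would be sent to Cassandra.
    protected abstract ByteBuffer serialize(M mutation);

    public void write(M mutation) {
        ByteBuffer encoded = serialize(mutation);
        queued.append(encoded.remaining()).append(" bytes queued\n");
    }

    public String flushLog() { return queued.toString(); }
}

// A Thrift-flavored subclass; a String stands in for a real
// org.apache.cassandra.thrift.Mutation in this sketch.
class ThriftRecordWriter extends AbstractColumnFamilyRecordWriter<String> {
    @Override
    protected ByteBuffer serialize(String mutation) {
        return ByteBuffer.wrap(mutation.getBytes(StandardCharsets.UTF_8));
    }
}

public class Sketch {
    public static void main(String[] args) {
        ThriftRecordWriter w = new ThriftRecordWriter();
        w.write("col=value");
        System.out.println(w.flushLog().trim()); // 9 bytes queued
    }
}
```

An Avro-based subclass would look the same, overriding serialize() with Avro's encoder instead.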

On Feb 28, 2011, at 10:19 AM, Jeremy Hanna wrote:

> There certainly could be a Thrift-based record writer.  However, (if I remember correctly) to enable Hadoop output streaming it was easier to go with Avro for the records, since the schema is included.  A Thrift version of the record writer could also have existed, but it's simpler to just have one record writer.  That was the decision process, at least.
> If there is a compelling reason or a lot of demand for a Thrift-based one, maybe it could be revisited - though I'm not the one making that decision.
> On Feb 28, 2011, at 4:10 AM, Mayank Mishra wrote:
>> Hi all,
>> As I was integrating Hadoop with Cassandra, I wanted to serialize mutations, so I used Thrift mutations in my M/R jobs.
>> In the course of that, I came to know that CFRW accepts only Avro mutations.  Can someone please explain to me why only the Avro transport is supported by CFRW?  Why aren't both Thrift and Avro mutations accepted?
>> Please let me know if I missed some important point.
>> With regards,
>> Mayank
