cassandra-user mailing list archives

From java8964 <>
Subject RE: Deserialize the collection type data from the SSTable file
Date Thu, 24 Sep 2015 15:33:39 GMT
Hi, Daniel:
I didn't find any branch related to C* 2.1 in the, is
there one?
It looks like there are big changes to the collections API in C* 2.1. I just want to know if
there is a CQLMapper for a C* 2.1 branch. In the meantime, I will also try to understand more
about the new changes in C* 2.1.

Date: Wed, 10 Jun 2015 09:11:02 -0700
Subject: Re: Deserialize the collection type data from the SSTable file

Hi Yong,
Glad the code was helpful. I believe it serializes using List<Pair<ByteBuffer, Column>>
for maps so that it can store the Key of the map as well.
Thanks for pointing out the edge case!
Thanks,
Daniel

On Wed, Jun 10, 2015 at 6:39 AM, java8964 <> wrote:

Thanks, Daniel.
I didn't realize that Cassandra serializes collection types in one more way, using
List<Pair<ByteBuffer, Column>>. After reading your example code, I made it work.
From the link you gave me, using my test data, I found one issue though. In any row where
the collection column is NULL, I think the code will throw a NullPointerException
on line 148:
Line 145    private void addValue(GenericRecord record, CFDefinition.Name name, ColumnGroupMap
Line 146        if (name.type.isCollection()) {
Line 147            List<Pair<ByteBuffer, Column>> collection = group.getCollection(;
Line 148            ByteBuffer buffer = ((CollectionType)name.type).serialize(collection);
            addCqlCollectionToRecord(record, name, buffer);
If the collection column in that row is NULL, then Line 147 will return NULL, which will cause
the following exception:
Exception in thread "main" java.lang.NullPointerException
	at org.apache.cassandra.db.marshal.CollectionType.enforceLimit(
	at org.apache.cassandra.db.marshal.ListType.serialize(
So I need to add a null check to avoid that, since any regular column in Cassandra, collections
included, can simply have a NULL value.
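A minimal sketch of such a guard, written against plain Java types so it runs without Cassandra on the classpath. The class name `NullSafeCollection`, the method `serializeOrNull`, and the stand-in serializer are hypothetical; in the real code the list would come from the group lookup on line 147 and the serializer would be the `CollectionType.serialize` call on line 148.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class NullSafeCollection {

    /**
     * Runs the serializer only when the row actually has cells for the
     * collection column. A null list (column is NULL in this row) yields
     * null instead of reaching the serializer and failing inside it.
     */
    public static <T> ByteBuffer serializeOrNull(
            List<T> cells, Function<List<T>, ByteBuffer> serializer) {
        if (cells == null) {
            return null; // the caller decides how a NULL column is recorded
        }
        return serializer.apply(cells);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real serializer: one byte holding the cell count.
        Function<List<String>, ByteBuffer> countOnly =
                cells -> ByteBuffer.wrap(new byte[] {(byte) cells.size()});

        System.out.println(serializeOrNull(null, countOnly));
        System.out.println(serializeOrNull(Arrays.asList("a", "b"), countOnly).get(0));
    }
}
```

Whether a NULL collection should then be written as a null field or skipped entirely depends on the target record schema, which is not shown in this thread.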
Date: Mon, 8 Jun 2015 15:13:02 -0700
Subject: Re: Deserialize the collection type data from the SSTable file

I'm not sure why sstable2json doesn't work for collections, but if you're into reading raw
sstables we use the following code with good success:

On Mon, Jun 8, 2015 at 1:22 PM, java8964 <> wrote:

Hi, Cassandra users:
I have a question about how to deserialize the new collection type data in Cassandra
2.x (the exact version is C* 2.0.10).
I created the following example table in cqlsh:
CREATE TABLE coupon (
  account_id bigint,
  campaign_id uuid,
  ........................,
  discount_info map<text, text>,
  ........................,
  PRIMARY KEY (account_id, campaign_id))
The other columns can be ignored in this case. Then I inserted one row of test data:
insert into coupon (account_id, campaign_id, discount_info) values (111,uuid(), {'test_key':'test_value'});
After this, I got the SSTable files. I used sstable2json to check the output:
$./resources/cassandra/bin/sstable2json /xxx/test-coupon-jb-1-Data.db
[{"key": "000000000000006f","columns":
[["0336e50d-21aa-4b3a-9f01-989a8c540e54:","",1433792922055000], ["0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info","0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info:!",1433792922054999,"t",1433792922],

What I want is to get {"test_key" : "test_value"} back as the key/value pair that I put into
the "discount_info" column. I followed the sstable2json code and tried to deserialize the data
myself, but to my surprise I could not make it work. I tried several ways and kept
getting exceptions.
From what I researched, I know that Cassandra stores "campaign_id" + "discount_info"
+ "another ByteBuffer" as a composite column name in this case. When I deserialized this column name,
I got the following dumped out as a String:
It includes 3 parts: the first part is the uuid for the campaign_id. The second part is "discount_info",
which is the static name I defined in the table. The third part is a byte array of length
46, and I am not sure what it is.
The corresponding value part of this composite column is another byte array, of length 10,
which is "746573745f76616c7565" in hex if I dump it out.
Now, here is what I did; I am not sure why it doesn't work. First, I assumed the value part stores
the real value I put in the Map, so I did the following:
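The 10-byte value can be checked by hand. The sketch below uses only the JDK (the class and method names are illustrative, not Cassandra API) and decodes the hex dump as UTF-8. The bytes are exactly the text of the inserted map value, with no entry count or length prefix in front of them, which suggests the cell holds a single map entry's value rather than a whole serialized map.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class HexCellValue {

    /** Converts a hex string such as "746573" into the bytes it encodes. */
    public static ByteBuffer hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return ByteBuffer.wrap(out);
    }

    /** Decodes the remaining bytes as UTF-8 without moving the buffer's position. */
    public static String utf8(ByteBuffer bytes) {
        return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
    }

    public static void main(String[] args) {
        ByteBuffer value = hexToBytes("746573745f76616c7565");
        System.out.println(value.remaining() + " bytes: " + utf8(value)); // 10 bytes: test_value
    }
}
```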
ByteBuffer value = ByteBufferUtil.clone(column.value());
MapType<String, String> result = MapType.getInstance(UTF8Type.instance, UTF8Type.instance);
Map<String, String> output = result.compose(value);
// it gave me the following exception:
org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a map
Then I thought that the real value must be stored as part of the column name (the third part,
of 46 bytes), so I did this:
MapType<String, String> result = MapType.getInstance(UTF8Type.instance,
Map<String, String> output = result.compose(third_part.value);
// I got the following
	at java.nio.Buffer.limit(
	at org.apache.cassandra.utils.ByteBufferUtil.readBytes(
	at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(
	at org.apache.cassandra.serializers.MapSerializer.deserialize(
	at org.apache.cassandra.serializers.MapSerializer.deserialize(
	at org.apache.cassandra.db.marshal.AbstractType.compose(
I can get the data for all the non-collection types, but I cannot get the data from the Map. My
questions are:
1) How does Cassandra store collection data in the SSTable files? From the length of the bytes,
it is most likely part of the composite column. If so, why did I get the exception above?
2) sstable2json doesn't deserialize the real data out of the collection type, so I don't have
an example to follow. Am I composing the Map type data the wrong way?
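As a rough illustration of the answer given later in the thread (the map is handled as a list of key/cell pairs rather than as one blob), here is a sketch that rebuilds a map<text, text> from such pairs. It makes several assumptions: each map entry is its own cell, the entry's key has already been extracted from the cell name, and the entry's value is the cell value. `Map.Entry<ByteBuffer, ByteBuffer>` is a plain-JDK stand-in for Cassandra's `Pair<ByteBuffer, Column>`, and `MapFromCells` and `rebuild` are hypothetical names. Extracting the key bytes from a real composite cell name is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapFromCells {

    /** Decodes the remaining bytes as UTF-8 without moving the buffer's position. */
    static String utf8(ByteBuffer bytes) {
        return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
    }

    /**
     * Rebuilds a text-to-text map from (key bytes, value bytes) pairs,
     * one pair per map entry, keeping the order in which the pairs arrive.
     */
    public static Map<String, String> rebuild(List<Map.Entry<ByteBuffer, ByteBuffer>> cells) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<ByteBuffer, ByteBuffer> cell : cells) {
            out.put(utf8(cell.getKey()), utf8(cell.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<ByteBuffer, ByteBuffer>> cells = new ArrayList<>();
        cells.add(new SimpleEntry<>(
                ByteBuffer.wrap("test_key".getBytes(StandardCharsets.UTF_8)),
                ByteBuffer.wrap("test_value".getBytes(StandardCharsets.UTF_8))));
        System.out.println(rebuild(cells)); // {test_key=test_value}
    }
}
```

Under these assumptions, calling compose on a single cell value fails because that value is only one entry's payload, not the full serialized form a map deserializer expects.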

