From: Daniel Chia
Date: Mon, 8 Jun 2015 15:13:02 -0700
Subject: Re: Deserialize the collection type data from the SSTable file
To: user@cassandra.apache.org

I'm not sure why sstable2json doesn't work for collections, but if you're
into reading raw sstables, we use the following code with good success:

https://github.com/coursera/aegisthus/blob/77c73f6259f2a30d3d8ca64578be5c13ecc4e6f4/aegisthus-hadoop/src/main/java/org/coursera/mapreducer/CQLMapper.java#L85

Thanks,
Daniel

On Mon, Jun 8, 2015 at 1:22 PM, java8964 <java8964@hotmail.com> wrote:

> Hi, Cassandra users:
>
> I have a question about how to deserialize the new collection type data
> in Cassandra 2.x (the exact version is 2.0.10).
>
> I created the following example table in cqlsh:
>
> CREATE TABLE coupon (
>   account_id bigint,
>   campaign_id uuid,
>   ........................,
>   discount_info map<text, text>,
>   ........................,
>   PRIMARY KEY (account_id, campaign_id)
> )
>
> The other columns can be ignored in this case. Then I inserted one test
> row like this:
>
> insert into coupon (account_id, campaign_id, discount_info) values
> (111, uuid(), {'test_key':'test_value'});
>
> After this, I got the SSTable files. I used sstable2json to check the
> output:
>
> $ ./resources/cassandra/bin/sstable2json /xxx/test-coupon-jb-1-Data.db
> [
> {"key": "000000000000006f","columns":
> [["0336e50d-21aa-4b3a-9f01-989a8c540e54:","",1433792922055000],
> ["0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info","0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info:!",1433792922054999,"t",1433792922],
> ["0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info:746573745f6b6579","746573745f76616c7565",1433792922055000]]}
> ]
>
> What I want is to get {"test_key" : "test_value"} back as the key/value
> pair that I put into the "discount_info" column. I followed the
> sstable2json code and tried to deserialize the data myself, but to my
> surprise I cannot make it work. I tried several ways and kept getting
> exceptions.
>
> From what I researched, I know that Cassandra puts "campaign_id" +
> "discount_info" + "another ByteBuffer" together as a composite column
> name in this case. When I deserialize this column name, I get the
> following dumped out as a String:
>
> "0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info:746573745f6b6579".
>
> It includes 3 parts: the 1st part is the uuid for the campaign_id. The
> 2nd part is "discount_info", which is the static name I defined in the
> table. The 3rd part is a byte array of length 46, and I am not sure what
> it is.
>
> The corresponding value part of this composite column is another byte
> array of length 10, hex "746573745f76616c7565" if I dump it out.
>
> Now, here is what I did, and I am not sure why it doesn't work.
> First, I assumed the value part stores the real value I put in the Map,
> so I did the following:
>
> ByteBuffer value = ByteBufferUtil.clone(column.value());
> MapType<String, String> result = MapType.getInstance(UTF8Type.instance, UTF8Type.instance);
> Map<String, String> output = result.compose(value);
>
> // it gave me the following exception:
> // org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a map
>
> Then I thought that the real value must be stored as part of the column
> name (the 3rd part of 46 bytes), so I did this:
>
> MapType<String, String> result = MapType.getInstance(UTF8Type.instance, UTF8Type.instance);
> Map<String, String> output = result.compose(third_part.value);
>
> // I got the following exception:
>
> java.lang.IllegalArgumentException
>     at java.nio.Buffer.limit(Buffer.java:267)
>     at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
>     at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
>     at org.apache.cassandra.serializers.MapSerializer.deserialize(MapSerializer.java:63)
>     at org.apache.cassandra.serializers.MapSerializer.deserialize(MapSerializer.java:28)
>     at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:142)
>
> I can get the data of all other non-collection types, but I cannot get
> the data from the Map. My questions are:
>
> 1) How does Cassandra store collection data in the SSTable files? From
> the length of the bytes, it is most likely part of the composite column.
> If so, why did I get the exception above?
>
> 2) sstable2json doesn't deserialize the real data out of the collection
> type, so I don't have an example to follow. Am I composing the Map type
> data the wrong way?
>
> Thanks
>
> Yong
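
For reference, here is a minimal sketch in plain Java (no Cassandra classes) of how the sstable2json output quoted above appears to fit together. It assumes each map entry is written as its own cell, with the map key as the last component of the composite column name and the map value as the cell value, and that each composite component is laid out as <2-byte length><bytes><1-byte end-of-component>. The class and method names (MapCellSketch, hexToUtf8, buildComposite, splitComposite) are made up for illustration and are not part of Cassandra or aegisthus.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class MapCellSketch {

    // Convert a hex string (as printed by sstable2json) into raw bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Decode a hex string as UTF-8 text.
    static String hexToUtf8(String hex) {
        return new String(hexToBytes(hex), StandardCharsets.UTF_8);
    }

    // Serialize a UUID as its 16 raw bytes.
    static byte[] uuidBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // ASSUMED layout: <2-byte big-endian length><bytes><1-byte end-of-component>
    // repeated for every component of the composite name.
    static byte[] buildComposite(byte[]... components) {
        int size = 0;
        for (byte[] c : components) {
            size += 2 + c.length + 1;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] c : components) {
            buf.putShort((short) c.length);
            buf.put(c);
            buf.put((byte) 0); // end-of-component marker
        }
        return buf.array();
    }

    // Split a composite name back into its components under the same layout.
    static List<byte[]> splitComposite(byte[] name) {
        ByteBuffer buf = ByteBuffer.wrap(name);
        List<byte[]> parts = new ArrayList<>();
        while (buf.hasRemaining()) {
            int len = buf.getShort() & 0xFFFF;
            byte[] part = new byte[len];
            buf.get(part);
            buf.get(); // skip the end-of-component byte
            parts.add(part);
        }
        return parts;
    }

    public static void main(String[] args) {
        byte[] name = buildComposite(
                uuidBytes(UUID.fromString("0336e50d-21aa-4b3a-9f01-989a8c540e54")),
                "discount_info".getBytes(StandardCharsets.UTF_8),
                hexToBytes("746573745f6b6579"));
        List<byte[]> parts = splitComposite(name);

        // The map key is the third component; the map value is the cell value.
        String key = new String(parts.get(2), StandardCharsets.UTF_8);
        String value = hexToUtf8("746573745f76616c7565");

        System.out.println(name.length + " bytes in the composite name"); // 46
        System.out.println(key + " -> " + value); // test_key -> test_value
    }
}
```

Under that layout the whole composite name, not just its third component, comes to 46 bytes, which may be the length reported in the quoted message. The cell value is only the 10 bytes of "test_value", not a serialized map, which would be consistent with the "Not enough bytes to read a map" error. If the assumption holds, the key component and the cell value would each need to be composed with the element type (UTF8Type here) instead of MapType.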