From: aaron morton
Subject: Re: Problems with subcolumn retrieval after upgrade from 0.6 to 0.7
Date: Tue, 26 Apr 2011 17:20:53 +1200
To: user@cassandra.apache.org

Some followup from discussions on IRC...

Problem here was changing the sort order for sub columns. Cassandra expects that the data on disk is correctly ordered.

We put together this hack https://gist.github.com/941445 to force the sub columns to be re-ordered when read, and then ran nodetool scrub to re-write the data.

Last I heard the tests went OK.

Aaron

On 23 Apr 2011, at 05:14, Abraham Sanderson wrote:

> I did some more sleuth work and found out what's going on. The 0.6 data was serialized using the wrong compare method, and as a result, when importing data into 0.7, the presorted wrapper for the subcolumns would misbehave on some operations (like remove()). The supercolumn operation gets all the columns then filters down to the requested column list/range. The remove() fails because the underlying map for column names/IColumn does not start at the same place the comparator expects it to. The operation looks at the first element, and if the comparator returns -1 for compareTo(key, firstKey) it assumes the key is not found in the range.
>
> The reason this happened appears to be that there was a typo in my 0.6 keyspace definition...instead of CompareSubcolumnsWith (small c) I used the key CompareSubColumnsWith (big C). Doh! I'm assuming that there is no validation check on the keys in the storage conf file, and since the subcolumn comparator is optional, the file loaded just fine. As a result my subcolumn comparator was ignored and BytesType was used instead. So the big question is how the heck do I fix it? This would be similar to a case where I change the type of a column, say switch from BytesType to AsciiType in order to sort it in a different fashion.
>
> On Tue, Apr 19, 2011 at 4:38 PM, Abraham Sanderson wrote:
> Aaron,
>
> I'll try my best...I'm still trying to make heads or tails of the output as well. The first line is debugging output from me; just printing the values for key, supercolumn name, and the wrapper class I've built for the subcolumn. This was prior to 0.7.1, so the key is a String e.g. "80324d09-302b-4093-9708-e509091e5d8"; the supercolumn is a custom serializable object type, the first byte "0F" is for the type, the rest of the sequence "AC ED ... 44 45" is a byte array which is backing an ObjectOutputStream; the subcolumn is my own construct, the name in this case is the custom uuid type, represented by the type byte "10" and then the bytes of the UUID "78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05". I then did a print of the column parent and predicate being used for the get_slice command, just to be sure that everything matches.
The methods for that are part of cassandra code. Then I do a = print of the ColumnOrSuperColumn returned by the slice command. It = looks to me like not all the bytes are shown in some of the cases...I = looked up the thrift source code used by cassandra and it looks like = what is displayed is the ByteBuffer from position=3D0 up to the limit, = and truncates past the first 128 bytes. Hard to tell what is going on = with those because of that, but it does look like the buffer for the = name is actually stopping at the right place...the first column in the = top example ends in "10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63" = (uuid 495d0132-730d-4803-8509-caf1af6f6063), the next ends in "10 78 CF = D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05"(uuid = 78cfd525-a520-458e-8584-259415b88405). >=20 > As you asked, I put in some more debugging to illustrate the bytes = returned in the column name. Below is one of the columns that fails: >=20 >=20 > get_slice for key: 80324d09-302b-4093-9708-e509091e5d8f supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 64655f4445 subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"78cfd525-a520-458e-8584-259415b88405"] > colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 64 65 5F 44 45) > predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=3D0 = lim=3D17 cap=3D17]]) > col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 
0F 00 00 0C 00 00 00 05 0C 00 01 0B = 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C, = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 = 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 = 49 DF B6 A6 46 B7 CE 81 EA 85, timestamp:1301329228377)) > col.getName(): 1045d9bce5be02489db22563167be4b22c > col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B = 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B = 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A = 00 03 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 = 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA, value:80 01 00 02 00 00 00 = 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 = 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 = 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA = 85 0A 00 03 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 = 10 52 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 = 10..., timestamp:1301329222520)) > col.getName(): 10528fecb9ee944331aaafada9f733dada > col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B = 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B = 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A = 00 03 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 = 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 = 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 
1D 21 D7 = 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 00 00 01 2E FD 44 32 59 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 52 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 = DA DA 0B 00 02 00 00 00 11 10..., timestamp:1301329262669)) > col.getName(): 10aa47bbbf14f34fd7a99386533b48d274 > col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B = 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B = 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A = 00 03 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 = 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 = 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 = 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 00 00 01 2E FD 44 32 59 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 52 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 = DA DA 0B 00 02 00 00 00 11 10..., timestamp:1301329219744)) > col.getName(): 10c44030c100cc46f5851772d1cb37cf12 > col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B = 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B = 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A = 00 03 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 = 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 = 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 = 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 00 00 01 2E FD 44 32 59 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 52 8F EC B9 EE 94 43 31 AA 
AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., timestamp:1301327602293))
> col.getName(): 1078cfd525a520458e8584259415b88405
>
> The name bytes look good to me...the type byte ("10") and then the bytes for the UUID. I looked at the code for Column.getName(); there are some utility methods in the thrift sources which return the byte[] subsequence from the buffer's position to the buffer's limit. I admit that I am still learning about the internals of cassandra, but why would the returned ByteBuffer contain all this extra data? Shouldn't there be a slice() done somewhere, if for no other reason than to reduce the opportunity for buffer overflow/underflow? This sequence "67 65 74 5F 73 6C 69 63 65" == "get_slice" in ascii. Is the ByteBuffer simply wrapping the response from thrift, and leaving the hows and whens of extracting the pertinent bytes to the application code?
>
> Abe
>
> On Tue, Apr 19, 2011 at 3:00 PM, aaron morton wrote:
> Can you provide a little more info on what I'm seeing here. When name is shown for the column, are you showing me the entire byte buffer for the name or just up to limit?
>
> Aaron
>
>
> On 20 Apr 2011, at 05:49, Abraham Sanderson wrote:
>
>> Ok, set up a unit test for the supercolumns which seem to have problems; I posted a few examples below. As I mentioned, the retrieved bytes for the name and value appear to have additional data; in previous tests the buffer's position, mark, and limit have been verified, and when I call column.getName(), just the bytes for the name itself are properly retrieved (if not I should be getting validation errors for the custom uuid types, correct?).
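The extra bytes described above are what one would see if the returned ByteBuffer is only a window (position to limit) over a larger backing array, which is also what the "pos=95 lim=211 cap=244" values in the server log suggest. The following is a minimal sketch of that behaviour using only java.nio. The class name BufferWindow, the helper visibleBytes, and the sample frame string are hypothetical and exist only for illustration; whether Thrift builds its buffers exactly this way is an assumption, not something the thread confirms.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BufferWindow {
    /** Copies only the bytes between position and limit; the caller's buffer is not moved. */
    static byte[] visibleBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out); // duplicate() shares content but has its own position
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a response frame: a header, then a 4-byte "name".
        byte[] frame = "..get_slice..NAME..".getBytes(StandardCharsets.US_ASCII);

        // A window over 4 bytes starting at offset 13; nothing is copied.
        ByteBuffer name = ByteBuffer.wrap(frame, 13, 4);

        // array() exposes the whole frame, not just the window.
        System.out.println("backing array: " + new String(name.array(), StandardCharsets.US_ASCII));
        // Reading from position to limit yields only the name.
        System.out.println("visible bytes: " + new String(visibleBytes(name), StandardCharsets.US_ASCII));
        System.out.println("pos=" + name.position() + " lim=" + name.limit() + " cap=" + name.capacity());
    }
}
```

Under this reading, any code that prints or compares array() directly sees the whole frame, while code that respects position and limit sees only the column name.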
>>=20 >> Abe Sanderson >>=20 >> get_slice for key: 80324d09-302b-4093-9708-e509091e5d8f supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 64655f4445 subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"78cfd525-a520-458e-8584-259415b88405"] >> colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 64 65 5F 44 45) >> predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=3D0 = lim=3D17 cap=3D17]]) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 = 0B 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63, = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 04 0C 00 01 0B 00 01 00 00 00 11 10 49 5D 01 32 73 0D = 48 03 85 09 CA F1 AF 6F 60 63 0B 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 = 44 F9 96 AA FC EE 41 EC 40 7E, timestamp:1301327609539)) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 = 0B 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 = 0B 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E = 0A 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 = 78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05, value:80 01 00 02 00 00 = 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C = 00 01 0B 00 01 00 
00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F = 60 63 0B 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC = 40 7E 0A 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 = 11 10 78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05 0B 00 02 00 00 00 = 11 10..., timestamp:1301327602293)) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 = 0B 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 = 0B 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E = 0A 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 = 78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05 0B 00 02 00 00 00 11 = 10..., value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 = 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B 00 01 00 00 00 11 10 49 5D 01 32 = 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B 00 02 00 00 00 11 10 FC 0A 0D 43 = B1 E0 44 F9 96 AA FC EE 41 EC 40 7E 0A 00 03 00 00 01 2E FD 2B 7E C3 00 = 00 0C 00 01 0B 00 01 00 00 00 11 10 78 CF D5 25 A5 20 45 8E 85 84 25 94 = 15 B8 84 05 0B 00 02 00 00 00 11 10..., timestamp:1301327589704)) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 = 0B 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 = 0B 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E = 0A 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 = 78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05 0B 00 02 00 00 00 11 = 10..., value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 = 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B 00 01 00 00 00 11 10 49 5D 01 32 = 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B 00 02 00 00 00 11 10 FC 0A 0D 43 = B1 E0 44 F9 96 AA FC EE 41 EC 40 7E 0A 00 03 00 00 01 2E FD 2B 7E C3 00 = 00 0C 00 01 0B 00 01 00 00 00 11 10 78 CF D5 25 A5 20 45 8E 85 84 25 94 
= 15 B8 84 05 0B 00 02 00 00 00 11 10..., timestamp:1301327594118)) >>=20 >>=20 >> get_slice for key: d1c7f6b9-1425-4fab-b074-5574c54cae08 supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 64655f4445 subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"b2f33b97-69f4-45ec-ad87-dd14ee60d719"] >> colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 64 65 5F 44 45) >> predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=3D0 = lim=3D17 cap=3D17]]) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 04 0F 00 00 0C 00 00 00 02 0C 00 01 = 0B 00 01 00 00 00 11 10 7C 2F 5D 5B B3 70 42 E1 A6 A2 77 FC 72 14 40 FE, = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 04 0F = 00 00 0C 00 00 00 02 0C 00 01 0B 00 01 00 00 00 11 10 7C 2F 5D 5B B3 70 = 42 E1 A6 A2 77 FC 72 14 40 FE 0B 00 02 00 00 00 11 10 B4 64 74 19 F9 44 = 4E A3 A5 F9 06 32 67 DB 33 19, timestamp:1301324860465)) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 04 0F 00 00 0C 00 00 00 02 0C 00 01 = 0B 00 01 00 00 00 11 10 7C 2F 5D 5B B3 70 42 E1 A6 A2 77 FC 72 14 40 FE = 0B 00 02 00 00 00 11 10 B4 64 74 19 F9 44 4E A3 A5 F9 06 32 67 DB 33 19 = 0A 00 03 00 00 01 2E FD 01 8C 31 00 00 0C 00 01 0B 00 01 00 00 00 11 10 = B2 F3 3B 97 69 F4 45 EC AD 87 DD 14 EE 60 D7 19, value:80 01 00 02 00 00 = 00 09 67 65 74 5F 73 6C 69 63 65 00 00 
00 04 0F 00 00 0C 00 00 00 02 0C = 00 01 0B 00 01 00 00 00 11 10 7C 2F 5D 5B B3 70 42 E1 A6 A2 77 FC 72 14 = 40 FE 0B 00 02 00 00 00 11 10 B4 64 74 19 F9 44 4E A3 A5 F9 06 32 67 DB = 33 19 0A 00 03 00 00 01 2E FD 01 8C 31 00 00 0C 00 01 0B 00 01 00 00 00 = 11 10 B2 F3 3B 97 69 F4 45 EC AD 87 DD 14 EE 60 D7 19 0B 00 02 00 00 00 = 11 10..., timestamp:1301325719735)) >>=20 >>=20 >> get_slice for key: 18b4acd1-5491-44d3-aaa1-b725f51d1c3b supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 706c5f504c subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"3da78c49-a8aa-4fdb-8238-1ade458426b5"] >> colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 70 6C 5F 50 4C) >> predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=3D0 = lim=3D17 cap=3D17]]) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C 00 01 = 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 4A 80 B3 FF 5B A3 77 AF 2E BD, = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 03 0C 00 01 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 = 4A 80 B3 FF 5B A3 77 AF 2E BD 0B 00 02 00 00 00 11 10 62 58 73 23 CB 37 = 4F B5 BD DD BC F5 1E 7F E7 65, timestamp:1301000346861)) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C 00 01 = 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D 
C3 4A 80 B3 FF 5B A3 77 AF 2E BD = 0B 00 02 00 00 00 11 10 62 58 73 23 CB 37 4F B5 BD DD BC F5 1E 7F E7 65 = 0A 00 03 00 00 01 2E E9 A9 DC ED 00 00 0C 00 01 0B 00 01 00 00 00 11 10 = 3D A7 8C 49 A8 AA 4F DB 82 38 1A DE 45 84 26 B5, value:80 01 00 02 00 00 = 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C = 00 01 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 4A 80 B3 FF 5B A3 77 AF = 2E BD 0B 00 02 00 00 00 11 10 62 58 73 23 CB 37 4F B5 BD DD BC F5 1E 7F = E7 65 0A 00 03 00 00 01 2E E9 A9 DC ED 00 00 0C 00 01 0B 00 01 00 00 00 = 11 10 3D A7 8C 49 A8 AA 4F DB 82 38 1A DE 45 84 26 B5 0B 00 02 00 00 00 = 11 10..., timestamp:1301000346885)) >> col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 = 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C 00 01 = 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 4A 80 B3 FF 5B A3 77 AF 2E BD = 0B 00 02 00 00 00 11 10 62 58 73 23 CB 37 4F B5 BD DD BC F5 1E 7F E7 65 = 0A 00 03 00 00 01 2E E9 A9 DC ED 00 00 0C 00 01 0B 00 01 00 00 00 11 10 = 3D A7 8C 49 A8 AA 4F DB 82 38 1A DE 45 84 26 B5 0B 00 02 00 00 00 11 = 10..., value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 = 02 0F 00 00 0C 00 00 00 03 0C 00 01 0B 00 01 00 00 00 11 10 24 D4 2C 7F = 2D C3 4A 80 B3 FF 5B A3 77 AF 2E BD 0B 00 02 00 00 00 11 10 62 58 73 23 = CB 37 4F B5 BD DD BC F5 1E 7F E7 65 0A 00 03 00 00 01 2E E9 A9 DC ED 00 = 00 0C 00 01 0B 00 01 00 00 00 11 10 3D A7 8C 49 A8 AA 4F DB 82 38 1A DE = 45 84 26 B5 0B 00 02 00 00 00 11 10..., timestamp:1301000346836)) >>=20 >> On Mon, Apr 18, 2011 at 5:41 PM, aaron morton = wrote: >> Can you could provide an example of a get_slice request that failed = and the columns that were returned, so we can see the actual bytes for = the super column and column names. >>=20 >> Aaron >>=20 >>=20 >> On 19 Apr 2011, at 09:26, Abraham Sanderson wrote: >>=20 >>> I wish it were consistent enough that the answer were simple... 
>>> It varies between just the requested subcolumn to all subcolumns. It always does return the columns in order, and the requested column is always one of the columns returned. However, the slice start is not consistently in the same place (like n+1 or n-1). For example, if I have CF['key']['supercolumn' ['a','b','c','d','e']], and query for 'c', sometimes I get a slice with 'a', 'b', 'c', other times it's 'b', 'c', 'd', sometimes 'c', 'd'. When the column name is closer to the end of the range ('d' or 'e'), sometimes it's just a slice with the column. The sporadic behavior makes me think that it's a race condition, but the behavior linked to the column range makes me think I'm overrunning the buffer somewhere. I at first suspected that I was inadvertently making modifications to the buffers in application code during serialization/deserialization, so I did the tests in the cli. This limits it to just cassandra/thrift code and my custom types. Am I missing some other factor? While debugging I have noticed that the byte buffers contain more than they used to; it looks to me like tokens that contain parts of the thrift response. I'd see strings like "???get_slice???Foo??7c2f5d5b-b370-42e1-a6a2-77fc721440fe????" Is it possible that I am inadvertently using a reserved token or something on my supercolumn name and this is screwing with the slice command?
>>>
>>> Abe
>>>
>>> On Mon, Apr 18, 2011 at 2:55 PM, aaron morton wrote:
>>> When you run the get_slice which columns are returned?
>>>
>>>
>>> Aaron
>>>
>>> On 19 Apr 2011, at 04:12, Abraham Sanderson wrote:
>>>
>>>> Ok, I made the changes and tried again.
>>>> Here is the before modifying my method using a simple get, confirmed the same output in the cli:
>>>>
>>>> DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,910 CassandraServer.java (line 279) get
>>>> DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL
>>>> DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1
>>>> DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 345) reading data locally
>>>> DEBUG [ReadStage:4] 2011-04-18 09:37:23,911 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])
>>>> DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,912 StorageProxy.java (line 395) Read: 1 ms.
>>>> ERROR [pool-1-thread-2] 2011-04-18 09:37:23,912 Cassandra.java (line 2665) Internal error processing get
>>>> java.lang.AssertionError
>>>>         at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
>>>>         at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
>>>>         at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
>>>>         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>>         at java.lang.Thread.run(Thread.java:636)
>>>>
>>>> And here is the after...it succeeds here but still gives me multiple subcolumns in the response. Same behavior, it seems, I'm just sidestepping the original AssertionError:
>>>>
>>>> DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 CassandraServer.java (line 232) get_slice
>>>> DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL
>>>> DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1
>>>> DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 345) reading data locally
>>>> DEBUG [ReadStage:3] 2011-04-18 09:50:26,618 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])
>>>> DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,618 StorageProxy.java (line 395) Read: 0 ms.
>>>>
>>>> My comparators are relatively simple. Basically I have a schema that required heterogenous columns, but I needed to be able to deserialize them in unique ways. So there is always a type byte that precedes the bytes of the data. The supercolumn in this case is a general data type, which happens to represent a serializable object:
>>>>
>>>>     public void validate(ByteBuffer bytes)
>>>>         throws MarshalException
>>>>     {
>>>>         if(bytes.remaining() == 0)
>>>>             return;
>>>>
>>>>         validateDataType(bytes.get(bytes.position()));
>>>>         return;
>>>>     }
>>>>
>>>>     public int compare(ByteBuffer bytes1, ByteBuffer bytes2)
>>>>     {
>>>>         if (bytes1.remaining() == 0)
>>>>             return bytes2.remaining() == 0 ? 0 : -1;
>>>>         else if (bytes2.remaining() == 0)
>>>>             return 1;
>>>>         else
>>>>         {
>>>>             // compare type bytes
>>>>             byte T1 = bytes1.get(bytes1.position());
>>>>             byte T2 = bytes2.get(bytes2.position());
>>>>             if (T1 != T2)
>>>>                 return (T1 - T2);
>>>>
>>>>             // compare values
>>>>             return ByteBufferUtil.compareUnsigned(bytes1, bytes2);
>>>>         }
>>>>     }
>>>>
>>>> The subcolumn is similar...just a UUID with a type byte prefix:
>>>>
>>>>     public void validate(ByteBuffer bytes)
>>>>         throws MarshalException
>>>>     {
>>>>         if(bytes.remaining() == 0)
>>>>             return;
>>>>
>>>>         validateDataType(bytes.get(bytes.position()));
>>>>         if((bytes.remaining() - 1) == 0)
>>>>             return;
>>>>         else if((bytes.remaining() - 1) != 16)
>>>>             throw new MarshalException("UUID value must be exactly 16 bytes");
>>>>     }
>>>>
>>>>     public int compare(ByteBuffer bytes1, ByteBuffer bytes2)
>>>>     {
>>>>         if (bytes1.remaining() == 0)
>>>>             return bytes2.remaining() == 0 ? 0 : -1;
>>>>         else if (bytes2.remaining() == 0)
>>>>             return 1;
>>>>         else
>>>>         {
>>>>             // compare type bytes
>>>>             byte T1 = bytes1.get(bytes1.position());
>>>>             byte T2 = bytes2.get(bytes2.position());
>>>>             if (T1 != T2)
>>>>                 return (T1 - T2);
>>>>
>>>>             // compare values
>>>>             UUID U1 = getUUID(bytes1, bytes1.position()+1);
>>>>             UUID U2 = getUUID(bytes2, bytes2.position()+1);
>>>>             return U1.compareTo(U2);
>>>>         }
>>>>     }
>>>>
>>>>     static UUID getUUID(ByteBuffer bytes, int pos)
>>>>     {
>>>>         long msBits = bytes.getLong(pos);
>>>>         long lsBits = bytes.getLong(pos+8);
>>>>         return new UUID(msBits, lsBits);
>>>>     }
>>>>
>>>> All of my buffer reads are done by index, the position shouldn't be changing at all.
>>>>
>>>> Abe Sanderson
>>>>
>>>> On Sat, Apr 16, 2011 at 5:38 PM, aaron morton wrote:
>>>> Can you run the same request as a get_slice naming the column in the SlicePredicate and see what comes back?
>>>>
>>>> Can you reproduce the fault with logging set at DEBUG and send the logs?
>>>>
>>>> Also, what's the compare function like for your custom type?
>>>>
>>>> Cheers
>>>> Aaron
>>>>
>>>>
>>>> On 16 Apr 2011, at 07:34, Abraham Sanderson wrote:
>>>>
>>>> > I'm having some issues with a few of my ColumnFamilies after a cassandra upgrade/import from 0.6.1 to 0.7.4. I followed the instructions to upgrade and everything seemed to work OK...until I got into the application and noticed some weird behavior. I was getting the following stacktrace in cassandra occasionally when I did get operations for a single subcolumn for some of the Super type CFs:
>>>> >
>>>> > ERROR 12:56:05,669 Internal error processing get
>>>> > java.lang.AssertionError
>>>> >         at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
>>>> >         at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
>>>> >         at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
>>>> >         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
>>>> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>> >         at java.lang.Thread.run(Thread.java:636)
>>>> >
>>>> > The assertion that is failing is the check that only one column is retrieved by the get. I did some debugging with the cli and a remote debugger and found a few interesting patterns. First, the problem does not seem consistently reproducible. If one supercolumn is affected though, it will happen more frequently for subcolumns that, when sorted, appear at the beginning of the range. For columns near the end of the range, it seems to be more intermittent, and almost never occurs when I step through the code line by line.
>>>> > The only factor I can think of that might cause issues is that I am using custom data types for all supercolumns and columns. I originally thought I might be reading past the end of the ByteBuffer, but I have quadruple-checked that this is not the case.
>>>> >
>>>> > Abe Sanderson
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>

--Apple-Mail-5--929460600
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=us-ascii

Some followup from discussions on IRC...

The problem here was changing the sort order for sub columns. Cassandra expects that the data on disk is correctly ordered.

We put together this hack https://gist.github.com/941445 to force the sub columns to be re-ordered when read, and then ran nodetool scrub to re-write the data.

Last I heard the tests went OK.

Aaron


On 23 Apr 2011, at 05:14, Abraham Sanderson wrote:

I did some more sleuth work and found out what's going on.  The 0.6 data was serialized using the wrong compare method, and as a result, when importing data into 0.7, the presorted wrapper for the subcolumns would misbehave on some operations (like remove()).  The supercolumn operation gets all the columns, then filters down to the requested column list/range.  The remove() fails because the underlying map for column names/IColumn does not start at the same place the comparator expects it to.  The operation looks at the first element, and if the comparator returns -1 for compareTo(key, firstKey) it assumes the key is not found in the range.

The reason this happened appears to be that there was a typo in my 0.6 keyspace definition...instead of CompareSubcolumnsWith (small c) I used the key CompareSubColumnsWith (big C). Doh!  I'm assuming that there is no validation check on the keys in the storage conf file, and since the subcolumn comparator is optional, the file loaded just fine.  As a result my subcolumn comparator was ignored and BytesType was used instead.  So the big question is how the heck do I fix it?  This would be similar to a case where I change the type of a column, say switch from BytesType to AsciiType, in order to sort it in a different fashion.
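To make the failure mode concrete, here is a minimal sketch in plain Java of what can go wrong when names are stored under one ordering and looked up under another. It is not Cassandra code: `SortOrderMismatchDemo`, `BYTE_ORDER`, `UUID_ORDER` and `storedNames` are hypothetical names, the unsigned byte-wise comparator is only a stand-in for how BytesType would have ordered the names, and the UUIDs are subcolumn names taken from the dumps in this thread.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Hypothetical demo class; not part of Cassandra.
final class SortOrderMismatchDemo
{
    // Unsigned byte-wise order: a stand-in for the ordering the data was written with.
    static final Comparator<UUID> BYTE_ORDER = (a, b) -> {
        byte[] x = toBytes(a), y = toBytes(b);
        for (int i = 0; i < x.length; i++)
        {
            int c = Integer.compare(x[i] & 0xFF, y[i] & 0xFF);
            if (c != 0)
                return c;
        }
        return 0;
    };

    // java.util.UUID natural order: compares the two 64-bit halves as signed longs.
    static final Comparator<UUID> UUID_ORDER = Comparator.naturalOrder();

    static byte[] toBytes(UUID u)
    {
        return ByteBuffer.allocate(16)
                         .putLong(u.getMostSignificantBits())
                         .putLong(u.getLeastSignificantBits())
                         .array();
    }

    // Names as they would sit "on disk": sorted in byte order.
    static List<UUID> storedNames()
    {
        List<UUID> names = new ArrayList<>();
        names.add(UUID.fromString("b2f33b97-69f4-45ec-ad87-dd14ee60d719"));
        names.add(UUID.fromString("495d0132-730d-4803-8509-caf1af6f6063"));
        names.add(UUID.fromString("78cfd525-a520-458e-8584-259415b88405"));
        names.sort(BYTE_ORDER);
        return names;
    }

    public static void main(String[] args)
    {
        List<UUID> names = storedNames();
        UUID wanted = UUID.fromString("b2f33b97-69f4-45ec-ad87-dd14ee60d719");
        // Same comparator as the one used for writing: the name is found.
        System.out.println(Collections.binarySearch(names, wanted, BYTE_ORDER));
        // Different comparator: the search walks the wrong way and reports a miss.
        System.out.println(Collections.binarySearch(names, wanted, UUID_ORDER));
    }
}
```

The second search returns a negative index even though the name is present, because a name whose first byte is 0x80 or above sorts last in unsigned byte order but first under `UUID.compareTo`.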

On Tue, Apr 19, 2011 at 4:38 PM, Abraham Sanderson <asanderson@lingotek.com> wrote:
Aaron,

I'll try my best...I'm still trying to make heads or tails of the output as well.  The first line is debugging output from me; it just prints the values for the key, the supercolumn name, and the wrapper class I've built for the subcolumn.  This was prior to 0.7.1, so the key is a String, e.g. "80324d09-302b-4093-9708-e509091e5d8"; the supercolumn is a custom serializable object type, where the first byte "0F" is the type and the rest of the sequence "AC ED ... 44 45" is the byte array backing an ObjectOutputStream; the subcolumn is my own construct, and the name in this case is the custom uuid type, represented by the type byte "10" and then the bytes of the UUID "78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05".  I then print the column parent and predicate being used for the get_slice command, just to be sure that everything matches.  The methods for that are part of the cassandra code.  Then I print the ColumnOrSuperColumn returned by the slice command.  It looks to me like not all the bytes are shown in some of the cases...I looked up the thrift source code used by cassandra, and it looks like what is displayed is the ByteBuffer from position=0 up to the limit, truncated past the first 128 bytes.  It's hard to tell what is going on with those because of that, but it does look like the buffer for the name is actually stopping at the right place...the first column in the top example ends in "10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63" (uuid 495d0132-730d-4803-8509-caf1af6f6063), the next ends in "10 78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05" (uuid 78cfd525-a520-458e-8584-259415b88405).
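For anyone reading the dumps, the 17-byte name layout described above (one type byte, then the 16 UUID bytes) can be decoded with a few lines of Java. This is only a sketch for following the thread: `TypedUuidName`, `parseHex` and `uuidOf` are hypothetical helper names, not part of the application or of cassandra, and the sketch assumes the UUID is stored as two big-endian longs, as in the getUUID() method quoted later in this message.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper for reading the hex dumps in this thread.
final class TypedUuidName
{
    // Parses space-separated hex such as "10 78 CF ..." into a byte array.
    static byte[] parseHex(String hex)
    {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++)
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        return out;
    }

    // Reads the UUID that follows the one-byte type prefix, using absolute gets only.
    static UUID uuidOf(ByteBuffer name)
    {
        int pos = name.position() + 1;
        return new UUID(name.getLong(pos), name.getLong(pos + 8));
    }

    public static void main(String[] args)
    {
        ByteBuffer name = ByteBuffer.wrap(
            parseHex("10 78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05"));
        // prints: type=0x10 uuid=78cfd525-a520-458e-8584-259415b88405
        System.out.printf("type=0x%02X uuid=%s%n", name.get(name.position()), uuidOf(name));
    }
}
```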

As you asked, I put in some more debugging to illustrate the bytes returned in the column name.  Below is one of the columns that fails:


get_slice for key: = 80324d09-302b-4093-9708-e509091e5d8f supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 64655f4445 subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"78cfd525-a520-458e-8584-259415b88405"]
colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 64 65 5F 44 = 45)
predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=0 lim=17 cap=17]])
col: = ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 65 74 = 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 = 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C, value:80 = 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C = 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 = 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 = A6 46 B7 CE 81 EA 85, timestamp:1301329228377))
col.getName(): 1045d9bce5be02489db22563167be4b22c
col: = ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 65 74 = 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 = 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 = 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 = 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 8F EC = B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA, value:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B = 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B = 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A = 00 03 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 = 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., = timestamp:1301329222520))
col.getName(): 10528fecb9ee944331aaafada9f733dada
col: = ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 65 74 = 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 = 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 = 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 = 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 8F EC = B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 = 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 = 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 00 00 01 2E FD 44 32 59 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 52 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 = DA DA 0B 00 02 00 00 00 11 10..., timestamp:1301329262669))
col.getName(): 10aa47bbbf14f34fd7a99386533b48d274
col: = ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 65 74 = 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 = 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 = 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 = 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 8F EC = B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 = 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 = 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 00 00 01 2E FD 44 32 59 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 52 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 = DA DA 0B 00 02 00 00 00 11 10..., timestamp:1301329219744))
col.getName(): 10c44030c100cc46f5851772d1cb37cf12
col: = ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 65 74 = 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 = 00 00 00 11 10 45 D9 BC E5 BE 02 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 = 00 00 00 11 10 29 F8 DC 1D 21 D7 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 = 00 00 01 2E FD 44 32 59 00 00 0C 00 01 0B 00 01 00 00 00 11 10 52 8F EC = B9 EE 94 43 31 AA AF AD A9 F7 33 DA DA 0B 00 02 00 00 00 11 10..., = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 05 0C 00 01 0B 00 01 00 00 00 11 10 45 D9 BC E5 BE 02 = 48 9D B2 25 63 16 7B E4 B2 2C 0B 00 02 00 00 00 11 10 29 F8 DC 1D 21 D7 = 49 DF B6 A6 46 B7 CE 81 EA 85 0A 00 03 00 00 01 2E FD 44 32 59 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 52 8F EC B9 EE 94 43 31 AA AF AD A9 F7 33 = DA DA 0B 00 02 00 00 00 11 10..., timestamp:1301327602293))
col.getName(): 1078cfd525a520458e8584259415b88405

The name bytes look good to me...the type byte ("10") and then the bytes for the UUID.  I looked at the code for Column.getName(); there are some utility methods in the thrift sources which return the byte[] subsequence from the buffer's position to the buffer's limit.  I admit that I am still learning about the internals of cassandra, but why would the returned ByteBuffer contain all this extra data?  Shouldn't there be a slice() done somewhere, if for no other reason than to reduce the opportunity for buffer overflow/underflow?  The sequence "67 65 74 5F 73 6C 69 63 65" == "get_slice" in ascii.  Is the ByteBuffer simply wrapping the response from thrift, and leaving the hows and whens of extracting the pertinent bytes to the application code?
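The question above can be illustrated with plain java.nio, independent of thrift. The sketch below assumes (it is an assumption, not something confirmed in this thread) that the returned buffers are views onto a larger shared array; `SharedBufferDemo`, `remainingBytes` and the "get_sliceNAME" frame are made up for illustration. It shows that such a view exposes the whole backing array through array(), while only the bytes between position and limit belong to the field.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the frame contents are invented for illustration.
final class SharedBufferDemo
{
    // Copies only the bytes between position and limit.
    // duplicate() shares the content but has its own position, so buf is not moved.
    static byte[] remainingBytes(ByteBuffer buf)
    {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args)
    {
        // Stand-in for a received message: a method name followed by a 4-byte field.
        byte[] frame = "get_sliceNAME".getBytes(StandardCharsets.US_ASCII);

        // A view onto bytes 9..12 of the frame; no copy is made.
        ByteBuffer name = ByteBuffer.wrap(frame, 9, 4);

        // The backing array is still the whole frame (13 bytes).
        System.out.println(name.array().length);
        // Only the 4 bytes between position and limit are the field itself.
        System.out.println(new String(remainingBytes(name), StandardCharsets.US_ASCII));
    }
}
```

Under that assumption, code that prints or compares `array()` directly sees the surrounding message, whereas code that respects position and limit sees only the name.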

Abe

 
On Tue, Apr 19, 2011 at 3:00 PM, aaron morton <aaron@thelastpickle.com> wrote:
Can you provide a little more info on what I'm seeing here? When name is shown for the column, are you showing me the entire byte buffer for the name or just up to limit?

Aaron


On 20 Apr 2011, at 05:49, Abraham Sanderson wrote:

Ok, I set up a unit test for the supercolumns which seem to have problems, and I posted a few examples below.  As I mentioned, the retrieved bytes for the name and value appear to have additional data; in previous tests the buffer's position, mark, and limit have been verified, and when I call column.getName(), just the bytes for the name itself are properly retrieved (if not, I should be getting validation errors for the custom uuid types, correct?).

Abe Sanderson

get_slice for key: = 80324d09-302b-4093-9708-e509091e5d8f supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 64655f4445 subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"78cfd525-a520-458e-8584-259415b88405"]
colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 64 65 5F 44 = 45)
predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=0 lim=17 cap=17]])
col: ColumnOrSuperColumn(column:Column(name:80 = 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C = 00 00 00 04 0C 00 01 0B 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 = 09 CA F1 AF 6F 60 63, value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 = 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B 00 01 00 00 00 11 = 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B 00 02 00 00 00 11 = 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E, = timestamp:1301327609539))
col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B = 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B = 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E 0A = 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 78 = CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05, = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 04 0C 00 01 0B 00 01 00 00 00 11 10 49 5D 01 32 73 0D = 48 03 85 09 CA F1 AF 6F 60 63 0B 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 = 44 F9 96 AA FC EE 41 EC 40 7E 0A 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 78 CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05 0B 00 02 00 00 00 11 = 10..., timestamp:1301327602293))
col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B = 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B = 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E 0A = 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 78 = CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05 0B = 00 02 00 00 00 11 10..., value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C = 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B 00 01 00 00 00 = 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B 00 02 00 00 00 = 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E 0A 00 03 00 00 01 = 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 78 CF D5 25 A5 20 = 45 8E 85 84 25 94 15 B8 84 05 0B 00 02 00 00 00 11 = 10..., timestamp:1301327589704))
col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B = 00 01 00 00 00 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B = 00 02 00 00 00 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E 0A = 00 03 00 00 01 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 78 = CF D5 25 A5 20 45 8E 85 84 25 94 15 B8 84 05 0B = 00 02 00 00 00 11 10..., value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C = 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 04 0C 00 01 0B 00 01 00 00 00 = 11 10 49 5D 01 32 73 0D 48 03 85 09 CA F1 AF 6F 60 63 0B 00 02 00 00 00 = 11 10 FC 0A 0D 43 B1 E0 44 F9 96 AA FC EE 41 EC 40 7E 0A 00 03 00 00 01 = 2E FD 2B 7E C3 00 00 0C 00 01 0B 00 01 00 00 00 11 10 78 CF D5 25 A5 20 = 45 8E 85 84 25 94 15 B8 84 05 0B 00 02 00 00 00 11 = 10..., timestamp:1301327594118))


get_slice for key: d1c7f6b9-1425-4fab-b074-5574c54cae08 = supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 64655f4445 subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"b2f33b97-69f4-45ec-ad87-dd14ee60d719"]
colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 64 65 5F 44 = 45)
predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=0 lim=17 cap=17]])
col: ColumnOrSuperColumn(column:Column(name:80 = 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 04 0F 00 00 0C = 00 00 00 02 0C 00 01 0B 00 01 00 00 00 11 10 7C 2F 5D 5B B3 70 42 E1 A6 = A2 77 FC 72 14 40 FE, value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 = 63 65 00 00 00 04 0F 00 00 0C 00 00 00 02 0C 00 01 0B 00 01 00 00 00 11 = 10 7C 2F 5D 5B B3 70 42 E1 A6 A2 77 FC 72 14 40 FE 0B 00 02 00 00 00 11 = 10 B4 64 74 19 F9 44 4E A3 A5 F9 06 32 67 DB 33 19, = timestamp:1301324860465))
col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 04 0F 00 00 0C 00 00 00 02 0C 00 01 0B = 00 01 00 00 00 11 10 7C 2F 5D 5B B3 70 42 E1 A6 A2 77 FC 72 14 40 FE 0B = 00 02 00 00 00 11 10 B4 64 74 19 F9 44 4E A3 A5 F9 06 32 67 DB 33 19 0A = 00 03 00 00 01 2E FD 01 8C 31 00 00 0C 00 01 0B 00 01 00 = 00 00 11 10 B2 F3 3B 97 69 F4 45 EC AD 87 DD 14 EE 60 D7 19, value:80 01 = 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 04 0F 00 00 0C 00 = 00 00 02 0C 00 01 0B 00 01 00 00 00 11 10 7C 2F 5D 5B B3 70 42 E1 A6 A2 = 77 FC 72 14 40 FE 0B 00 02 00 00 00 11 10 B4 64 74 19 F9 44 4E A3 A5 F9 = 06 32 67 DB 33 19 0A 00 03 00 00 01 2E FD 01 8C 31 = 00 00 0C 00 01 0B 00 01 00 00 00 11 10 B2 F3 3B 97 69 F4 45 EC AD 87 = DD 14 EE 60 D7 19 0B 00 02 00 00 00 11 10..., = timestamp:1301325719735))


get_slice for key: 18b4acd1-5491-44d3-aaa1-b725f51d1c3b = supercolumn: = 0faced00057372002a6c696e676f74656b2e646f6373746f72652e43617373616e64726144= 6f63756d656e74245461726765749d0b9f071f4cb0410200024900076d5f70686173654c00= 066d5f6c616e677400124c6a6176612f6c616e672f537472696e673b787000000001740005= 706c5f504c subcolumn: [ cf=3D"TranslationsByTarget" = name=3D"3da78c49-a8aa-4fdb-8238-1ade458426b5"]
colParent:ColumnParent(column_family:TranslationsByTarget, = super_column:0F AC ED 00 05 73 72 00 2A 6C 69 6E 67 6F 74 65 6B 2E 64 6F = 63 73 74 6F 72 65 2E 43 61 73 73 61 6E 64 72 61 44 6F 63 75 6D 65 6E 74 = 24 54 61 72 67 65 74 9D 0B 9F 07 1F 4C B0 41 02 00 02 49 00 07 6D 5F 70 = 68 61 73 65 4C 00 06 6D 5F 6C 61 6E 67 74 00 12 4C 6A 61 76 61 2F 6C 61 = 6E 67 2F 53 74 72 69 6E 67 3B 78 70 00 00 00 01 74 00 05 70 6C 5F 50 = 4C)
predicate:SlicePredicate(column_names:[java.nio.HeapByteBuffer[pos=0 lim=17 cap=17]])
col: ColumnOrSuperColumn(column:Column(name:80 = 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C = 00 00 00 03 0C 00 01 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 4A 80 B3 = FF 5B A3 77 AF 2E BD, value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 = 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C 00 01 0B 00 01 00 00 00 11 = 10 24 D4 2C 7F 2D C3 4A 80 B3 FF 5B A3 77 AF 2E BD 0B 00 02 00 00 00 11 = 10 62 58 73 23 CB 37 4F B5 BD DD BC F5 1E 7F E7 65, = timestamp:1301000346861))
col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C 00 01 0B = 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 4A 80 B3 FF 5B A3 77 AF 2E BD 0B = 00 02 00 00 00 11 10 62 58 73 23 CB 37 4F B5 BD DD BC F5 1E 7F E7 65 0A = 00 03 00 00 01 2E E9 A9 DC ED 00 00 0C 00 01 0B 00 01 00 00 00 11 10 3D = A7 8C 49 A8 AA 4F DB 82 38 1A DE 45 84 26 B5, value:80 01 00 02 00 00 00 = 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C 00 = 01 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 4A 80 B3 FF 5B A3 77 AF 2E = BD 0B 00 02 00 00 00 11 10 62 58 73 23 CB 37 4F B5 BD DD BC F5 1E 7F E7 = 65 0A 00 03 00 00 01 2E E9 A9 DC ED 00 00 0C 00 01 0B 00 01 00 00 00 11 = 10 3D A7 8C 49 A8 AA 4F DB 82 38 1A DE 45 84 26 B5 0B 00 02 00 00 00 11 = 10..., timestamp:1301000346885))
col: ColumnOrSuperColumn(column:Column(name:80 01 00 02 00 00 00 09 67 = 65 74 5F 73 6C 69 63 65 00 00 00 02 0F 00 00 0C 00 00 00 03 0C 00 01 0B = 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 4A 80 B3 FF 5B A3 77 AF 2E BD 0B = 00 02 00 00 00 11 10 62 58 73 23 CB 37 4F B5 BD DD BC F5 1E 7F E7 65 0A = 00 03 00 00 01 2E E9 A9 DC ED 00 00 0C 00 01 0B 00 01 00 00 00 11 10 3D = A7 8C 49 A8 AA 4F DB 82 38 1A DE 45 84 26 B5 0B 00 02 00 00 00 11 10..., = value:80 01 00 02 00 00 00 09 67 65 74 5F 73 6C 69 63 65 00 00 00 02 0F = 00 00 0C 00 00 00 03 0C 00 01 0B 00 01 00 00 00 11 10 24 D4 2C 7F 2D C3 = 4A 80 B3 FF 5B A3 77 AF 2E BD 0B 00 02 00 00 00 11 10 62 58 73 23 CB 37 = 4F B5 BD DD BC F5 1E 7F E7 65 0A 00 03 00 00 01 2E E9 A9 DC ED 00 00 0C = 00 01 0B 00 01 00 00 00 11 10 3D A7 8C 49 A8 AA 4F DB 82 38 1A DE 45 84 = 26 B5 0B 00 02 00 00 00 11 10..., timestamp:1301000346836))

On Mon, Apr 18, 2011 at 5:41 PM, aaron morton <aaron@thelastpickle.com> wrote:
Could you provide an example of a get_slice request that failed and the columns that were returned, so we can see the actual bytes for the super column and column names?

Aaron


On 19 Apr 2011, at 09:26, Abraham Sanderson wrote:

I wish it were consistent enough that the answer were simple...  It varies between just the requested subcolumn and all subcolumns.  It always returns the columns in order, and the requested column is always one of the columns returned.  However, the slice start is not consistently in the same place (like n+1 or n-1).  For example, if I have CF['key']['supercolumn' ['a','b','c','d','e']], and query for 'c', sometimes I get a slice with 'a', 'b', 'c', other times it's 'b', 'c', 'd', sometimes 'c', 'd'.  When the column name is closer to the end of the range ('d' or 'e'), sometimes it's just a slice with the column.  The sporadic behavior makes me think that it's a race condition, but the behavior linked to the column range makes me think I'm overrunning the buffer somewhere.  I at first suspected that I was inadvertently making modifications to the buffers in application code during serialization/deserialization, so I did the tests in the cli.  This limits it to just cassandra/thrift code and my custom types.  Am I missing some other factor?  While debugging I have noticed that the byte buffers contain more than they used to; it looks to me like tokens that contain parts of the thrift response.  I'd see strings like "???get_slice???Foo??7c2f5d5b-b370-42e1-a6a2-77fc721440fe????"  Is it possible that I am inadvertently using a reserved token or something in my supercolumn name and this is screwing with the slice command?

Abe

On Mon, Apr 18, 2011 at 2:55 PM, aaron morton <aaron@thelastpickle.com> wrote:
When you run the get_slice, which columns are returned?


Aaron

On 19 Apr 2011, at 04:12, Abraham Sanderson wrote:

Ok, I made the changes and tried again.  Here is the "before" output, prior to modifying my method, using a simple get; I confirmed the same output in the cli:

DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,910 CassandraServer.java (line 279) get
DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL
DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1
DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 345) reading data locally
DEBUG [ReadStage:4] 2011-04-18 09:37:23,911 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])
DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,912 StorageProxy.java (line 395) Read: 1 ms.
ERROR [pool-1-thread-2] 2011-04-18 09:37:23,912 Cassandra.java (line 2665) Internal error processing get
java.lang.AssertionError
        at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
        at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

And here is the "after"...it succeeds here but still gives me multiple subcolumns in the response.  Same behavior, it seems; I'm just sidestepping the original AssertionError:

DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 CassandraServer.java (line 232) get_slice
DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL
DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1
DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 345) reading data locally
DEBUG [ReadStage:3] 2011-04-18 09:50:26,618 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])
DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,618 StorageProxy.java (line 395) Read: 0 ms.

My comparators are relatively simple.  Basically I have a schema that requires heterogeneous columns, but I need to be able to deserialize them in unique ways.  So there is always a type byte that precedes the bytes of the data.  The supercolumn in this case is a general data type, which happens to represent a serializable object:

  public void validate(ByteBuffer bytes)
    throws MarshalException
  {
    if(bytes.remaining() == 0)
      return;

    validateDataType(bytes.get(bytes.position()));
    return;
  }

  public int compare(ByteBuffer bytes1, ByteBuffer bytes2)
  {
    if (bytes1.remaining() == 0)
      return bytes2.remaining() == 0 ? 0 : -1;
    else if (bytes2.remaining() == 0)
      return 1;
    else
    {
      // compare type bytes
      byte T1 = bytes1.get(bytes1.position());
      byte T2 = bytes2.get(bytes2.position());
      if (T1 != T2)
        return (T1 - T2);

      // compare values
      return ByteBufferUtil.compareUnsigned(bytes1, bytes2);
    }
  }

The subcolumn is similar...just a UUID with a type byte prefix:

  public void validate(ByteBuffer bytes)
    throws MarshalException
  {
    if(bytes.remaining() == 0)
      return;

    validateDataType(bytes.get(bytes.position()));
    if((bytes.remaining() - 1) == 0)
      return;
    else if((bytes.remaining() - 1) != 16)
      throw new MarshalException("UUID value must be exactly 16 bytes");
  }

  public int compare(ByteBuffer bytes1, ByteBuffer bytes2)
  {
    if (bytes1.remaining() == 0)
      return bytes2.remaining() == 0 ? 0 : -1;
    else if (bytes2.remaining() == 0)
      return 1;
    else
    {
      // compare type bytes
      byte T1 = bytes1.get(bytes1.position());
      byte T2 = bytes2.get(bytes2.position());
      if (T1 != T2)
        return (T1 - T2);

      // compare values
      UUID U1 = getUUID(bytes1, bytes1.position()+1);
      UUID U2 = getUUID(bytes2, bytes2.position()+1);
      return U1.compareTo(U2);
    }
  }

  static UUID getUUID(ByteBuffer bytes, int pos)
  {
    long msBits = bytes.getLong(pos);
    long lsBits = bytes.getLong(pos+8);
    return new UUID(msBits, lsBits);
  }

All of my buffer reads are done by index, so the position shouldn't be changing at all.
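The claim about index-based reads is easy to check in isolation. The sketch below uses only java.nio; `AbsoluteReadDemo` and its two methods are hypothetical names chosen for this illustration. It contrasts the absolute accessors used in the comparators above, which leave the position untouched, with the relative accessors, which advance it.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class contrasting absolute and relative ByteBuffer reads.
final class AbsoluteReadDemo
{
    // Absolute reads take an explicit index and never move the position.
    static int positionAfterAbsoluteReads(ByteBuffer buf)
    {
        buf.get(buf.position());          // type byte
        buf.getLong(buf.position() + 1);  // first half of the UUID
        return buf.position();
    }

    // Relative reads consume bytes and advance the position.
    static int positionAfterRelativeReads(ByteBuffer buf)
    {
        buf.get();      // advances by 1
        buf.getLong();  // advances by 8
        return buf.position();
    }

    public static void main(String[] args)
    {
        // 17 bytes: one type byte plus a 16-byte UUID, all zero here.
        System.out.println(positionAfterAbsoluteReads(ByteBuffer.allocate(17))); // 0
        System.out.println(positionAfterRelativeReads(ByteBuffer.allocate(17))); // 9
    }
}
```

So a comparator written only with absolute gets, like the ones quoted above, should not disturb a buffer that other code is also reading.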

Abe Sanderson

On Sat, Apr 16, 2011 at 5:38 PM, aaron morton <aaron@thelastpickle.com> wrote:
Can you run the same request as a get_slice naming the column in the SlicePredicate and see what comes back?

Can you reproduce the fault with logging set at DEBUG and send the logs?

Also, what's the compare function like for your custom type?

Cheers
Aaron


On 16 Apr 2011, at 07:34, Abraham Sanderson wrote:

> I'm having some issues with a few of my ColumnFamilies after a cassandra upgrade/import from 0.6.1 to 0.7.4.  I followed the instructions to upgrade and everything seemed to work OK...until I got into the application and noticed some weird behavior.  I was occasionally getting the following stacktrace in cassandra when I did get operations for a single subcolumn for some of the Super type CFs:
>
> ERROR 12:56:05,669 Internal error processing get
> java.lang.AssertionError
>         at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
>         at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
>         at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
>         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:636)
>
> The assertion that is failing is the check that only one column is retrieved by the get.  I did some debugging with the cli and a remote debugger and found a few interesting patterns.  First, the problem does not seem to be consistently reproducible.  If one supercolumn is affected, though, it will happen more frequently for subcolumns that, when sorted, appear at the beginning of the range.  For columns near the end of the range, it seems to be more intermittent, and almost never occurs when I step through the code line by line.  The only factor I can think of that might cause issues is that I am using custom data types for all supercolumns and columns.  I originally thought I might be reading past the end of the ByteBuffer, but I have quadruple-checked that this is not the case.
>
> Abe Sanderson










--Apple-Mail-5--929460600--