incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: binary data in key names?
Date Sun, 28 Feb 2010 02:02:29 GMT
Keys are strings.  That means they have to be UTF-8 encoded, although
the Thrift bindings for many languages (including Python) don't help
you with this.
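
One possible workaround, sketched below, is to map the binary name to a
printable ASCII string (which is also valid UTF-8) before using it as a
key, and to reverse the mapping when reading keys back.  The helper
names encode_key and decode_key are hypothetical and not part of
pycassa; the sketch is written for Python 3 and uses only the standard
library.

```python
import binascii


def encode_key(wire_name: bytes) -> str:
    """Hex-encode arbitrary bytes into a printable ASCII key (hypothetical helper)."""
    return binascii.hexlify(wire_name).decode("ascii")


def decode_key(key: str) -> bytes:
    """Reverse encode_key, recovering the original bytes."""
    return binascii.unhexlify(key)


# Wire-format DNS name for www.example.com. taken from the test case.
wire = b"\x03www\x07example\x03com\x00"
key = encode_key(wire)
print(key)
```

The resulting string could then be passed to cf.insert() and cf.get()
in place of the raw bytes.  Lowercase hex doubles the key length but
keeps keys in the same relative order as the underlying bytes; base64
would be more compact but does not preserve that ordering.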

On Sat, Feb 27, 2010 at 7:43 PM, Robert Edmonds <edmonds@debian.org> wrote:
> hi,
>
> i'm using cassandra 0.5.0 and pycassa 0.1.
>
> i'd like to store binary data (specifically, wire-format DNS names) in
> key names.  is this possible with cassandra?
>
> i have a keyspace set up like this:
>
>    <Keyspace Name="DNS">
>        <KeysCachedFraction>0.01</KeysCachedFraction>
>        <ColumnFamily CompareWith="BytesType"
>                      CompareSubcolumnsWith="BytesType"
>                      ColumnType="Super"
>                      Name="RRsets"/>
>    </Keyspace>
>
> when i insert a key with a printable ASCII name, everything works fine,
> but when i insert a key with a name containing binary data the insertion
> appears to succeed, but a subsequent get for the same key name fails.
>
> here's a test case:
>
>    import pycassa
>    conn = pycassa.connect()
>    cf = pycassa.ColumnFamily(conn, 'DNS', 'RRsets', super=True)
>
>    cf.insert('\x03www\x07example\x03com\x00', {'foo': {'bar' : 'baz'}})
>    cf.get('\x03www\x07example\x03com\x00') # <-- this fails
>
>    cf.insert('www.example.com.', {'foo': {'bar' : 'baz'}})
>    cf.get('www.example.com.') # <-- this succeeds
>
> here's the same code being executed in ipython, with each python line
> interleaved with the hex dump of the TCP conversation.  (the client request hex
> dump is flush, the server response hex dump is indented.)
>
> (i don't know how the cassandra thrift protocol works, but the exact same
> response was sent on the wire for the successful and the unsuccessful
> insertions, so i would assume that either the insertion succeeded or the
> protocol doesn't indicate insertion success?)
>
> In [1]: import pycassa
>
> In [2]: conn = pycassa.connect()
>
> In [3]: cf = pycassa.ColumnFamily(conn, 'DNS', 'RRsets', super=True)
>
> ----------------------------------------------------------------------------
>
> In [4]: cf.insert('\x03www\x07example\x03com\x00', {'foo': {'bar' : 'baz'}})
> Out[4]: 1267319528
>
> 00000000  80 01 00 01 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
> 00000010  73 65 72 74 00 00 00 00  0b 00 01 00 00 00 03 44 sert.... .......D
> 00000020  4e 53 0b 00 02 00 00 00  11 03 77 77 77 07 65 78 NS...... ..www.ex
> 00000030  61 6d 70 6c 65 03 63 6f  6d 00 0d 00 03 0b 0f 00 ample.co m.......
> 00000040  00 00 01 00 00 00 06 52  52 73 65 74 73 0c 00 00 .......R Rsets...
> 00000050  00 01 0c 00 02 0b 00 01  00 00 00 03 66 6f 6f 0f ........ ....foo.
> 00000060  00 02 0c 00 00 00 01 0b  00 01 00 00 00 03 62 61 ........ ......ba
> 00000070  72 0b 00 02 00 00 00 03  62 61 7a 0a 00 03 00 00 r....... baz.....
> 00000080  00 00 4b 89 c2 e8 00 00  00 08 00 04 00 00 00 00 ..K..... ........
> 00000090  00                                               .
>
>    00000000  80 01 00 02 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
>    00000010  73 65 72 74 00 00 00 00  00                      sert.... .
>
> ----------------------------------------------------------------------------
>
> In [5]: cf.get('\x03www\x07example\x03com\x00')
> NotFoundException: NotFoundException()
>
> 00000091  80 01 00 01 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
> 000000A1  65 00 00 00 00 0b 00 01  00 00 00 03 44 4e 53 0b e....... ....DNS.
> 000000B1  00 02 00 00 00 11 03 77  77 77 07 65 78 61 6d 70 .......w ww.examp
> 000000C1  6c 65 03 63 6f 6d 00 0c  00 03 0b 00 03 00 00 00 le.com.. ........
> 000000D1  06 52 52 73 65 74 73 00  0c 00 04 0c 00 02 0b 00 .RRsets. ........
> 000000E1  01 00 00 00 00 0b 00 02  00 00 00 00 02 00 03 00 ........ ........
> 000000F1  08 00 04 00 00 00 64 00  00 08 00 05 00 00 00 01 ......d. ........
> 00000101  00                                               .
>
>    00000019  80 01 00 02 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
>    00000029  65 00 00 00 00 0f 00 00  0c 00 00 00 00 00       e....... ......
>
> ----------------------------------------------------------------------------
>
> In [6]: cf.insert('www.example.com.', {'foo': {'bar' : 'baz'}})
> Out[6]: 1267319533
>
> 00000102  80 01 00 01 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
> 00000112  73 65 72 74 00 00 00 00  0b 00 01 00 00 00 03 44 sert.... .......D
> 00000122  4e 53 0b 00 02 00 00 00  10 77 77 77 2e 65 78 61 NS...... .www.exa
> 00000132  6d 70 6c 65 2e 63 6f 6d  2e 0d 00 03 0b 0f 00 00 mple.com ........
> 00000142  00 01 00 00 00 06 52 52  73 65 74 73 0c 00 00 00 ......RR sets....
> 00000152  01 0c 00 02 0b 00 01 00  00 00 03 66 6f 6f 0f 00 ........ ...foo..
> 00000162  02 0c 00 00 00 01 0b 00  01 00 00 00 03 62 61 72 ........ .....bar
> 00000172  0b 00 02 00 00 00 03 62  61 7a 0a 00 03 00 00 00 .......b az......
> 00000182  00 4b 89 c2 ed 00 00 00  08 00 04 00 00 00 00 00 .K...... ........
>
>    00000037  80 01 00 02 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
>    00000047  73 65 72 74 00 00 00 00  00                      sert.... .
>
> ----------------------------------------------------------------------------
>
> In [7]: cf.get('www.example.com.')
> Out[7]: {'foo': {'bar': 'baz'}}
>
> 00000192  80 01 00 01 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
> 000001A2  65 00 00 00 00 0b 00 01  00 00 00 03 44 4e 53 0b e....... ....DNS.
> 000001B2  00 02 00 00 00 10 77 77  77 2e 65 78 61 6d 70 6c ......ww w.exampl
> 000001C2  65 2e 63 6f 6d 2e 0c 00  03 0b 00 03 00 00 00 06 e.com... ........
> 000001D2  52 52 73 65 74 73 00 0c  00 04 0c 00 02 0b 00 01 RRsets.. ........
> 000001E2  00 00 00 00 0b 00 02 00  00 00 00 02 00 03 00 08 ........ ........
> 000001F2  00 04 00 00 00 64 00 00  08 00 05 00 00 00 01 00 .....d.. ........
>
>    00000050  80 01 00 02 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
>    00000060  65 00 00 00 00 0f 00 00  0c 00 00 00 01 0c 00 02 e....... ........
>    00000070  0b 00 01 00 00 00 03 66  6f 6f 0f 00 02 0c 00 00 .......f oo......
>    00000080  00 01 0b 00 01 00 00 00  03 62 61 72 0b 00 02 00 ........ .bar....
>    00000090  00 00 03 62 61 7a 0a 00  03 00 00 00 00 4b 89 c2 ...baz.. .....K..
>    000000A0  ed 00 00 00 00
>
> ----------------------------------------------------------------------------
>
> --
> Robert Edmonds
> edmonds@debian.org
>
>
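
On the question in the quoted message about whether the protocol
indicates insertion success: the server responses in the hex dumps can
be decoded by hand.  The sketch below assumes the strict Thrift binary
protocol (a version word 0x8001 followed by the message type, a
length-prefixed method name, and a sequence id); the function names are
hypothetical, and the byte strings are copied from the dumps above.

```python
import struct

# Message type codes used by the Thrift binary protocol.
MESSAGE_TYPES = {1: "CALL", 2: "REPLY", 3: "EXCEPTION", 4: "ONEWAY"}


def parse_message_header(data: bytes):
    """Return (type_name, method_name, seqid, remaining_bytes)."""
    (word,) = struct.unpack_from(">I", data, 0)
    if word & 0xFFFF0000 != 0x80010000:
        raise ValueError("not a strict binary protocol message")
    msg_type = word & 0xFF
    (name_len,) = struct.unpack_from(">i", data, 4)
    name = data[8:8 + name_len].decode("utf-8")
    (seqid,) = struct.unpack_from(">i", data, 8 + name_len)
    return MESSAGE_TYPES.get(msg_type, "UNKNOWN"), name, seqid, data[12 + name_len:]


def parse_list_field(rest: bytes):
    """Return (field_type, field_id, element_type, size) of a leading list field."""
    return struct.unpack_from(">bhbi", rest, 0)


# Server response to the insert with the binary key.
insert_reply = bytes.fromhex(
    "80 01 00 02 00 00 00 0c 62 61 74 63 68 5f 69 6e"
    "73 65 72 74 00 00 00 00 00"
)
# Server response to the failing get.
get_reply = bytes.fromhex(
    "80 01 00 02 00 00 00 09 67 65 74 5f 73 6c 69 63"
    "65 00 00 00 00 0f 00 00 0c 00 00 00 00 00"
)

print(parse_message_header(insert_reply))
kind, name, seqid, rest = parse_message_header(get_reply)
print(kind, name, parse_list_field(rest))
```

Read this way, the insert response is a REPLY (not an EXCEPTION) whose
result struct is just the stop byte, so no error field is set, and the
failing get_slice response is a REPLY carrying a list of zero elements.
That would suggest the server accepted the insert and later found
nothing under the key as it decoded it, with NotFoundException raised
on the client side, though this is an inference from the dump rather
than something confirmed against the server code.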
