cassandra-user mailing list archives

From Fi Dot <fi.dot....@gmail.com>
Subject Re: binary data in key names?
Date Sun, 28 Feb 2010 02:34:21 GMT
Maybe encode the binary names with base64 (or a similar ASCII-safe encoding) before using them as keys?
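
A minimal sketch of that idea in Python, assuming the key only needs to round-trip and that the client decodes it again after reads. The helper names `encode_key` and `decode_key` are hypothetical and are not part of pycassa. The output is plain ASCII, so it is also valid UTF-8. This is written for Python 3 `bytes`; the same `base64` functions also exist in Python 2.

```python
import base64


def encode_key(raw: bytes) -> str:
    """Encode arbitrary bytes as an ASCII-only string (hypothetical helper)."""
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_key(key: str) -> bytes:
    """Recover the original bytes from a key made by encode_key."""
    return base64.urlsafe_b64decode(key.encode("ascii"))


# Wire-format DNS name from the test case quoted below.
wire_name = b"\x03www\x07example\x03com\x00"
key = encode_key(wire_name)

# The encoded key round-trips back to the original bytes.
assert decode_key(key) == wire_name
```

The encoded string would then be passed to `cf.insert(...)` and `cf.get(...)` in place of the raw name. Note that base64 does not preserve the byte-wise sort order of the original names, which may matter with an order-preserving partitioner; hex encoding (`binascii.hexlify`) keeps that order at the cost of longer keys.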

Fi.


On Sat, Feb 27, 2010 at 6:02 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> Keys are strings.  That means they have to be UTF8-encoded, although
> thrift bindings for many languages (including python) don't help you
> with this.
>
> On Sat, Feb 27, 2010 at 7:43 PM, Robert Edmonds <edmonds@debian.org> wrote:
> > hi,
> >
> > i'm using cassandra 0.5.0 and pycassa 0.1.
> >
> > i'd like to store binary data (specifically, wire-format DNS names) in
> > key names.  is this possible with cassandra?
> >
> > i have a keyspace set up like this:
> >
> >    <Keyspace Name="DNS">
> >        <KeysCachedFraction>0.01</KeysCachedFraction>
> >        <ColumnFamily CompareWith="BytesType"
> >                      CompareSubcolumnsWith="BytesType"
> >                      ColumnType="Super"
> >                      Name="RRsets"/>
> >    </Keyspace>
> >
> > when i insert a key with a printable ASCII name, everything works fine,
> > but when i insert a key with a name containing binary data the insertion
> > appears to succeed, but a subsequent get for the same key name fails.
> >
> > here's a test case:
> >
> >    import pycassa
> >    conn = pycassa.connect()
> >    cf = pycassa.ColumnFamily(conn, 'DNS', 'RRsets', super=True)
> >
> >    cf.insert('\x03www\x07example\x03com\x00', {'foo': {'bar' : 'baz'}})
> >    cf.get('\x03www\x07example\x03com\x00') # <-- this fails
> >
> >    cf.insert('www.example.com.', {'foo': {'bar' : 'baz'}})
> >    cf.get('www.example.com.') # <-- this succeeds
> >
> > here's the same code being executed in ipython, with each python line
> > interleaved with the hex dump of the TCP conversation.  (the client request
> > hex dump is flush, the server response hex dump is indented.)
> >
> > (i don't know how the cassandra thrift protocol works, but the exact same
> > response was sent on the wire for the successful and the unsuccessful
> > insertions, so i would assume that either the insertion succeeded or the
> > protocol doesn't indicate insertion success?)
> >
> > In [1]: import pycassa
> >
> > In [2]: conn = pycassa.connect()
> >
> > In [3]: cf = pycassa.ColumnFamily(conn, 'DNS', 'RRsets', super=True)
> >
> >
> > ----------------------------------------------------------------------------
> >
> > In [4]: cf.insert('\x03www\x07example\x03com\x00', {'foo': {'bar' : 'baz'}})
> > Out[4]: 1267319528
> >
> > 00000000  80 01 00 01 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
> > 00000010  73 65 72 74 00 00 00 00  0b 00 01 00 00 00 03 44 sert.... .......D
> > 00000020  4e 53 0b 00 02 00 00 00  11 03 77 77 77 07 65 78 NS...... ..www.ex
> > 00000030  61 6d 70 6c 65 03 63 6f  6d 00 0d 00 03 0b 0f 00 ample.co m.......
> > 00000040  00 00 01 00 00 00 06 52  52 73 65 74 73 0c 00 00 .......R Rsets...
> > 00000050  00 01 0c 00 02 0b 00 01  00 00 00 03 66 6f 6f 0f ........ ....foo.
> > 00000060  00 02 0c 00 00 00 01 0b  00 01 00 00 00 03 62 61 ........ ......ba
> > 00000070  72 0b 00 02 00 00 00 03  62 61 7a 0a 00 03 00 00 r....... baz.....
> > 00000080  00 00 4b 89 c2 e8 00 00  00 08 00 04 00 00 00 00 ..K..... ........
> > 00000090  00                                               .
> >
> >    00000000  80 01 00 02 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
> >    00000010  73 65 72 74 00 00 00 00  00                      sert.... .
> >
> >
> > ----------------------------------------------------------------------------
> >
> > In [5]: cf.get('\x03www\x07example\x03com\x00')
> > NotFoundException: NotFoundException()
> >
> > 00000091  80 01 00 01 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
> > 000000A1  65 00 00 00 00 0b 00 01  00 00 00 03 44 4e 53 0b e....... ....DNS.
> > 000000B1  00 02 00 00 00 11 03 77  77 77 07 65 78 61 6d 70 .......w ww.examp
> > 000000C1  6c 65 03 63 6f 6d 00 0c  00 03 0b 00 03 00 00 00 le.com.. ........
> > 000000D1  06 52 52 73 65 74 73 00  0c 00 04 0c 00 02 0b 00 .RRsets. ........
> > 000000E1  01 00 00 00 00 0b 00 02  00 00 00 00 02 00 03 00 ........ ........
> > 000000F1  08 00 04 00 00 00 64 00  00 08 00 05 00 00 00 01 ......d. ........
> > 00000101  00                                               .
> >
> >    00000019  80 01 00 02 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
> >    00000029  65 00 00 00 00 0f 00 00  0c 00 00 00 00 00       e....... ......
> >
> >
> > ----------------------------------------------------------------------------
> >
> > In [6]: cf.insert('www.example.com.', {'foo': {'bar' : 'baz'}})
> > Out[6]: 1267319533
> >
> > 00000102  80 01 00 01 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
> > 00000112  73 65 72 74 00 00 00 00  0b 00 01 00 00 00 03 44 sert.... .......D
> > 00000122  4e 53 0b 00 02 00 00 00  10 77 77 77 2e 65 78 61 NS...... .www.exa
> > 00000132  6d 70 6c 65 2e 63 6f 6d  2e 0d 00 03 0b 0f 00 00 mple.com ........
> > 00000142  00 01 00 00 00 06 52 52  73 65 74 73 0c 00 00 00 ......RR sets....
> > 00000152  01 0c 00 02 0b 00 01 00  00 00 03 66 6f 6f 0f 00 ........ ...foo..
> > 00000162  02 0c 00 00 00 01 0b 00  01 00 00 00 03 62 61 72 ........ .....bar
> > 00000172  0b 00 02 00 00 00 03 62  61 7a 0a 00 03 00 00 00 .......b az......
> > 00000182  00 4b 89 c2 ed 00 00 00  08 00 04 00 00 00 00 00 .K...... ........
> >
> >    00000037  80 01 00 02 00 00 00 0c  62 61 74 63 68 5f 69 6e ........ batch_in
> >    00000047  73 65 72 74 00 00 00 00  00                      sert.... .
> >
> >
> > ----------------------------------------------------------------------------
> >
> > In [7]: cf.get('www.example.com.')
> > Out[7]: {'foo': {'bar': 'baz'}}
> >
> > 00000192  80 01 00 01 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
> > 000001A2  65 00 00 00 00 0b 00 01  00 00 00 03 44 4e 53 0b e....... ....DNS.
> > 000001B2  00 02 00 00 00 10 77 77  77 2e 65 78 61 6d 70 6c ......ww w.exampl
> > 000001C2  65 2e 63 6f 6d 2e 0c 00  03 0b 00 03 00 00 00 06 e.com... ........
> > 000001D2  52 52 73 65 74 73 00 0c  00 04 0c 00 02 0b 00 01 RRsets.. ........
> > 000001E2  00 00 00 00 0b 00 02 00  00 00 00 02 00 03 00 08 ........ ........
> > 000001F2  00 04 00 00 00 64 00 00  08 00 05 00 00 00 01 00 .....d.. ........
> >
> >    00000050  80 01 00 02 00 00 00 09  67 65 74 5f 73 6c 69 63 ........ get_slic
> >    00000060  65 00 00 00 00 0f 00 00  0c 00 00 00 01 0c 00 02 e....... ........
> >    00000070  0b 00 01 00 00 00 03 66  6f 6f 0f 00 02 0c 00 00 .......f oo......
> >    00000080  00 01 0b 00 01 00 00 00  03 62 61 72 0b 00 02 00 ........ .bar....
> >    00000090  00 00 03 62 61 7a 0a 00  03 00 00 00 00 4b 89 c2 ...baz.. .....K..
> >    000000A0  ed 00 00 00 00
> >
> >
> > ----------------------------------------------------------------------------
> >
> > --
> > Robert Edmonds
> > edmonds@debian.org
> >
> >
>
