incubator-cassandra-dev mailing list archives

From Tatu Saloranta <tsalora...@gmail.com>
Subject Re: Document storage
Date Thu, 29 Mar 2012 02:54:13 GMT
On Wed, Mar 28, 2012 at 6:59 PM, Jeremiah Jordan
<JEREMIAH.JORDAN@morningstar.com> wrote:
> Sounds interesting to me.  I looked into adding protocol buffer support at one point,
> and it didn't look like it would be too much work.  The tricky part was I also wanted to
> add indexing support for attributes of the inserted protocol buffers.  That looked a little
> trickier, but still not impossible.  Though other stuff came up and I never got around to
> actually writing any code.
> JSON support would be nice, especially if you figured out how to get built in indexing
> of the attributes inside the JSON to work =).

Also, for whatever it's worth, it should be trivial to add support for
Smile (binary JSON serialization):
http://wiki.fasterxml.com/SmileFormatSpec
since its logical data structure is pure JSON, no extensions or
subsetting. The main Java impl is by the Jackson project, but there is
also a C codec (https://github.com/pierre/libsmile), and prototypes
for PHP and Ruby bindings as well.
For all data it's a bit faster and a bit more compact: about 30% for
individual items, and more (40 - 70%) for data sequences (due to
optional back-referencing).

JSON and Smile can be auto-detected from the first 4 bytes or so, reliably
and efficiently, so one should be able to add this either
transparently or explicitly.
One could even transcode things on the fly -- store as Smile, expose
filtered results as JSON (and accept JSON or both). This could reduce
storage cost while keeping the benefits of a flexible data format.
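
To make the auto-detection idea concrete, here is a rough sketch in Python (stdlib only). It relies on the Smile header starting with the three bytes ":)\n" followed by a version/flags byte, and on a textual JSON document starting (after optional whitespace) with one of a small set of characters. The function name detect_format is made up for illustration, and the sketch assumes UTF-8 JSON and Smile content written with its header; it is not how Jackson or Cassandra does it.

```python
# First three bytes of a Smile header: ':' ')' '\n' (0x3A 0x29 0x0A).
# A fourth byte carrying version and feature flags follows; it is not decoded here.
SMILE_MAGIC = b":)\n"

# Bytes that can legally start a JSON value (object, array, string, number, literal).
JSON_START_BYTES = b'{["-0123456789tfn'

UTF8_BOM = b"\xef\xbb\xbf"


def detect_format(payload: bytes) -> str:
    """Guess 'smile', 'json' or 'unknown' from the leading bytes of payload.

    Hypothetical helper: assumes Smile data includes its header and that
    JSON is UTF-8 encoded.
    """
    # Smile: 3 magic bytes plus the version/flags byte, so at least 4 bytes.
    if len(payload) >= 4 and payload[:3] == SMILE_MAGIC:
        return "smile"

    # JSON: skip an optional UTF-8 BOM and leading whitespace, then look
    # at the first significant byte.
    text = payload[len(UTF8_BOM):] if payload.startswith(UTF8_BOM) else payload
    text = text.lstrip(b" \t\r\n")
    if text and text[:1] in JSON_START_BYTES:
        return "json"

    return "unknown"


if __name__ == "__main__":
    print(detect_format(b'  {"name": "value"}'))
    print(detect_format(b":)\n\x03\xfa\xfb"))
```

Since ':' can never be the first significant character of a JSON document, the two checks cannot both match, which is what makes transparent detection workable.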

-+ Tatu +-
