incubator-cassandra-dev mailing list archives

From Ben McCann <...@benmccann.com>
Subject Document storage
Date Wed, 28 Mar 2012 15:58:43 GMT
Hi,

I was wondering whether there would be interest in adding a
document-oriented data type.

I've found it somewhat awkward to store document-oriented data in Cassandra
today. I can build a JSON, Protobuf, or Thrift object, serialize it, and
store it, but Cassandra cannot differentiate it from any other string or
byte array. However, if my column validation_class could be a JsonType,
tools could do more interesting introspection on the column value. E.g.
bug 3647 <https://issues.apache.org/jira/browse/CASSANDRA-3647> calls for
supporting arbitrarily nested "documents" in CQL. Running a query against
the JSON column in Pig is possible as well, but again in this use case it
would be helpful to be able to encode in column metadata that the column is
stored as JSON. For debugging, running nightly reports, etc., that would be
quite useful compared to the opaque string and byte array types we have
today. JSON is appealing because it would be easy to implement. Something
like Thrift or Protocol Buffers would also be interesting since they would
be more space efficient. However, they would be a bit more difficult to
implement because of the extra typing information they provide. I'm hoping
that with the compression added in Cassandra 1.0, storing JSON is not too
inefficient.
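
To make the idea concrete, here is a rough Python sketch of what a JsonType
could buy us. It is not Cassandra code and does not use any Cassandra API:
validate_json_column and read_field are hypothetical names I made up for
illustration. The first shows the kind of check a validation class might
apply on write, the second the kind of lookup a tool could do once column
metadata says the bytes are JSON.

```python
import json


def validate_json_column(raw: bytes) -> None:
    """Hypothetical write-time check: accept only well-formed UTF-8 JSON."""
    try:
        json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"not a valid JSON column value: {exc}") from exc


def read_field(raw: bytes, *path):
    """Hypothetical tool-side lookup: walk a nested document by key or index."""
    node = json.loads(raw.decode("utf-8"))
    for key in path:
        node = node[key]  # dict key or list index, depending on the node
    return node


# A nested "document" serialized the way it would be stored in the column.
doc = json.dumps({"user": {"name": "ben", "tags": ["a", "b"]}}).encode("utf-8")
validate_json_column(doc)                   # passes silently
print(read_field(doc, "user", "tags", 1))   # second tag of the nested user
```

With an opaque byte array neither step is possible without out-of-band
knowledge of the encoding, which is the gap a JsonType would close.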

Would there be interest in adding a JsonType?  I could look at putting a
patch together.

Thanks,
Ben
