incubator-cassandra-dev mailing list archives

From Ben McCann <...@benmccann.com>
Subject Re: Document storage
Date Thu, 29 Mar 2012 04:28:38 GMT
I don't imagine sorting is a meaningful operation on JSON data.  As long as
the ordering is consistent, I would think that should be sufficient.
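
To make that concrete, here is a rough sketch in Python (chosen for brevity;
this is not Cassandra's actual type API, and the names validate_json and
compare_json are hypothetical).  It assumes the value is stored as UTF-8
text, validates it on insert by simply parsing it, and orders values by
their raw bytes, which is arbitrary but deterministic:

```python
import json


def validate_json(raw: bytes) -> None:
    """Raise ValueError unless raw is a UTF-8 encoded JSON document."""
    try:
        json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc


def compare_json(a: bytes, b: bytes) -> int:
    """Order two stored values by their raw bytes.

    The result carries no JSON-level meaning; it is only consistent:
    the same pair of byte strings always compares the same way.
    Returns -1, 0, or 1.
    """
    return (a > b) - (a < b)
```

One consequence of comparing raw bytes: two documents that are equal as
JSON but serialized with different key order or whitespace would sort as
different values.  That seems acceptable if all we need is consistency.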


On Wed, Mar 28, 2012 at 8:51 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> Some work I did stores JSON blobs in columns. The question on JSON
> type is how to sort it.
>
> On Wed, Mar 28, 2012 at 7:35 PM, Jeremy Hanna
> <jeremy.hanna1234@gmail.com> wrote:
> > I don't speak for the project, but you might give it a day or two for
> > people to respond and/or perhaps create a jira ticket.  Seems like that's a
> > reasonable data type that would get some traction - a json type.  However,
> > what would validation look like?  That's one of the main reasons there are
> > the data types and validators, in order to validate on insert.
> >
> > On Mar 29, 2012, at 12:27 AM, Ben McCann wrote:
> >
> >> Any thoughts?  I'd like to submit a patch, but only if it will be
> >> accepted.
> >>
> >> Thanks,
> >> Ben
> >>
> >>
> >> On Wed, Mar 28, 2012 at 8:58 AM, Ben McCann <ben@benmccann.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I was wondering if it would be interesting to add some type of
> >>> document-oriented data type.
> >>>
> >>> I've found it somewhat awkward to store document-oriented data in
> >>> Cassandra today.  I can make a JSON/Protobuf/Thrift, serialize it, and
> >>> store it, but Cassandra cannot differentiate it from any other string or
> >>> byte array.  However, if my column validation_class could be a JsonType
> >>> that would allow tools to potentially do more interesting introspection on
> >>> the column value.  E.g. bug 3647
> >>> <https://issues.apache.org/jira/browse/CASSANDRA-3647> calls for supporting
> >>> arbitrarily nested "documents" in CQL.  Running a
> >>> query against the JSON column in Pig is possible as well, but again in this
> >>> use case it would be helpful to be able to encode in column metadata that
> >>> the column is stored as JSON.  For debugging, running nightly reports, etc.
> >>> it would be quite useful compared to the opaque string and byte array types
> >>> we have today.  JSON is appealing because it would be easy to implement.
> >>> Something like Thrift or Protocol Buffers would actually be interesting
> >>> since they would be more space efficient.  However, they would also be a
> >>> bit more difficult to implement because of the extra typing information
> >>> they provide.  I'm hoping with Cassandra 1.0's addition of compression that
> >>> storing JSON is not too inefficient.
> >>>
> >>> Would there be interest in adding a JsonType?  I could look at putting a
> >>> patch together.
> >>>
> >>> Thanks,
> >>> Ben
> >>>
> >>>
> >
>