Subject: Re: Document storage
From: Ben McCann
To: dev@cassandra.apache.org
Date: Wed, 28 Mar 2012 16:27:09 -0700

Any thoughts? I'd like to submit a patch, but only if it will be accepted.

Thanks,
Ben

On Wed, Mar 28, 2012 at 8:58 AM, Ben McCann wrote:
> Hi,
>
> I was wondering if it would be interesting to add some type of
> document-oriented data type.
>
> I've found it somewhat awkward to store document-oriented data in
> Cassandra today. I can make a JSON/Protobuf/Thrift, serialize it, and
> store it, but Cassandra cannot differentiate it from any other string or
> byte array. However, if my column validation_class could be a JsonType
> that would allow tools to potentially do more interesting introspection on
> the column value. E.g. bug 3647 calls for supporting arbitrarily nested
> "documents" in CQL. Running a query against the JSON column in Pig is
> possible as well, but again in this use case it would be helpful to be
> able to encode in column metadata that the column is stored as JSON. For
> debugging, running nightly reports, etc. it would be quite useful compared
> to the opaque string and byte array types we have today. JSON is appealing
> because it would be easy to implement. Something like Thrift or Protocol
> Buffers would actually be interesting since they would be more space
> efficient.
> However, they would also be a bit more difficult to implement because of
> the extra typing information they provide. I'm hoping with Cassandra 1.0's
> addition of compression that storing JSON is not too inefficient.
>
> Would there be interest in adding a JsonType? I could look at putting a
> patch together.
>
> Thanks,
> Ben
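
To make the JsonType idea above a little more concrete, here is a rough sketch of the kind of check such a validation_class might run when a column value is written. It is an illustration only: the function names validate_json_column and is_json_column are hypothetical, nothing here is an existing Cassandra API, and Python is used for brevity even though Cassandra itself is written in Java. The sketch assumes the column value is stored as UTF-8 encoded JSON text.

```python
import json


def validate_json_column(value: bytes) -> None:
    """Raise ValueError unless `value` is a UTF-8 encoded JSON document.

    Hypothetical helper: it mirrors the sort of check a JsonType
    validation_class could perform on write. It is not part of Cassandra.
    """
    try:
        # Assumption: JSON columns are stored as UTF-8 text.
        text = value.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(f"column value is not valid UTF-8: {exc}") from exc
    try:
        # Parse only to check well-formedness; the parsed result is discarded.
        json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"column value is not valid JSON: {exc}") from exc


def is_json_column(value: bytes) -> bool:
    """Return True if `value` would pass validate_json_column."""
    try:
        validate_json_column(value)
    except ValueError:
        return False
    return True
```

With a check like this attached to the column metadata, a tool could tell a JSON column apart from an opaque string or byte array, which is the introspection benefit described in the quoted message. Schema-carrying formats such as Thrift or Protocol Buffers would need a type definition in addition to the raw bytes, which is one reason they would be harder to support.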