From: Brian O'Neill <boneill42@gmail.com>
To: dev@cassandra.apache.org
Date: Fri, 30 Mar 2012 13:04:08 -0400
Subject: Re: Document storage

Do we also need to consider the client API? If we don't adjust Thrift, the
client just gets bytes back, right? The client is then on their own to
marshal them back into a structure. In that case, it seems like we would
want to choose a standard that is efficient and for which there are common
libraries; Protobuf seems to fit the bill here.

Or do we pass back some other structure? (Native lists/maps? JSON strings?)

Do we ignore sorting/comparators? (Similar to SOLR, I'm not sure people
have defined a good sort for multi-valued items.)
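For illustration, here's roughly what that marshaling burden on the client
looks like if we settled on protobuf. This is only a sketch:
DocumentProto.Document stands in for a class generated from a hypothetical
.proto file, it is not anything Cassandra ships.

    // Sketch only: assumes the column value was written as a serialized
    // protobuf message. DocumentProto.Document is a placeholder for a
    // protobuf-generated class; nothing here is part of the Cassandra API.
    import com.google.protobuf.InvalidProtocolBufferException;

    public class DocumentCodec {

        // The server hands back raw bytes; the client turns them into a structure.
        public static DocumentProto.Document decode(byte[] columnValue)
                throws InvalidProtocolBufferException {
            return DocumentProto.Document.parseFrom(columnValue);
        }

        // ...and turns the structure back into the byte[] stored in the column.
        public static byte[] encode(DocumentProto.Document doc) {
            return doc.toByteArray();
        }
    }

A JSON-based choice would look the same, just with a JSON library doing the
parse/serialize instead.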
-brian

----
Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


On 3/30/12 12:01 PM, "Daniel Doubleday" wrote:

>> Just telling C* to store a byte[] *will* be slightly lighter-weight
>> than giving it named columns, but we're talking negligible compared to
>> the overhead of actually moving the data on or off disk in the first
>> place.
>Hm - but isn't this exactly the point? You don't want to move data off
>disk.
>But decomposing into columns will lead to more of that:
>
>- The total amount of serialized data is (in most cases a lot) larger than
>the protobuffed / compressed version
>- If you do selective updates, the document will be scattered over
>multiple ssts; plus, if you do sliced reads, you can't optimize reads, as
>opposed to the single-column version, which when updated automatically
>supersedes older versions, so most reads will hit only one sst
>
>All these reads make up the hot dataset. If it fits the page cache, you're
>fine. If it doesn't, you need to buy more iron.
>
>Really could not resist, because your statement seems to be contrary to
>all our tests / learnings.
>
>Cheers,
>Daniel
>
>From dev list:
>
>Re: Document storage
>On Thu, Mar 29, 2012 at 1:11 PM, Drew Kutcharian wrote:
>>> I think this is a much better approach because that gives you the
>>> ability to update or retrieve just parts of objects efficiently,
>>> rather than making column values just blobs with a bunch of special
>>> case logic to introspect them. Which feels like a big step backwards
>>> to me.
>>
>> Unless your access pattern involves reading/writing the whole document
>> each time. In that case you're better off serializing the whole document
>> and storing it in a column as a byte[] without incurring the overhead of
>> column indexes. Right?
>
>Hmm, not sure what you're thinking of there.
>
>If you mean the "index" that's part of the row header for random
>access within a row, then no, serializing to byte[] doesn't save you
>anything.
>
>If you mean secondary indexes, don't declare any if you don't want any. :)
>
>Just telling C* to store a byte[] *will* be slightly lighter-weight
>than giving it named columns, but we're talking negligible compared to
>the overhead of actually moving the data on or off disk in the first
>place. Not even close to being worth giving up being able to deal
>with your data from standard tools like cqlsh, IMO.
>
>--
>Jonathan Ellis
>Project Chair, Apache Cassandra
>co-founder of DataStax, the source for professional Cassandra support
>http://www.datastax.com
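P.S. To make the trade-off above concrete, here is a rough sketch of the two
row layouts being compared. Illustrative only: the field names, the document
contents, and the use of a plain Map to stand in for a row are all made up.

    import java.nio.charset.StandardCharsets;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Models what a row would contain under each approach; not a real client API.
    public class RowLayouts {

        // (a) Whole-document layout: one column whose value is the serialized blob.
        //     An update rewrites that single column, so the newest version supersedes
        //     older ones and a read typically hits only one sstable.
        static Map<String, byte[]> wholeDocumentRow(byte[] serializedDoc) {
            Map<String, byte[]> row = new LinkedHashMap<>();
            row.put("doc", serializedDoc);
            return row;
        }

        // (b) Decomposed layout: one column per field. Selective updates touch
        //     individual columns, so over time the document can end up scattered
        //     across several sstables and a full read has to merge them.
        static Map<String, byte[]> decomposedRow(String title, String author, String body) {
            Map<String, byte[]> row = new LinkedHashMap<>();
            row.put("title", title.getBytes(StandardCharsets.UTF_8));
            row.put("author", author.getBytes(StandardCharsets.UTF_8));
            row.put("body", body.getBytes(StandardCharsets.UTF_8));
            return row;
        }
    }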