Date: Thu, 21 Mar 2013 11:26:18 -0700 (PDT)
From: sammefford
To: user@avro.apache.org
Subject: Where are the rows in Trevni format?

I read the Trevni Specification: http://avro.apache.org/docs/1.7.4/trevni/spec.html and I can't see where the row ids are stored for each value in each column. Am I missing something obvious?
Is the spec incomplete on that point?

Also, to confirm: my understanding is that columnar formats are efficient because they store column values sorted and can thereby find specific values or ranges of values quickly. While the spec mentions the benefits of sorting, I don't see a requirement that column values be sorted. Can we depend on the blocks of column values being sorted?

Thanks,

Sam Mefford
Chief Architect-Big Data Solutions
Avalon Consulting, LLC.
801-706-9731