Message-ID: <4B9A8E72.5060600@apache.org>
Date: Fri, 12 Mar 2010 10:56:50 -0800
From: Doug Cutting
To: avro-user@hadoop.apache.org
Subject: Re: file format stable?

Scott Carey wrote:
> * Large records -- the final block size has to be known before writing, currently this is done by buffering in memory while writing.

A back-compatible format extension that would help here is, if the block size is negative, then its absolute value is the size of the next chunk in the block. In other words, a block would continue until a chunk is found whose length is positive.

Doug
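
The chunking idea above can be sketched roughly as follows. This is only an illustration of the proposal, not Avro's actual file format code: the function names `write_block_chunked` and `read_block_chunked` are hypothetical, and lengths are written as fixed 8-byte signed integers for simplicity, whereas Avro encodes longs as zig-zag varints. The sketch also covers only the size prefix and the data, not the rest of a file block. It assumes empty chunks are skipped, since a length of zero has no sign.

```python
import io
import struct

# Assumption for this sketch: lengths are fixed 8-byte signed big-endian
# integers. Avro itself encodes longs as zig-zag varints.
_LEN = struct.Struct(">q")


def write_block_chunked(out, chunks):
    """Write one block as a series of length-prefixed chunks.

    Every chunk except the last gets a negative length prefix; the last
    gets a positive one, which tells the reader the block is complete.
    Only one chunk is held back at a time, so the total block size never
    has to be known before writing starts.
    """
    pending = None
    for chunk in chunks:
        if not chunk:
            continue  # skip empty chunks: a length of 0 has no sign
        if pending is not None:
            # Another chunk follows, so mark this one as "continued".
            out.write(_LEN.pack(-len(pending)))
            out.write(pending)
        pending = chunk
    if pending is None:
        raise ValueError("a block needs at least one non-empty chunk")
    # Final chunk: positive length ends the block.
    out.write(_LEN.pack(len(pending)))
    out.write(pending)


def read_block_chunked(inp):
    """Read chunks until one with a positive length is found."""
    parts = []
    while True:
        header = inp.read(_LEN.size)
        if len(header) != _LEN.size:
            raise EOFError("truncated chunk length")
        (size,) = _LEN.unpack(header)
        if size == 0:
            raise ValueError("zero-length chunk is ambiguous in this sketch")
        data = inp.read(abs(size))
        if len(data) != abs(size):
            raise EOFError("truncated chunk data")
        parts.append(data)
        if size > 0:
            return b"".join(parts)


# Usage: three chunks are written with lengths -3, -2, +1.
buf = io.BytesIO()
write_block_chunked(buf, [b"abc", b"de", b"f"])
buf.seek(0)
block = read_block_chunked(buf)
```

A block written the existing way, with a single positive size, is read unchanged by `read_block_chunked`, which is what would make such an extension back-compatible for readers that understand it.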