From: Patrick Linehan
Date: Mon, 16 Aug 2010 14:46:45 -0700
Subject: Re: java specific implementation uses GenericArray ?
To: user@avro.apache.org

does anyone have any suggestions for dealing with large lists/arrays of
primitive values in avro?

in my case (numerical algorithms), my naive mapping of a vector type
(mathematical vectors, not java Vectors) to an avro specific type generates
a GenericArray<Double>. needless to say, i would prefer to avoid the cost of
boxing up all the individual floating point numbers.

is it possible to coerce avro into using raw java primitive arrays, e.g.
"double[]"?

On Wed, Jul 28, 2010 at 9:10 AM, Doug Cutting <cutting@apache.org> wrote:

> On 07/28/2010 02:07 AM, Nick Palmer wrote:
>
>> It would be very nice if GenericArray implemented List. I need get,
>> set, and remove in GenericData.Array for my application and have
>> already added these to my Avro code so I can continue developing. I
>> was planning to file a patch in JIRA for this change.
>
> This would be a great patch to have!
>
>> The trouble with making GenericArray implement List is that
>> List.size() returns an int and GenericArray.size() returns a long. Is
>> there a reason for this?
>
> Avro arrays can be arbitrarily long, written as blocks. The thinking was
> that the interface should expose the length as a long, permitting
> implementations that might page values from disk as you iterate. The
> collision with List#size() is unfortunate.
>
> We could either:
> a. unilaterally change GenericArray#size() to return int; or
> b. rename GenericArray#size() to be something else, like arraySize() or
> somesuch, so that someone could still implement a version that's paged.
>
> My instinct is towards (a). If/when someone ever implements a paged
> representation for GenericArray they can perhaps add a method with the full
> size then.
>
> Doug
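
To make the size() collision from the quoted thread concrete: Java does not allow two methods that differ only in return type, so a type cannot both implement java.util.List (whose size() returns int) and declare a size() that returns long. The sketch below illustrates option (a) with a hypothetical class, IntSizedArray, which is not part of Avro. It extends AbstractList so that get, set, add and remove sit alongside an int-valued size(). The growth policy and field names are assumptions made for illustration only.

```java
import java.util.AbstractList;
import java.util.Arrays;

// Hypothetical class, NOT part of Avro: a sketch of option (a), where
// size() returns int so the array type can also be a java.util.List.
class IntSizedArray<T> extends AbstractList<T> {
    private Object[] elements;
    private int size;

    IntSizedArray(int capacity) {
        // Keep at least one slot so that doubling always grows the array.
        elements = new Object[Math.max(capacity, 1)];
    }

    // List#size() must return int; declaring "long size()" here would not compile.
    @Override
    public int size() {
        return size;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return (T) elements[index];
    }

    @Override
    public T set(int index, T value) {
        T old = get(index); // also performs the bounds check
        elements[index] = value;
        return old;
    }

    @Override
    public void add(int index, T value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        // Shift the tail right by one slot to make room at index.
        System.arraycopy(elements, index, elements, index + 1, size - index);
        elements[index] = value;
        size++;
        modCount++;
    }

    @Override
    public T remove(int index) {
        T old = get(index); // also performs the bounds check
        // Shift the tail left by one slot over the removed element.
        System.arraycopy(elements, index + 1, elements, index, size - index - 1);
        elements[--size] = null;
        modCount++;
        return old;
    }
}
```

With this shape, code that expects a List<T> can use the array directly. A paged implementation would still need a separate method, such as the arraySize() name floated in option (b), to report a length beyond Integer.MAX_VALUE.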

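
One possible workaround for the boxing concern, offered as an assumption rather than anything proposed in this thread: if the schema can declare the vector as a `bytes` field instead of an array of doubles, the values can be packed into a java.nio.ByteBuffer (the type the Avro Java API uses for `bytes`) and unpacked into a double[] on the other side. The helper class DoublePacking and its method names are hypothetical, and the little-endian byte order is an arbitrary choice that writer and reader would have to agree on.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, NOT part of Avro: packs a double[] into a ByteBuffer
// (suitable for a schema field of type "bytes") without boxing any element.
final class DoublePacking {
    private DoublePacking() {}

    // Copies the values into a new buffer, 8 bytes per double, little-endian.
    static ByteBuffer pack(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        // Writing through the view leaves buf's own position at 0,
        // so the returned buffer is ready to be read from the start.
        buf.asDoubleBuffer().put(values);
        return buf;
    }

    // Reads the remaining bytes of buf back into a primitive array.
    static double[] unpack(ByteBuffer buf) {
        // duplicate() resets the byte order to big-endian, so set it again;
        // working on a duplicate leaves the caller's position untouched.
        ByteBuffer view = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        double[] values = new double[view.remaining() / Double.BYTES];
        view.asDoubleBuffer().get(values);
        return values;
    }
}
```

The trade-off is that the element type is no longer visible in the schema, so other readers see only opaque bytes and must know the packing convention.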