avro-user mailing list archives

From Patrick Linehan <pline...@plinehan.com>
Subject Re: java specific implementation uses GenericArray ?
Date Mon, 16 Aug 2010 21:46:45 GMT
Does anyone have any suggestions for dealing with large lists/arrays of
primitive values in Avro?

In my case (numerical algorithms), my naive mapping of a vector type
(mathematical vectors, not Java Vectors) to an Avro specific type generates
a GenericArray<Double>.  Needless to say, I would prefer to avoid the cost
of boxing each individual floating-point number.

Is it possible to coerce Avro into using raw Java primitive arrays, e.g.
"double[]"?
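
For illustration only, here is a minimal sketch of one possible workaround,
not something Avro does for you: pack the double[] into a
java.nio.ByteBuffer, which is the Java representation of an Avro "bytes"
field, and unpack it on the reading side. The class name VectorBytes, its
pack/unpack helpers, and the little-endian byte order are assumptions made
for this example; the schema would need to declare the field as "bytes"
instead of an array of doubles, and both sides must agree on the layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not part of Avro): stores a vector as raw bytes
// so that no Double objects are allocated per element.
public class VectorBytes {

    // Packs the values into a buffer of 8 bytes per double, little-endian.
    // The returned buffer has position 0 and limit equal to its capacity.
    static ByteBuffer pack(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        // Writing through the view does not move buf's own position.
        buf.asDoubleBuffer().put(values);
        return buf;
    }

    // Reads the remaining bytes of the buffer back into a primitive array.
    static double[] unpack(ByteBuffer buf) {
        // duplicate() does not inherit the byte order, so set it again.
        ByteBuffer view = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        double[] out = new double[view.remaining() / Double.BYTES];
        view.asDoubleBuffer().get(out);
        return out;
    }
}
```

The trade-off is that the field becomes opaque to Avro: schema resolution
and generic tools see only bytes, not an array of doubles.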

On Wed, Jul 28, 2010 at 9:10 AM, Doug Cutting <cutting@apache.org> wrote:

> On 07/28/2010 02:07 AM, Nick Palmer wrote:
>
>> It would be very nice if GenericArray implemented List. I need get,
>> set, and remove in GenericData.Array for my application and have
>> already added these to my Avro code so I can continue developing. I
>> was planning to file a patch in JIRA for this change.
>>
>
> This would be a great patch to have!
>
>
>> The trouble with making GenericArray implement List is that
>> List.size() returns an int and GenericArray.size() returns a long. Is
>> there a reason for this?
>>
>
> Avro arrays can be arbitrarily long, written as blocks.  The thinking was
> that the interface should expose the length as a long, permitting
> implementations that might page values from disk as you iterate.  The
> collision with List#size() is unfortunate.
>
> We could either:
>  a. unilaterally change GenericArray#size() to return int; or
>  b. rename GenericArray#size() to be something else, like arraySize() or
> somesuch, so that someone could still implement a version that's paged.
>
> My instinct is towards (a).  If/when someone ever implements a paged
> representation for GenericArray they can perhaps add a method with the full
> size then.
>
> Doug
>
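
To make option (a) concrete, here is a rough sketch of an array type that
implements List with an int size() and offers the get, set, and remove
operations Nick mentions. It is not Avro's actual GenericData.Array; the
class name ListBackedArray and the fullSize() method are hypothetical,
with fullSize() standing in for the "method with the full size" that a
paged implementation might add later.

```java
import java.util.AbstractList;
import java.util.Arrays;

// Hypothetical sketch, not Avro code: a growable array that is a List.
public class ListBackedArray<T> extends AbstractList<T> {
    private Object[] elements = new Object[8];
    private int size;

    // List#size() must return int, which is why a long-returning
    // size() cannot coexist with "implements List".
    @Override
    public int size() {
        return size;
    }

    // Assumed extra method: a paged variant could report a count
    // larger than Integer.MAX_VALUE here.
    public long fullSize() {
        return size;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T get(int index) {
        checkIndex(index);
        return (T) elements[index];
    }

    @Override
    public T set(int index, T value) {
        T old = get(index);
        elements[index] = value;
        return old;
    }

    // AbstractList#add(E) delegates to this method with index == size().
    @Override
    public void add(int index, T value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        System.arraycopy(elements, index, elements, index + 1, size - index);
        elements[index] = value;
        size++;
        modCount++;
    }

    @Override
    public T remove(int index) {
        T old = get(index);
        System.arraycopy(elements, index + 1, elements, index, size - index - 1);
        elements[--size] = null; // drop the stale reference
        modCount++;
        return old;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```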
