avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-637) GenericArray should implement Collection
Date Sat, 28 Aug 2010 01:22:54 GMT

    [ https://issues.apache.org/jira/browse/AVRO-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903752#action_12903752 ]

Scott Carey commented on AVRO-637:

bq. I don't follow this logic. If they pass a List, then its order will be preserved. But
if someone passes in an unordered data structure, why should they expect order to be preserved?

Producers and consumers are sometimes decoupled and are really different users. User A wants
to create a list from a Set and persist it later; user B intercepts the serialization process
and tries to append something to the list.
The Avro contract for an array type preserves order. So, for example, one helper method might
alter a type and assume that the data is ordered and appendable (it was in 1.3, and comes
out of deserialization that way); then a different user applies this helper method to an object
constructed some other way, and the result is a surprise.
The raw type is Collection, which does not guarantee this, so such a user would be mistaken,
but the semantics are more complicated and error-prone.
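
To make the surprise concrete, here is a minimal sketch using only java.util, no Avro classes.
The names AppendSurprise, appendLast and lastOf are hypothetical and exist only for illustration.
The helper assumes that add() appends at the end, which holds for a List but not for a
Collection backed by a Set (a TreeSet is used so that the iteration order is deterministic).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

class AppendSurprise {
    // Hypothetical helper: assumes the data is ordered and appendable,
    // i.e. that the added element becomes the last one.
    static <T> void appendLast(Collection<T> array, T element) {
        array.add(element);
    }

    // Returns the last element seen while iterating, or null if empty.
    static <T> T lastOf(Collection<T> array) {
        T last = null;
        for (T item : array) {
            last = item;
        }
        return last;
    }

    public static void main(String[] args) {
        // Backed by a List: the appended element really is last.
        Collection<String> fromList = new ArrayList<>(Arrays.asList("b", "c"));
        appendLast(fromList, "a");
        System.out.println(lastOf(fromList)); // prints "a"

        // Backed by a sorted Set: the same call puts "a" first, not last.
        Collection<String> fromSet = new TreeSet<>(Arrays.asList("b", "c"));
        appendLast(fromSet, "a");
        System.out.println(lastOf(fromSet)); // prints "c"
    }
}
```

Both calls compile against the same Collection type, so nothing warns the second user that
the helper's assumption no longer holds.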

If we are going as far as Collection and abandoning the idea that the data in memory is consistently
ordered, sortable, and appendable, we should probably consider going one step further up the
interface inheritance tree, past Collection to the top: Iterable. Then you could also serialize
arrays that don't fit in memory.
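
A rough sketch of what accepting Iterable could allow, under stated assumptions: this is not
Avro's encoder or wire format, and IterableArrayWriter, writeBlocks and range are hypothetical
names. Items are pulled from the Iterable one at a time and written in count-prefixed blocks
followed by a zero count, so only one block is ever held in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class IterableArrayWriter {
    // Hypothetical writer (NOT Avro's binary encoding): emits blocks of at most
    // blockSize items, each prefixed by its item count, then a 0 terminator.
    // Returns the total number of items written.
    static long writeBlocks(Iterable<Integer> items, int blockSize, DataOutputStream out)
            throws IOException {
        int[] block = new int[blockSize];
        int filled = 0;
        long total = 0;
        for (int item : items) {
            block[filled++] = item;
            if (filled == blockSize) {
                flush(block, filled, out);
                total += filled;
                filled = 0;
            }
        }
        if (filled > 0) {
            flush(block, filled, out);
            total += filled;
        }
        out.writeInt(0); // end-of-array marker
        return total;
    }

    private static void flush(int[] block, int count, DataOutputStream out) throws IOException {
        out.writeInt(count);
        for (int i = 0; i < count; i++) {
            out.writeInt(block[i]);
        }
    }

    // Lazily generated sequence 0..n-1; it is never materialized as a whole.
    static Iterable<Integer> range(final int n) {
        return () -> new Iterator<Integer>() {
            private int next = 0;
            public boolean hasNext() { return next < n; }
            public Integer next() {
                if (next >= n) throw new NoSuchElementException();
                return next++;
            }
        };
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        long written = writeBlocks(range(10), 4, new DataOutputStream(bytes));
        // 3 block counts + 10 items + 1 terminator = 14 ints = 56 bytes
        System.out.println(written + " items, " + bytes.size() + " bytes"); // 10 items, 56 bytes
    }
}
```

The writer never needs size(), which is what makes Iterable sufficient here; the trade-off is
that the element count is not known up front.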

bq. 1.4 already makes changes to generated APIs (Utf8 -> CharSequence)

Good point. Another change in 1.4 along these lines doesn't add much incremental upgrade work.

> GenericArray should implement Collection
> ----------------------------------------
>                 Key: AVRO-637
>                 URL: https://issues.apache.org/jira/browse/AVRO-637
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.5.0
>         Attachments: AVRO-637.patch, AVRO-637.patch
> It would be nice if Avro arrays were better integrated with Java collections. The GenericArray
> interface permits array element reuse, which is awkward with java.util.Collection. But if
> GenericArray implemented Collection and the Avro runtime permitted arbitrary Collection implementations
> to be passed for Arrays then it would simplify many applications. The runtime could still
> reuse elements if an array implemented GenericArray, so performance would not suffer for applications
> that, e.g., loop over a data file, reusing instances.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
