avro-dev mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-679) Improved encodings for arrays
Date Mon, 07 Feb 2011 07:28:30 GMT

    [ https://issues.apache.org/jira/browse/AVRO-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991293#comment-12991293 ]

Stu Hood commented on AVRO-679:
-------------------------------

> Unfortunately, ByteBuffer is polymorphic and thus 'put' is a virtual method.
I had not thought of the polymorphism in ByteBuffer: thanks for the tip! Since each group has
a known maximum length, copying to and from a temporary byte[] of the max group length would
probably work swimmingly.
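
A rough sketch of what the temporary byte[] approach might look like, assuming the usual group varint layout (four ints per group, one tag byte holding four 2-bit lengths, then 1 to 4 little-endian bytes per value). The class and method names here are made up for illustration and are not part of Avro or of any patch on this issue. The point is only that each group costs one bulk put on write and one get plus one bulk get on read, instead of a virtual call per byte.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not an Avro API: group varint over a scratch byte[].
final class GroupVarint {
    // 1 tag byte + 4 values * at most 4 bytes each.
    static final int MAX_GROUP_BYTES = 17;

    // Number of bytes (1..4) needed to hold v as an unsigned 32-bit value.
    static int byteLen(int v) {
        if ((v & 0xFFFFFF00) == 0) return 1;
        if ((v & 0xFFFF0000) == 0) return 2;
        if ((v & 0xFF000000) == 0) return 3;
        return 4;
    }

    // Encodes vals[off..off+3] into scratch, then copies the group out
    // with a single bulk put.
    static void writeGroup(int[] vals, int off, byte[] scratch, ByteBuffer out) {
        int pos = 1; // slot 0 is reserved for the tag byte
        int tag = 0;
        for (int i = 0; i < 4; i++) {
            int v = vals[off + i];
            int n = byteLen(v);
            tag |= (n - 1) << (2 * i);
            for (int b = 0; b < n; b++) {
                scratch[pos++] = (byte) (v >>> (8 * b));
            }
        }
        scratch[0] = (byte) tag;
        out.put(scratch, 0, pos);
    }

    // Reads the tag, bulk-copies the rest of the group into scratch,
    // and decodes four ints from the plain array.
    static void readGroup(ByteBuffer in, byte[] scratch, int[] vals, int off) {
        int tag = in.get() & 0xFF;
        int total = 0;
        for (int i = 0; i < 4; i++) {
            total += ((tag >>> (2 * i)) & 3) + 1;
        }
        in.get(scratch, 0, total);
        int pos = 0;
        for (int i = 0; i < 4; i++) {
            int n = ((tag >>> (2 * i)) & 3) + 1;
            int v = 0;
            for (int b = 0; b < n; b++) {
                v |= (scratch[pos++] & 0xFF) << (8 * b);
            }
            vals[off + i] = v;
        }
    }
}
```

The scratch array is allocated once by the caller (new byte[GroupVarint.MAX_GROUP_BYTES]) and reused across groups. Whether this actually beats per-byte puts would need to be measured.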

> Improved encodings for arrays
> -----------------------------
>
>                 Key: AVRO-679
>                 URL: https://issues.apache.org/jira/browse/AVRO-679
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Stu Hood
>            Priority: Minor
>
> There are better ways to encode arrays of varints [1] that are faster to decode and
> more space efficient than encoding each varint independently.
> Extending the idea to other types of variable length data like 'bytes' and 'string',
> you could encode the entries for an array block as an array of lengths, followed by
> contiguous byte/utf8 data.
> [1] group varint encoding: slides 57-63 of http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/WSDM09-keynote.pdf
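
To make the "array of lengths, followed by contiguous data" layout from the description concrete, here is a minimal sketch for an array of strings. It is an illustration only: the class name is hypothetical, and the count and lengths are written as fixed 4-byte ints for simplicity, whereas a real Avro encoding would presumably use varints (possibly group-encoded) for them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an Avro API: lengths first, then all UTF-8 data.
final class LengthPrefixedBlock {
    // Layout: [count][len_0 .. len_{n-1}][bytes_0 .. bytes_{n-1}]
    static ByteBuffer encode(String[] items) {
        byte[][] data = new byte[items.length][];
        int total = 0;
        for (int i = 0; i < items.length; i++) {
            data[i] = items[i].getBytes(StandardCharsets.UTF_8);
            total += data[i].length;
        }
        ByteBuffer out = ByteBuffer.allocate(4 + 4 * items.length + total);
        out.putInt(items.length);
        for (byte[] d : data) {
            out.putInt(d.length); // all lengths are written together
        }
        for (byte[] d : data) {
            out.put(d);           // followed by contiguous payload bytes
        }
        out.flip();
        return out;
    }

    static String[] decode(ByteBuffer in) {
        int count = in.getInt();
        int[] lengths = new int[count];
        int total = 0;
        for (int i = 0; i < count; i++) {
            lengths[i] = in.getInt();
            total += lengths[i];
        }
        // One bulk copy for the whole payload, then slice it by length.
        byte[] payload = new byte[total];
        in.get(payload);
        String[] items = new String[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            items[i] = new String(payload, pos, lengths[i], StandardCharsets.UTF_8);
            pos += lengths[i];
        }
        return items;
    }
}
```

Grouping the lengths this way means the decoder reads all the sizes up front and can fetch the payload in one bulk copy, which is the property the description is after.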

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
