avro-dev mailing list archives

From "BELUGA BEHR (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AVRO-2049) Remove Superfluous Configuration From AvroSerializer
Date Mon, 17 Jul 2017 01:07:00 GMT

     [ https://issues.apache.org/jira/browse/AVRO-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated AVRO-2049:
------------------------------
    Attachment: AVRO-2049.1.patch

> Remove Superfluous Configuration From AvroSerializer
> ----------------------------------------------------
>
>                 Key: AVRO-2049
>                 URL: https://issues.apache.org/jira/browse/AVRO-2049
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.7, 1.8.2
>            Reporter: BELUGA BEHR
>            Priority: Trivial
>         Attachments: AVRO-2049.1.patch
>
>
> In the class {{org.apache.avro.hadoop.io.AvroSerializer}}, the Avro block size is configured with a hard-coded value, and a TODO comment asks for benchmarking with different buffer sizes.
> {code:title=org.apache.avro.hadoop.io.AvroSerializer}
>   /**
>    * The block size for the Avro encoder.
>    *
>    * This number was copied from the AvroSerialization of org.apache.avro.mapred in Avro 1.5.1.
>    *
>    * TODO(gwu): Do some benchmarking with different numbers here to see if it is important.
>    */
>   private static final int AVRO_ENCODER_BLOCK_SIZE_BYTES = 512;
>   /** A factory for creating Avro datum encoders. */
>   private static EncoderFactory mEncoderFactory
>       = new EncoderFactory().configureBlockSize(AVRO_ENCODER_BLOCK_SIZE_BYTES);
> {code}
> However, there is no need to benchmark: this setting is superfluous and is ignored by the current implementation.
> {code:title=org.apache.avro.hadoop.io.AvroSerializer}
>   @Override
>   public void open(OutputStream outputStream) throws IOException {
>     mOutputStream = outputStream;
>     mAvroEncoder = mEncoderFactory.binaryEncoder(outputStream, mAvroEncoder);
>   }
> {code}
> {{org.apache.avro.io.EncoderFactory.binaryEncoder}} ignores this setting. It is only relevant for calls to {{org.apache.avro.io.EncoderFactory.blockingBinaryEncoder}}, which uses the configured "Block Size" when doing binary encoding of blocked Array types as laid out in the [specs|https://avro.apache.org/docs/1.8.2/spec.html#binary_encode_complex]. The setting can simply be removed.
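For readers unfamiliar with the "blocked" array encoding that the linked spec section describes, here is a minimal sketch that hand-encodes an array of longs in both forms the spec allows: a plain block (item count, items, terminating zero) and a block whose count is negative and is followed by the block's size in bytes. It deliberately does not use the Avro library; the class {{BlockedArraySketch}} and its method names are hypothetical, and the sketch makes no claim about the exact bytes either Avro encoder emits for a given block size.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper, not part of Avro: hand-encodes an array of longs per the spec. */
final class BlockedArraySketch {

    /** Writes a long as a zig-zag, variable-length integer (the spec's int/long encoding). */
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);              // zig-zag: small magnitudes get small codes
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));   // low 7 bits with the continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                         // final byte, continuation bit clear
    }

    /** Plain form: one block of (count, items), then a zero count that ends the array. */
    static byte[] encodePlain(long[] items) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (items.length > 0) {
            writeLong(out, items.length);
            for (long item : items) {
                writeLong(out, item);
            }
        }
        writeLong(out, 0);
        return out.toByteArray();
    }

    /** Sized form: negative count, then the block's byte size, then the items, then zero. */
    static byte[] encodeWithByteSize(long[] items) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (long item : items) {
            writeLong(body, item);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (items.length > 0) {
            writeLong(out, -items.length);          // negative count signals "byte size follows"
            writeLong(out, body.size());
            out.write(body.toByteArray(), 0, body.size());
        }
        writeLong(out, 0);
        return out.toByteArray();
    }

    /** Formats bytes as space-separated two-digit hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long[] items = {1, 2, 3};
        System.out.println(hex(encodePlain(items)));        // 06 02 04 06 00
        System.out.println(hex(encodeWithByteSize(items))); // 05 06 02 04 06 00
    }
}
```

The sized form lets a reader skip a whole block without decoding its items. Per the description above, the configured block size only matters to the encoder obtained from {{blockingBinaryEncoder}}, which is why the setting has no effect in {{AvroSerializer}}.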



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
