avro-dev mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AVRO-1312) Make use of the sun.misc.Unsafe class in the IO streams implementation if a JDK supports it
Date Fri, 18 Aug 2017 13:05:00 GMT

     [ https://issues.apache.org/jira/browse/AVRO-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Sean Busbey updated AVRO-1312:
    Issue Type: Improvement  (was: Bug)

> Make use of the sun.misc.Unsafe class in the IO streams implementation if a JDK supports it
> -------------------------------------------------------------------------------------------
>                 Key: AVRO-1312
>                 URL: https://issues.apache.org/jira/browse/AVRO-1312
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Leo Romanoff
>            Priority: Minor
> This is a follow-up to AVRO-1282.
> AVRO-1282 used Unsafe to significantly improve the performance of reflection-based serialization.
> But Unsafe can also be used to improve the performance of the IO streams, which would benefit
> all kinds of serializers, not only reflection-based ones. Experience with Kryo shows
> that this can yield speedups even greater than those provided by AVRO-1282.
> Pros:
> - Overall performance boost
> - The biggest speedups from this optimization are expected for arrays of primitive types,
> as they can be written very efficiently using bulk operations instead of writing their elements
> one by one.
> - It is possible to write directly into off-heap memory buffers at native speed,
> without using intermediate byte arrays. This can be interesting for Big Data apps, which often
> keep a lot of data off-heap.
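To make the two points above concrete, here is a minimal sketch of element-by-element versus bulk writing of a `double[]`, plus a bulk write into an off-heap buffer. It deliberately uses `java.nio.ByteBuffer` as a portable stand-in for `sun.misc.Unsafe`, so it illustrates the bulk-copy idea rather than the Unsafe-based implementation proposed in the issue. The class and method names are hypothetical and not part of Avro.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Avro and not using Unsafe.
final class BulkWriteSketch {

    // Element-by-element: one put call per value.
    static byte[] writeOneByOne(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    // Bulk: a single put call transfers the whole array through a double view.
    static byte[] writeBulk(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.asDoubleBuffer().put(values);
        return buf.array();
    }

    // Bulk write into a direct (off-heap) buffer, with no intermediate byte[].
    static ByteBuffer writeBulkOffHeap(double[] values) {
        ByteBuffer buf = ByteBuffer.allocateDirect(values.length * Double.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.asDoubleBuffer().put(values);
        return buf;
    }
}
```

Both heap variants produce the same bytes. Little-endian order is chosen here because it is the order Avro uses for doubles and is also the native order on x86. Whether the bulk path is actually faster depends on the JVM and the hardware, so it should be measured rather than assumed.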
> Cons:
> - Unsafe can efficiently write primitive types only in their native byte order and at
> their fixed size. This is not quite compatible with Avro's format.
> (While one can still use Unsafe to write single elements more efficiently even with
> their variable-length encoding, the biggest benefits of bulk array serialization would be
> - Introducing this feature may require defining a new format for Avro. This format
> would be very fast, but not very space-efficient, as it would use a fixed-size representation.
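The space trade-off described above can be sketched as follows. Avro's binary encoding writes `int` and `long` values with zig-zag coding followed by a variable-length base-128 encoding, whereas a bulk-friendly format would spend a fixed 8 bytes per `long`. The class and method names below are hypothetical, and the encoder is a simplified illustration, not Avro's own implementation.

```java
// Hypothetical illustration class; not part of Avro.
final class VarLenVsFixedSketch {

    // Zig-zag plus base-128 varint encoding of a long.
    // Writes into out starting at pos and returns the number of bytes written (1 to 10).
    static int encodeLong(long n, byte[] out, int pos) {
        long z = (n << 1) ^ (n >> 63); // zig-zag: small magnitudes map to small unsigned values
        int start = pos;
        while ((z & ~0x7FL) != 0) {
            out[pos++] = (byte) ((z & 0x7F) | 0x80); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out[pos++] = (byte) z; // final byte, continuation bit clear
        return pos - start;
    }

    // Total encoded size when every element is written individually.
    static int variableLengthSize(long[] values) {
        byte[] scratch = new byte[10]; // 10 bytes is the maximum for a 64-bit value
        int total = 0;
        for (long v : values) {
            total += encodeLong(v, scratch, 0);
        }
        return total;
    }

    // Size of a fixed-width representation that could be copied in one bulk operation.
    static int fixedSize(long[] values) {
        return values.length * Long.BYTES;
    }
}
```

For the array `{0, 1, -1, 64, 1000}`, the variable-length form takes 7 bytes and the fixed-size form takes 40. The fixed-size form can be copied in bulk, while the variable-length form has to be produced one element at a time.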
> BTW, initial tests, in which a sketch of the proposed optimization was applied only to Floats
> and Doubles, have shown an immediate boost of 35%.

