arrow-dev mailing list archives

From Jacques Nadeau <jacq...@apache.org>
Subject Re: [JAVA] Figuring out whats shifted from Drill/Java
Date Thu, 16 Jun 2016 15:59:06 GMT
> Netty buffers are always allocated aligned to 64 bytes, so each new
> ArrowBuf will be aligned to 64 bytes as well, with offset = 0.
>

You confirmed that both the Netty chunks and the buffer allocations
(the ArrowBufs returned from here [1]) are on 64-byte offsets? Can you maybe
write some tests or add an assertion to the code so we are protected against
that changing?
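
To make the request concrete, here is a minimal sketch of the kind of check I have in mind. The class and method names (AlignmentCheck, isAligned, checkAligned) are hypothetical and not part of the Arrow code base; the sketch only assumes that the test can get at the buffer's native memory address as a long.

```java
// Hypothetical helper, not part of Arrow: checks that a native memory
// address falls on a 64-byte boundary.
final class AlignmentCheck {
    static final long ALIGNMENT = 64; // bytes, must be a power of two

    // True when the address is a multiple of 64.
    static boolean isAligned(long address) {
        return (address & (ALIGNMENT - 1)) == 0;
    }

    // Fails loudly when the address is not 64-byte aligned, so a change
    // in allocator behaviour is caught immediately.
    static void checkAligned(long address) {
        if (!isAligned(address)) {
            throw new IllegalStateException(
                "buffer address 0x" + Long.toHexString(address)
                    + " is not 64-byte aligned");
        }
    }
}
```

A unit test could allocate buffers of several sizes and pass each buffer's address to checkAligned; the same call could also sit behind an assertion in the allocation path.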


>
> I don't fully understand why new allocations should be on 64-byte
> offsets?
>

As part of the Arrow spec, each separate piece of memory must be aligned to
a 64-byte boundary and padded to a multiple of 64 bytes. For example, if you
have a NullableVarChar, you'll need three buffers: the nullable bits, the
four-byte offsets and the data buffer. Each of those must start on a 64-byte
offset and have a length that is a multiple of 64 bytes.
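
As a rough illustration of the padding rule, the sketch below rounds the three NullableVarChar buffer lengths up to multiples of 64 bytes. The class and method names are hypothetical, and the sizing assumptions (one validity bit per value, one four-byte offset per value plus one trailing offset) are mine rather than taken from the current Java code.

```java
// Hypothetical sketch, not Arrow code: padded sizes of the three buffers
// behind a NullableVarChar vector.
final class PaddingSketch {
    static final long ALIGNMENT = 64; // bytes, must be a power of two

    // Round a byte count up to the next multiple of 64.
    static long paddedLength(long byteCount) {
        return (byteCount + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
    }

    // Assumption: one validity bit per value, packed eight to a byte.
    static long validityBufferLength(int valueCount) {
        return paddedLength((valueCount + 7L) / 8);
    }

    // Assumption: four bytes per offset, with valueCount + 1 offsets.
    static long offsetBufferLength(int valueCount) {
        return paddedLength(4L * (valueCount + 1L));
    }

    // The data buffer only needs its raw byte count rounded up.
    static long dataBufferLength(long dataBytes) {
        return paddedLength(dataBytes);
    }
}
```

Under those assumptions, 100 values holding 1000 bytes of character data would need 64 bytes for the nullable bits, 448 bytes for the offsets and 1024 bytes for the data.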

[1]
https://github.com/apache/arrow/blob/master/java/memory/src/main/java/org/apache/arrow/memory/BufferAllocator.java#L37


>
> -Kiril
>
>
>
> On Jun 14, 2016, at 00:22, Jacques Nadeau <jacques@apache.org> wrote:
>
> Yes, I think there are two main components. Also, I accidentally said 64
> bits when I should have said 64 bytes.
>
> 1. New allocations should be on 64 byte offsets
> 2. Serializing existing vectors must be done such that they are always a
> multiple of 64 bytes long. This is necessary to avoid copying when sending
> across the wire; otherwise the receiver would need to slice up/copy the
> incoming datastream. This would be done by ensuring that setValueCount
> and similar operations (capacity) work at the right granularity. I'd expect
> this second one to be best done on top of Steven's work.
>
>
>
>
>
> On Mon, Jun 13, 2016 at 2:14 PM, Kiril Menshikov <kmenshikov@gmail.com>
> wrote:
>
> Hi,
>
> Does this mean that the offset must be adjusted depending on the UDLE
> memory, so that the new memory address will be aligned to 64 bits?
>
>
> The first thing we should do for the alignment in Java is adjust the
> allocator so that it always allocates on a 64 bit offset. Does someone
> want to look at that?
>
>
>
> Thanks,
> -Kiril
>
> On Jun 11, 2016, at 22:45, Jacques Nadeau <jacques@apache.org> wrote:
>
> Steven is on vacation for a couple of days. His focus as I understand it
> is rationalizing the code so it is cleaner, correct for arrow versus drill
> representation differences (such as decimal, nulls, etc) and has more
> unit tests. Once he gets back in the next day or two, hopefully he can
> post a wip patch.
>
> The first thing we should do for the alignment in Java is adjust the
> allocator so that it always allocates on a 64 bit offset. Does someone
> want to look at that?
>
> On Jun 10, 2016 5:35 PM, "Gaurav Agarwal" <gaurav130403@gmail.com> wrote:
>
>
> I am also interested in this. Do we need to know Drill before we start
> implementing this for Arrow?
> On Jun 10, 2016 9:45 PM, "Wail Alkowaileet" <wael.y.k@gmail.com> wrote:
>
> On Wed, Jun 8, 2016 at 9:26 PM, Micah Kornfield <emkornfield@gmail.com>
> wrote:
>
> Hi Steven,
> Is the patch focused on the alignment/padding, or are there other
> issues as well?
>
>
> I'm interested in this as well.
>
>
> Thanks,
> Micah
>
> On Tue, Jun 7, 2016 at 11:22 PM, Steven Phillips <steven@dremio.com>
> wrote:
>
> I am currently working on a patch that addresses this, as well as
> removing some of the residual code from Drill that isn't really needed in
> Arrow (such as the Drill types, MaterializedField, etc.)
>
> I will be posting this within a few days.
>
> On Tue, Jun 7, 2016 at 5:54 PM, Leif Walsh <leif.walsh@gmail.com> wrote:
>
> I am also interested in this.
>
> On Tue, Jun 7, 2016 at 17:37 Holden Karau <holden@pigscanfly.ca> wrote:
>
>
> Hi Everyone,
>
> I'm looking to help get started with Arrow & Spark and to that end I'd
> like to start with getting the Java implementation closer to the spec / C
> implementation. I'm wondering what places people know the differences are
> between the two?
>
> Cheers,
>
> Holden :)
>
> --
> --
> Cheers,
> Leif
>
>
>
>
>
> --
>
> *Regards,*
> Wail Alkowaileet
>
