arrow-user mailing list archives

From Micah Kornfield <>
Subject Re: Blogpost on Arrow's binary format & memory mapping
Date Fri, 14 Aug 2020 05:43:11 GMT
I'd also add that your point:

> There are certainly other situations such as small files where the copying
> pathway is indeed faster, but for these pathways it is not even close.

This is pretty much the intended design of the Java library. Not small
files per se, but small batches streamed through processing pipelines.

On Thu, Aug 13, 2020 at 7:59 PM Micah Kornfield <> wrote:

> Hi Chris,
> Nice write-up.  I'm curious if you did more analysis on where time was
> spent for each method?
> It seems to confirm that investing in zero-copy reads from disk provides a
> nice speedup.  I'm curious, did you try to create a buffer allocator
> based on memory-mapped files for comparison?
> Thanks,
> Micah
> On Thursday, August 13, 2020, Chris Nuernberger <>
> wrote:
>> Arrow Users -
>> We took some time and wrote a blogpost on Arrow's binary format and
>> memory mapping on the JVM.  We are happy with how succinctly we broke down
>> the binary format in a visual way and think Arrow users looking to do
>> interesting/unsupported things with Arrow may be interested in the
>> presentation.
>> Chris
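
The two pathways contrasted in this thread, copying a file into heap memory versus reading it in place through a memory mapping, can be sketched with plain java.nio. This is only an illustration under stated assumptions: it does not use Arrow's Java API or its buffer allocators, and the names MmapSketch, sumCopied, sumMapped and writeTempFile are hypothetical helpers invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch class; not part of Arrow.
final class MmapSketch {

    // Copying pathway: the whole file is first copied into a heap byte[].
    static long sumCopied(Path file) {
        try {
            byte[] data = Files.readAllBytes(file);
            long sum = 0;
            for (byte b : data) {
                sum += b & 0xFF; // treat each byte as unsigned
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Memory-mapped pathway: the file is mapped read-only and bytes are
    // read through the mapping, without first copying it into a heap array.
    static long sumMapped(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (buffer.hasRemaining()) {
                sum += buffer.get() & 0xFF;
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: writes the given bytes to a temporary file.
    static Path writeTempFile(byte[] contents) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            Files.write(file, contents);
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeTempFile(new byte[] {1, 2, 3, 4});
        System.out.println("copied: " + sumCopied(file));
        System.out.println("mapped: " + sumMapped(file));
    }
}
```

Both methods return the same result; they differ only in how the bytes reach the reader. Which one is faster depends on file size and access pattern, as the thread notes, so the sketch makes no performance claim.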
