arrow-user mailing list archives

From Micah Kornfield <emkornfi...@gmail.com>
Subject Re: Blogpost on Arrow's binary format & memory mapping
Date Fri, 14 Aug 2020 05:43:11 GMT
I'd also add that your point:

> There are certainly other situations such as small files where the copying
> pathway is indeed faster, but for these pathways it is not even close.

This is pretty much the intended design of the Java library.  Not small
files per se, but small batches streamed through processing pipelines.

On Thu, Aug 13, 2020 at 7:59 PM Micah Kornfield <emkornfield@gmail.com>
wrote:

> Hi Chris,
> Nice write-up.  I'm curious if you did more analysis on where time was
> spent for each method?
>
> It seems to confirm that investing in zero copy read from disk provides a
> nice speedup.  I'm curious, did you attempt to create a buffer allocator
> based on memory-mapped files for comparison?
>
> Thanks,
> Micah
>
> On Thursday, August 13, 2020, Chris Nuernberger <chris@techascent.com>
> wrote:
>
>> Arrow Users -
>>
>> We took some time and wrote a blogpost on Arrow's binary format and
>> memory mapping on the JVM.  We are happy with how succinctly we broke down
>> the binary format in a visual way and think Arrow users looking to do
>> interesting/unsupported things with Arrow may be interested in the
>> presentation.
>>
>> https://techascent.com/blog/memory-mapping-arrow.html
>>
>> Chris
>>
>
