arrow-user mailing list archives

From: Jacques Nadeau <jacq...@apache.org>
Subject: Re: memory mapped record batches in Java
Date: Sun, 26 Jul 2020 16:32:53 GMT

On Sun, Jul 26, 2020 at 5:52 AM Chris Nuernberger <chris@techascent.com>
wrote:

> Hmm, sounds reasonable enough.  I may be mistaken but it appears to me
> that the fact that the current code relies on mutably updating the vector
> schema root does preclude concurrent access or parallelized access to
> multiple record batches.  Potentially a map-batch method that returns a new
> vector-schema-root each time would work.
>

Yeah, you could do something like that. The issue you may see, depending on
your vector/batch sizes, is increased heap usage. The stream-based design of
the current classes was built to minimize heap churn when working with
large pipelines.
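
To make the trade-off concrete, here is a rough sketch in plain Java. It
does not use the Arrow API: Root, ToyReader, loadNextBatch, getRoot and
mapBatch are hypothetical stand-ins, and the rootsCreated counter exists
only to show how many wrapper objects each style puts on the heap.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

/** Hypothetical stand-in for a vector schema root (not the Arrow class). */
final class Root {
    long[] values = new long[0];
}

/** Toy reader offering both access styles over the same batches. */
final class ToyReader {
    private final List<long[]> batches;
    private final Root shared = new Root();   // the single, reused root
    private int next = 0;
    // Counts wrapper objects created; starts at 1 for the shared root.
    final AtomicInteger rootsCreated = new AtomicInteger(1);

    ToyReader(List<long[]> batches) {
        this.batches = batches;
    }

    int batchCount() {
        return batches.size();
    }

    /** Stream style: overwrite the shared root in place; false at end of data. */
    boolean loadNextBatch() {
        if (next >= batches.size()) {
            return false;
        }
        shared.values = batches.get(next++);
        return true;
    }

    Root getRoot() {
        return shared;
    }

    /** Map-batch style: a new root per call, safe to hand to another thread. */
    Root mapBatch(int index) {
        rootsCreated.incrementAndGet();
        Root root = new Root();
        root.values = batches.get(index);
        return root;
    }

    static long sum(Root root) {
        long total = 0;
        for (long v : root.values) {
            total += v;
        }
        return total;
    }

    /** Sequential only: every batch passes through the one mutable root. */
    static long sumStreaming(ToyReader reader) {
        long total = 0;
        while (reader.loadNextBatch()) {
            total += sum(reader.getRoot());
        }
        return total;
    }

    /** Parallel: each batch gets its own root, at the cost of one allocation per batch. */
    static long sumParallel(ToyReader reader) {
        return IntStream.range(0, reader.batchCount())
                .parallel()
                .mapToLong(i -> sum(reader.mapBatch(i)))
                .sum();
    }
}
```

In this toy version the streaming path only ever holds one root, while the
map-batch path creates one per batch. With many small batches that is where
I would expect the extra heap churn to show up.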
