You can allocate an exact size with allocateNew for both fixed-width [1] and variable-width [2] vectors.

[1] https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseFixedWidthVector.java#L292
[2] https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L401

You can then call the set method per cell, or grab the buffer's memory address (e.g. getDataBufferAddress()) and use Unsafe to bulk copy. The latter is more advanced: you also have to populate the validity buffer yourself.
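
For the bulk route, here is a rough sketch of the two pieces you would need to compute yourself before copying: the validity bitmap and, for variable-width columns, the offsets. It is plain Java with no Arrow calls, and the class and method names (ArrowLayoutSketch, packValidity, buildOffsets) are made up for illustration. It assumes the Arrow columnar layout: one validity bit per value, least-significant bit first, with 1 meaning non-null, and 32-bit offsets with one more entry than there are values.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Arrow: builds on-heap copies of the
// validity bitmap and offsets that a bulk copy would have to supply.
public class ArrowLayoutSketch {

    // Pack one bit per value, least-significant bit first; 1 = non-null.
    public static byte[] packValidity(boolean[] valid) {
        byte[] bitmap = new byte[(valid.length + 7) / 8];
        for (int i = 0; i < valid.length; i++) {
            if (valid[i]) {
                bitmap[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bitmap;
    }

    // Build n + 1 offsets for n variable-width values.
    // A null value contributes zero bytes, so its start and end offsets match.
    public static int[] buildOffsets(byte[][] values) {
        int[] offsets = new int[values.length + 1];
        for (int i = 0; i < values.length; i++) {
            int length = (values[i] == null) ? 0 : values[i].length;
            // addExact fails loudly if the total exceeds the 32-bit offset range.
            offsets[i + 1] = Math.addExact(offsets[i], length);
        }
        return offsets;
    }

    public static void main(String[] args) {
        boolean[] valid = {true, false, true};
        byte[][] values = {
            "ab".getBytes(StandardCharsets.UTF_8),
            null,
            "cde".getBytes(StandardCharsets.UTF_8)
        };
        System.out.println(Arrays.toString(packValidity(valid)));
        System.out.println(Arrays.toString(buildOffsets(values)));
    }
}
```

The last offset is also the total number of data bytes, which is the size you would pass when pre-allocating a variable-width vector. Copying these arrays into the vector's buffers is left out here, since that part depends on how you access Unsafe.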


On Sat, Jul 25, 2020 at 6:02 AM Chris Nuernberger <chris@techascent.com> wrote:
Hey,

I would like to have bulk methods for copying data into a vector. Specifically, I have an existing data table, so I know the desired lengths of the columns. I can also precalculate the necessary buffer sizes for any variable-sized column.


What I don't see is how to pre-allocate columns of a given size. When I use setValueCount on a column and then use the set method, I get a Netty error. What I was hoping for is some allocation method, especially for primitive data, that allocates the desired uninitialized memory for the validity and data buffers and then hands those two buffers back to me so I can use memcpy and friends as opposed to repeated calls to setSafe.


Repeated calls to setSafe are time-consuming, not parallelizable, and unnecessary when I know the data rectangle I would like to transfer into a record batch.


In my case I have the data pre-cut. How would you recommend copying bulk portions of data (that may be in Java arrays or behind some more abstract interface) into a record batch?

Thanks for any help,

Chris