arrow-user mailing list archives

From Jacques Nadeau <jacq...@apache.org>
Subject Re: Bulk copy methods to/from Java vectors
Date Sun, 26 Jul 2020 16:30:24 GMT
On Sun, Jul 26, 2020 at 8:02 AM Chris Nuernberger <chris@techascent.com>
wrote:

> It appears that those methods do not allocate the validity buffer *and*
> the function `allocateValidityBuffer` is private.
>

It allocates both of them at once. To reduce heap usage we co-locate them,
since they are never resized independently.
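
To illustrate the co-location idea, here is a minimal sketch using plain
java.nio rather than Arrow's allocator. The class name ColocatedBuffers, the
layout (validity first, then values), and the 8-byte padding are all
assumptions made for illustration; this is not Arrow's actual allocation code.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: one backing allocation, two views.
final class ColocatedBuffers {
    final ByteBuffer validity; // one bit per value, padded (assumed layout)
    final ByteBuffer values;   // fixed-size slots for the values

    ColocatedBuffers(int valueCount, int bytesPerValue) {
        int validityBytes = (valueCount + 7) / 8;
        // Assumption: pad the validity region to a multiple of 8 bytes
        // so the value region starts on an aligned boundary.
        int validityPadded = (validityBytes + 7) & ~7;
        int valueBytes = valueCount * bytesPerValue;

        // A single allocation backs both regions.
        ByteBuffer combined = ByteBuffer.allocateDirect(validityPadded + valueBytes);

        // Carve the first region out as the validity view.
        ByteBuffer view = combined.duplicate();
        view.position(0);
        view.limit(validityPadded);
        validity = view.slice();

        // Carve the remainder out as the value view.
        view.limit(validityPadded + valueBytes);
        view.position(validityPadded);
        values = view.slice();
    }

    public static void main(String[] args) {
        ColocatedBuffers b = new ColocatedBuffers(10, 4);
        System.out.println(b.validity.capacity() + " " + b.values.capacity());
    }
}
```

Because both views share one allocation, they can only grow together, which
is acceptable when the two buffers are always sized from the same value count.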


> Also it appears that allocate new fails to set the value count for
> BaseVariableWidthVectors.  And if you set the value count after you have
> assigned data then it clears *only* the offset buffer but not the validity
> or the data buffers.


For direct operations on variable-width vectors, you'll need to do the
following steps:
1) call allocateNew()
2) copy in data via memory operations
3) call setLastSet()
4) call setValueCount()

I'm guessing you skipped step 3, so setValueCount sees that you never set
any values and propagates the last offset forward up to the value count.
This is done so you can do something like:
set(1,...)
set(3,...)
setValueCount(7)
and then ordinal positions 4-6 will be offset-filled even though you didn't
set them explicitly. If you do your own work, you have to help the state
model in the variable-width vector understand what you've done.
