On Sun, Jul 26, 2020 at 8:02 AM Chris Nuernberger <chris@techascent.com> wrote:
> It appears that those methods do not allocate the validity buffer *and* the function `allocateValidityBuffer` is private.

It allocates both of them at once. To reduce heap usage, we colocate them, since they are never resized independently.


> Also it appears that `allocateNew` fails to set the value count for BaseVariableWidthVectors. And if you set the value count after you have assigned data then it clears *only* the offset buffer but not the validity or the data buffers.

For direct operations on variable-width vectors, you'll need to do the following steps:
1) call allocateNew()
2) copy in data via memory operations
3) call setLastSet()
4) call setValueCount()

I'm guessing you skipped #3, so setValueCount sees that you never set any values and propagates the last offset forward up to the value count. This is done so you can do something like:
set(1,...)
set(3,...)
setValueCount(7)
and then ordinal positions 4-6 will have their offsets filled in even though you didn't set them explicitly. If you do your own work, you have to help the state model in the variable-width vector understand what you've done.
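
To make that state model concrete, here is a toy Java sketch of the offset filling. It is not Arrow's actual code: ToyVarWidthVector, its fields, and the helper methods are made up for illustration, and it models only the offsets and the last-set index (no validity or data buffers). It assumes the fill simply copies the previous offset forward from the position after lastSet up to the requested index, which is my reading of the behavior described above.

```java
import java.util.Arrays;

final class OffsetFillSketch {

    // Hypothetical stand-in for a variable-width vector; only offsets are modelled.
    static final class ToyVarWidthVector {
        final int[] offsets;  // value i occupies bytes offsets[i] .. offsets[i + 1]
        int lastSet = -1;     // highest index the vector knows has been written

        ToyVarWidthVector(int capacity) {
            offsets = new int[capacity + 1];
        }

        // Write a value of the given byte length through the "normal" API path.
        void set(int index, int length) {
            fillHoles(index);
            offsets[index + 1] = offsets[index] + length;
            lastSet = index;
        }

        void setLastSet(int index) {
            lastSet = index;
        }

        void setValueCount(int count) {
            fillHoles(count);
            lastSet = count - 1;
        }

        // Give every unset position in (lastSet, index) a zero-length value
        // by copying the previous offset forward.
        private void fillHoles(int index) {
            for (int i = lastSet + 1; i < index; i++) {
                offsets[i + 1] = offsets[i];
            }
        }
    }

    // set(1, ...), set(3, ...), setValueCount(7) with value lengths 3 and 2.
    static int[] viaSet() {
        ToyVarWidthVector v = new ToyVarWidthVector(7);
        v.set(1, 3);
        v.set(3, 2);
        v.setValueCount(7);
        return v.offsets;
    }

    // Copy offsets in directly, then optionally tell the vector about it.
    static int[] directWrite(boolean callSetLastSet) {
        ToyVarWidthVector v = new ToyVarWidthVector(3);
        System.arraycopy(new int[] {0, 2, 5, 9}, 0, v.offsets, 0, 4);
        if (callSetLastSet) {
            v.setLastSet(2);
        }
        v.setValueCount(3);
        return v.offsets;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(viaSet()));           // [0, 0, 3, 3, 5, 5, 5, 5]
        System.out.println(Arrays.toString(directWrite(true)));  // [0, 2, 5, 9]
        System.out.println(Arrays.toString(directWrite(false))); // [0, 0, 0, 0]
    }
}
```

In this model, skipping the setLastSet step makes setValueCount treat every position as unset, so it overwrites the offsets you copied in while leaving everything else alone, which would match the "clears only the offset buffer" symptom.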