arrow-user mailing list archives

From Joris Peeters <joris.mg.peet...@gmail.com>
Subject Re: [Java JDBC adapter] non-nullable fields?
Date Thu, 06 May 2021 07:09:27 GMT
Hello Fan,

Yes, but it seems that code path only affects the consumers, and whether
they set a value in the vector or not, see e.g.
https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/DoubleConsumer.java#L57
However, the VectorSchemaRoot's schema, defined I believe at
https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowVectorIterator.java#L59,
does not appear to use this info, and just sets every column's nullability
to true (as per the link in my original email).

Note that we are indeed using the ArrowVectorIterator, and it's when
iterating over the iterator and inspecting the schema of the elements
(VectorSchemaRoot) that I notice this.
Maybe all this needs is an `isColumnNullable(i, ..)` instead of `true` in
`final FieldType fieldType = new FieldType(true, arrowType, /* dictionary
encoding */ null, metadata);`.
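
A minimal sketch of that idea, using only the standard `java.sql` API. The class name `NullabilitySketch` and both helpers are hypothetical, written for illustration; they are not the adapter's actual code, and treating "unknown" nullability as nullable is an assumption on my part.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

public class NullabilitySketch {

  /** Maps a JDBC nullability code to the boolean passed as the "nullable" flag. */
  public static boolean nullableFromJdbc(int jdbcNullability) {
    // Only columns the driver reports as NOT NULL become non-nullable.
    // columnNullable and columnNullableUnknown both stay nullable (assumption:
    // erring on the side of nullable is the safe default).
    return jdbcNullability != ResultSetMetaData.columnNoNulls;
  }

  /** Hypothetical helper; column is 1-based, as everywhere in JDBC. */
  public static boolean isColumnNullable(ResultSetMetaData metaData, int column)
      throws SQLException {
    return nullableFromJdbc(metaData.isNullable(column));
  }

  // Intended use at the line quoted above (not compiled here, needs Arrow):
  //   new FieldType(isColumnNullable(rsmd, i), arrowType, null, metadata);
}
```

With something like this, a column declared `NOT NULL` in SQL Server would yield `false` for the flag, provided the driver reports `columnNoNulls` for it.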

Cheers,
-J

On Thu, May 6, 2021 at 5:53 AM Fan Liya <liya.fan03@gmail.com> wrote:

> Hi Joris,
>
> Thanks for reporting the problem.
>
> We make use of the nullable information in ArrowVectorIterator#initialize.
> (Details can be found in
> https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowVectorIterator.java#L73
> )
>
> Please note that the ArrowVectorIterator is our encouraged way of using
> the JDBC adapter.
>
> Best,
> Liya Fan
>
>
> On Wed, May 5, 2021 at 1:42 PM Micah Kornfield <emkornfield@gmail.com>
> wrote:
>
>> I would need to look further, but I thought we handled null vs not null.
>> At least I thought we had specialized conversion code to avoid branches.
>> If this isn't the case it seems reasonable to contribute a patch.
>>
>> On Tue, May 4, 2021 at 3:39 AM Joris Peeters <joris.mg.peeters@gmail.com>
>> wrote:
>>
>>> I'm looking to use the Java JDBC adapter for loading tables from SQL
>>> Server into Arrow record batches.
>>>
>>> At first glance the Arrow JDBC adapter seems to work well but, unless
>>> I'm mistaken, it simply makes every vector nullable, irrespective of
>>> whether the corresponding SQL column is nullable or not.
>>>
>>> I think the line
>>>
>>> final FieldType fieldType = new FieldType(true, arrowType, /* dictionary
>>> encoding */ null, metadata);
>>>
>>> in
>>> https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java#L158
>>> might be the cause here.
>>>
>>> Is my interpretation correct, or am I missing a setting of sorts? If
>>> indeed correct, is there a fundamental reason the NULL-ness is not
>>> transferred, or is this something I could contribute in a PR? (which I'd be
>>> happy to) I guess it's just a matter of inspecting the result metadata.
>>>
>>> Cheers,
>>> -J
>>>
>>
