arrow-dev mailing list archives

From Micah Kornfield <emkornfi...@gmail.com>
Subject Re: [Discuss] Array Cast Kernels Support Matrix
Date Tue, 05 Mar 2019 07:21:25 GMT
Hi Neville,
In case it helps you do some digging, most of the allowed casts in C++ can
be found at [1].

* It does support Utf8 to Boolean, but I don't believe it supports Boolean
to Utf8.
* It looks like it does support casting List to List.
* It doesn't support Struct to Struct.
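
To make the List to List case concrete, here is a minimal sketch in plain Python of one way such a cast could work: the offsets are kept as they are and only the flat child values are cast, which is why it only needs the child types to be castable. This is not the C++ implementation; the tuple layout, `cast_u32_to_i32`, `cast_list` and `to_pylist` are hypothetical names invented for illustration, and raising on overflow is an assumption of the sketch.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical, simplified stand-in for a list array:
# offsets[i]..offsets[i + 1] delimit the i-th list inside the flat `values`.
ListArray = Tuple[List[int], List[Optional[int]]]

I32_MAX = 2**31 - 1


def cast_u32_to_i32(value: Optional[int]) -> Optional[int]:
    """Cast one u32 value to i32; nulls pass through, overflow raises."""
    if value is None:
        return None
    if not (0 <= value <= I32_MAX):
        # Assumption of this sketch: out-of-range values are an error.
        raise OverflowError(f"{value} does not fit in i32")
    return value


def cast_list(
    array: ListArray,
    cast_child: Callable[[Optional[int]], Optional[int]],
) -> ListArray:
    """Cast a list array by casting its child values and reusing the offsets."""
    offsets, values = array
    return list(offsets), [cast_child(v) for v in values]


def to_pylist(array: ListArray) -> List[List[Optional[int]]]:
    """Expand (offsets, values) into nested Python lists for inspection."""
    offsets, values = array
    return [values[start:end] for start, end in zip(offsets, offsets[1:])]


source: ListArray = ([0, 2, 2, 5], [1, 2, 3, None, 5])
result = cast_list(source, cast_u32_to_i32)
```

Because the offsets are untouched, the list boundaries (including the empty list in the middle) survive the cast unchanged.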

In general, I'm not sure consistency is important between different
implementations of compute engines (unless they all have the goal of
implementing some standard like one of the SQL-XX). So it might be nice,
but I don't think we should be rigorous about it. I'd like to hear other
opinions on this though.

In C++ at least, I think some of the missing casts could likely be handled
by other kernels (if not added directly to casts).

Thanks,
Micah

[1]
https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/generated/cast-codegen-internal.h

On Mon, Mar 4, 2019 at 10:08 AM Neville Dipale <nevilledips@gmail.com>
wrote:

> Hi Arrow devs,
>
> I'm currently adding support for casting arrays in Rust, and I'm wondering
> what casting operations should be supported, and how. Most operations are
> simple, but I have a few questions below.
>
> * Struct to Struct: I am not supporting this in Rust as it might not make
> sense/be easy to support. Is that fine?
>
> * List of type A to List of type B: does it make sense to support casting
> as long as the underlying types can be cast to each other? I'm thinking of
> casting a list of u32 to list of i32
>
> * Boolean to Decimal: Apache Impala doesn't support this
> (https://impala.apache.org/docs/build/html/topics/impala_boolean.html).
> Should we follow suit here?
>
> * Boolean to Utf8: Impala casts 'true' to '1', what are Arrow
> implementations doing? I'll also have a look at the CPP codebase.
>
> * Utf8 to Boolean: Impala disallows this, but a case could be made for
> supporting this, with the cast operation being limited to (true/false),
> (T/F) and what CSV readers infer to be true or false. This could be useful
> when reading CSV files (in Rust)
>
> * Primitive to List: I was thinking of creating a list with 1 value for the
> primitive (provided the list type is compatible with the from primitive).
> Is this too extreme? We could perhaps leave this out and support it someday
> in array operations
>
> With regards to temporal arrays, should casting date and time to primitive
> types be supported? The inverse makes sense as I might have an Int32Array
> with millisecond values that I want to cast to a Timestamp or a Date32.
>
> If there's interest/benefit in documenting the above for future consistency
> among the various languages, I don't mind documenting something in the
> coming days/weeks.
>
> Thanks and Regards
>
> Neville
>
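
As an illustration of the Utf8 to Boolean proposal in the quoted message, the following plain-Python sketch accepts only the spellings mentioned there (true/false and T/F). It is not taken from any Arrow implementation: the function name is hypothetical, and the case-insensitive matching, whitespace trimming, and mapping of unrecognized strings to null (rather than raising an error) are all assumptions of the sketch.

```python
from typing import List, Optional

# Assumed spellings, based on the "(true/false), (T/F)" suggestion.
TRUE_STRINGS = {"true", "t"}
FALSE_STRINGS = {"false", "f"}


def cast_utf8_to_boolean(values: List[Optional[str]]) -> List[Optional[bool]]:
    """Cast strings to booleans; nulls and unrecognized strings become null."""
    result: List[Optional[bool]] = []
    for value in values:
        if value is None:
            result.append(None)
            continue
        # Assumption: ignore surrounding whitespace and letter case.
        normalized = value.strip().lower()
        if normalized in TRUE_STRINGS:
            result.append(True)
        elif normalized in FALSE_STRINGS:
            result.append(False)
        else:
            # Assumption: unparseable input yields null instead of an error.
            result.append(None)
    return result


parsed = cast_utf8_to_boolean(["true", "F", None, "yes", " T "])
```

A stricter variant could raise on unrecognized input instead; which behaviour is preferable is exactly the kind of question the thread leaves open.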
