calcite-dev mailing list archives

From andrew <and...@primer.org>
Subject Re: UNION Type in Calcite
Date Thu, 17 Dec 2015 18:52:52 GMT
Julian,

I think the difference is that, whereas in a C union you have a number of named variables,
e.g. u.i, in our case we are dealing not with typed variable declarations, but rather with
the types themselves. You would not be able to reference into the union’s members with a
“.”; there are no members to reference.

As Jacques mentioned, I think it’s easier to think of this as akin to ANY. Maybe thinking
of it as ANY_OF(INT, ARRAY(INT)) makes it easier to consider.

If you want to keep the definition of UNION closer to that of a C struct, then perhaps we
can:

A) Add the ANY_OF type
B) Modify ANY to be parametrizable with zero or more types
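
To make the "narrowed ANY" reading concrete, here is a rough sketch in C. It is only an illustration under assumptions: the names (type_set, may_apply, the T_* tags) are made up for this example and are not Calcite APIs, and treating an empty set as plain ANY is one possible way to model option B.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type tags, one bit each, so a set of types fits in one word.
   Illustrative only; these are not Calcite type names. */
enum {
    T_INT       = 1u << 0,
    T_INT_ARRAY = 1u << 1,
    T_VARCHAR   = 1u << 2
};

/* A set of permitted types. There are no named members, so there is nothing
   to reach into with "."; a value is simply one of the listed types.
   Assumption: the empty set stands for plain, unconstrained ANY. */
typedef unsigned type_set;
#define ANY ((type_set)0)

/* True if a value declared as `arg` may be passed to a function accepting the
   types in `accepted`. Unconstrained ANY on either side always passes;
   otherwise the two sets must share at least one type. */
bool may_apply(type_set arg, type_set accepted) {
    if (arg == ANY || accepted == ANY) return true;
    return (arg & accepted) != 0;
}

/* Small usage example for a column declared ANY_OF(INT, ARRAY(INT)). */
void self_check(void) {
    type_set col = T_INT | T_INT_ARRAY;
    assert(may_apply(col, T_INT));        /* function on INT: allowed */
    assert(may_apply(col, T_INT_ARRAY));  /* function on ARRAY(INT): allowed */
    assert(!may_apply(col, T_VARCHAR));   /* function on VARCHAR: rejected */
    assert(may_apply(ANY, T_VARCHAR));    /* plain ANY is not narrowed */
}
```

The point of the sketch is that validation works on the set of possible types, not on a chosen member: c + 4 would pass because INT is one of the possibilities, while a VARCHAR-only function would be rejected.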

- A


> On Dec 16, 2015, at 8:54 PM, Julian Hyde <jhyde@apache.org> wrote:
> 
> Jacques,
> 
> What kind of a union type were you thinking of? I was thinking of
> something like a C union, where you still need to use a field to
> indicate which sub-type you want. In C if you have
> 
>  typedef union {
>    int i;
>    double d;
>  } u;
> 
>  void foo(u);
>  void foo(int);
>  void foo(double);
> 
> and you write
> 
>  union u;
>  foo(u);
> 
> then the first "foo" gets called, and if you write
> 
>  foo(u.d);
> 
> then the last "foo" gets called.
> 
> The only difference between union and struct in C is that in the
> union, the members occupy the same storage. So what I'm proposing for
> Calcite is basically a struct.
> 
> Julian
> 
> 
> On Wed, Dec 16, 2015 at 6:36 PM, Jacques Nadeau <jacques@apache.org> wrote:
>> I don't think it would. We want to still do validation and we won't be
>> returning a struct, we'll be returning one or the other. Think of this as a
>> narrowing of the ANY type to a subset of known possibilities. Something
>> that fits a particular possibility should be allowed but for example, in
>> the (varbinary, varbinary[]) case, you should be able to use functions that
>> only support varbinary or varbinary[] but not functions that expect int or
>> varchar.
>> 
>> Without specifically stating c.i or c.ai, you wouldn't get any validation.
>> Additionally, I'd expect the validator to reject valid expressions such as
>> c + 4 (where c is the union field).
>> 
>> On Wed, Dec 16, 2015 at 4:23 PM, andrew <andrew@primer.org> wrote:
>> 
>>> That sounds like it might fit the bill. Thanks Julian.
>>> 
>>> 
>>>> On Dec 16, 2015, at 1:30 PM, Julian Hyde <jhyde.apache@gmail.com> wrote:
>>>> 
>>>> You could declare a STRUCT(i INT, ai ARRAY(INT)) and make sure exactly
>>> one of i and ai is set.
>>>> 
>>>>> On Dec 16, 2015, at 10:51 AM, andrew <andrew@primer.org> wrote:
>>>>> 
>>>>> I’m wondering if it would be possible to add a UNION type in Calcite.
>>>>> 
>>>>> My use case is that I am developing a backend storage engine using
>>> Calcite that only partially declares its schema. Specifically, the engine
>>> declares columns such as ‘int’ when it actually means the column can
>>> contain ‘int’ or ‘array of int’. There is no way to tell without actually
>>> reading the table if the data is scalar or array. Indeed it can be both.
>>>>> 
>>>>> My idea is that if Calcite had a UNION type, I could declare the
>>> columns as e.g. UNION(INT, ARRAY(INT)).
>>>>> 
>>>>> Does this sound reasonable? Or is there a better way of handling this
>>> situation using the current features of Calcite?
>>>>> 
>>>>> Thanks.
>>>>> - A
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 

