asterixdb-dev mailing list archives

From "Till Westmann" <ti...@apache.org>
Subject Re: Homogeneous lists with nullable items
Date Fri, 18 Dec 2015 09:57:47 GMT
Hi Ildar,

it seems that we have 2 separate points here:
1) There are bugs in the way we decide which list representation to use 
and
2) we could add support for (and an optimized representation for) a list 
of a fixed but nullable type.
It seems that - by fixing 1) - we could get rid of the issues you’ve 
listed.

But I also think that it would be nice to support lists of a nullable 
type (feels like an omission that we don’t support that in the 
language) - and potentially provide an efficient representation for 
them.
However, it is not clear to me how we would do this.
A few thoughts:
- Would we maintain the current representation for homogeneous lists of 
non-nullable types?
- Would we introduce a new type tag for “nullable lists”?
- Would we redefine the current representation to mean something else?
Do you have thoughts on those?

Cheers,
Till

On 16 Dec 2015, at 8:12, Ildar Absalyamov wrote:

> Hi devs,
>
> Recently I have been playing around with lists and with functions that 
> receive or return lists, and I have noticed one particular behavior 
> that seems to be incorrect.
> As you might know, internally we support 2 types of lists: 
> homogeneous, where all the items are untagged and the item type is 
> stored in the type definition, and heterogeneous, where the items are, 
> on the contrary, tagged and the list item type is effectively ANY.
> The decision on which of the two types is used is usually made by the 
> parser, or is altered by IntroduceEnforcedListTypeRule, which 
> effectively turns a heterogeneous list into a homogeneous one if all 
> the items have the same type.
> Right now we allow only homogeneous lists to be defined as a field in 
> a type, and we also restrict the item type to be a non-nullable 
> type:
> create type listType {
> "id":int64,
> "list":[int64]   // [int64?] is not possible
> }
>
> This constraint spans both the language level and serialization. 
> Under that restriction, the only way to load a list that contains 
> null values is to make the corresponding field open (open lists are 
> heterogeneous by definition).
>
> 1) It seems like we’re missing an optimization opportunity when we are 
> dealing with large sparse lists. Serialization in this case might use 
> a bit mask to specify which items in the list are not null, and then 
> encode only those items.
> 2) I believe that if we alter IntroduceEnforcedListTypeRule to enforce 
> a homogeneous list with a nullable item type, we might resolve issues 
> https://issues.apache.org/jira/browse/ASTERIXDB-905, 
> https://issues.apache.org/jira/browse/ASTERIXDB-867, 
> https://issues.apache.org/jira/browse/ASTERIXDB-1131 all at once.
>
> Thoughts?
>
> Best regards,
> Ildar
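
The bit-mask idea in point 1) of the quoted message could look roughly like the sketch below. It is only an illustration in Python, not AsterixDB's actual serialization format: the byte layout (a 4-byte item count, then the mask, then the packed non-null int64 values) and both function names are assumptions made for this example.

```python
import struct
from typing import List, Optional


def encode_nullable_int64_list(items: List[Optional[int]]) -> bytes:
    """Encode a list of nullable int64 values (hypothetical layout).

    Layout assumed for this sketch:
      4 bytes  big-endian item count
      N bytes  bit mask, one bit per item, 1 = item is present (not null)
      8 bytes  per non-null item, big-endian signed int64, in list order
    """
    count = len(items)
    mask = bytearray((count + 7) // 8)
    payload = bytearray()
    for i, item in enumerate(items):
        if item is not None:
            mask[i // 8] |= 1 << (i % 8)       # mark item i as present
            payload += struct.pack(">q", item)  # untagged value, no type tag
    return struct.pack(">I", count) + bytes(mask) + bytes(payload)


def decode_nullable_int64_list(data: bytes) -> List[Optional[int]]:
    """Inverse of encode_nullable_int64_list."""
    (count,) = struct.unpack_from(">I", data, 0)
    mask_len = (count + 7) // 8
    mask = data[4:4 + mask_len]
    offset = 4 + mask_len
    items: List[Optional[int]] = []
    for i in range(count):
        if mask[i // 8] & (1 << (i % 8)):
            (value,) = struct.unpack_from(">q", data, offset)
            offset += 8
            items.append(value)
        else:
            items.append(None)  # null items take no space in the payload
    return items
```

Under this assumed layout, a 16-item list with a single non-null value takes 4 + 2 + 8 = 14 bytes, because null items cost one mask bit each and nothing in the payload. The items stay untagged, as in the existing homogeneous representation, since the item type would still come from the type definition.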
