asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Homogeneous lists with nullable items
Date Mon, 21 Dec 2015 08:18:20 GMT
0 is a legit value for some uses of numbers, so we do need an
out-of-band value.  :-)
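
For illustration, here is a minimal sketch of one possible out-of-band encoding, loosely following the bit-mask idea from Ildar's original message quoted below. The function names and the byte layout (a 4-byte item count, then a presence bit mask, then only the non-null int64 values) are assumptions made for this example and are not AsterixDB's actual serialization format.

```python
import struct
from typing import List, Optional


def encode_nullable_int64_list(items: List[Optional[int]]) -> bytes:
    """Hypothetical layout: count (uint32) | presence bit mask | non-null int64s."""
    n = len(items)
    mask = bytearray((n + 7) // 8)  # one bit per item, 1 = value present
    payload = bytearray()
    for i, item in enumerate(items):
        if item is not None:
            mask[i // 8] |= 1 << (i % 8)
            payload += struct.pack(">q", item)  # only non-null items are stored
    return struct.pack(">I", n) + bytes(mask) + bytes(payload)


def decode_nullable_int64_list(data: bytes) -> List[Optional[int]]:
    """Inverse of encode_nullable_int64_list for the same hypothetical layout."""
    (n,) = struct.unpack_from(">I", data, 0)
    mask_len = (n + 7) // 8
    mask = data[4:4 + mask_len]
    offset = 4 + mask_len
    items: List[Optional[int]] = []
    for i in range(n):
        if mask[i // 8] & (1 << (i % 8)):
            (value,) = struct.unpack_from(">q", data, offset)
            offset += 8
            items.append(value)
        else:
            items.append(None)  # null is carried by the mask, not by a sentinel
    return items
```

Because null lives in the mask, a list such as [0, None, 7] round-trips with 0 and null kept distinct, and each null costs one bit rather than a full 8-byte slot.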

On 12/19/15 10:16 PM, Wail Alkowaileet wrote:
> I have a small thought on that one ... would 0 = null work for numerical
> sparse lists? Or would it be better to extend the complex types to have
> "vectors and matrices"?
>
> On Fri, Dec 18, 2015 at 5:45 PM, Mike Carey <dtabass@gmail.com> wrote:
>
>> Agreed.  We probably need a mini design doc here. The short term urgency
>> seems to be a need to represent lists that can include nulls, as this is
>> blocking JPL and is also something easily produced by queries (AQL or
>> SQL++).  Longer term, one can imagine the representation varying (at the
>> lowest level of detail) by list, e.g., you might represent dense and
>> sparse lists quite differently, you might use compression for certain
>> kinds of lists, etc.
>>
>>
>> On 12/18/15 1:57 AM, Till Westmann wrote:
>>
>>> Hi Ildar,
>>>
>>> it seems that we have 2 separate points here:
>>> 1) There are bugs in the way we decide which list representation to use
>>> and
>>> 2) we could add support for (and an optimized representation for) a list
>>> of a fixed but nullable type.
>>> It seems that - by fixing 1) - we could get rid of the issues you’ve
>>> listed.
>>>
>>> But I also think that it would be nice to support lists of a nullable
>>> type (feels like an omission that we don’t support that in the language) -
>>> and potentially provide an efficient representation for them.
>>> However, it is not clear to me how we would do this.
>>> A few thoughts:
>>> - Would we maintain the current representation for homogeneous lists of
>>> non-nullable types?
>>> - Would we introduce a new type tag for “nullable lists”?
>>> - Would we redefine the current representation to mean something else?
>>> Do you have thoughts on those?
>>>
>>> Cheers,
>>> Till
>>>
>>> On 16 Dec 2015, at 8:12, Ildar Absalyamov wrote:
>>>
>>>> Hi devs,
>>>> Recently I have been playing around with lists and with functions that
>>>> receive list parameters or return list values. I have noticed one
>>>> particular behavior that seems to be incorrect.
>>>> As you might know, internally we support 2 types of lists: homogeneous,
>>>> where all the items are untagged and the item type is stored in the type
>>>> definition, and heterogeneous, where on the contrary the items are tagged
>>>> and the list item type is effectively ANY.
>>>> The decision about which of the two types is used is usually made by the
>>>> parser, or is altered by IntroduceEnforcedListTypeRule, which effectively
>>>> turns a heterogeneous list into a homogeneous one if all the items have
>>>> the same type.
>>>> Right now we only allow homogeneous lists to be defined as a field in
>>>> some type, and we also restrict the item type to be non-nullable:
>>>> create type listType {
>>>>   "id": int64,
>>>>   "list": [int64]   // [int64?] is not possible
>>>> }
>>>>
>>>> This constraint spans both the language level and serialization. Under
>>>> that restriction, the only way to load a list that contains null values
>>>> is to make the appropriate field open (open lists are heterogeneous by
>>>> definition).
>>>>
>>>> 1) It seems like we’re missing an optimization opportunity when we are
>>>> dealing with large sparse lists. Serialization in this case might use a
>>>> bit mask to specify which items in the list are not null, and then encode
>>>> only those items.
>>>> 2) I believe that if we alter IntroduceEnforcedListTypeRule to coerce a
>>>> list into a homogeneous list with a nullable item type, we might resolve
>>>> issues
>>>> https://issues.apache.org/jira/browse/ASTERIXDB-905,
>>>> https://issues.apache.org/jira/browse/ASTERIXDB-867,
>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1131 all at once.
>>>>
>>>> Thoughts?
>>>>
>>>> Best regards,
>>>> Ildar
>>>>
>

