asterixdb-dev mailing list archives

From Eldon Carman <ecarm...@ucr.edu>
Subject Re: Round Tripping ADM Interval Data
Date Tue, 26 Jan 2016 22:14:45 GMT
While it's a little more work to implement up front, the following format
would be generic and would support alternate interval types in the future.
Consistency between AQL and ADM would be nice, although it is not
required.

interval(date("2012-01-01"), date("2013-04-01"))
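
To make the round-trip idea concrete, here is a minimal sketch of a printer and parser for the nested form above. It is only an illustration: the names `print_interval` and `parse_interval` are hypothetical and are not part of AsterixDB, and the endpoint values are treated as opaque strings rather than being validated as dates.

```python
import re

# Matches: interval(<type>("<value>"), <type>("<value>"))
# The endpoint values are kept as opaque strings (no date validation).
_INTERVAL = re.compile(
    r'^interval\(\s*(\w+)\("([^"]*)"\)\s*,\s*(\w+)\("([^"]*)"\)\s*\)$'
)


def print_interval(type_name, start, end):
    """Serialize an interval using the nested constructor syntax."""
    return f'interval({type_name}("{start}"), {type_name}("{end}"))'


def parse_interval(text):
    """Parse the nested syntax back into (type_name, start, end)."""
    match = _INTERVAL.match(text.strip())
    if match is None:
        raise ValueError(f"not an interval literal: {text!r}")
    start_type, start, end_type, end = match.groups()
    if start_type != end_type:
        # Assumption: both endpoints must share one type.
        raise ValueError("interval endpoints must have the same type")
    return start_type, start, end


text = print_interval("date", "2012-01-01", "2013-04-01")
print(text)
print(parse_interval(text))
```

Because the endpoint type is carried by the inner constructor, the same two helpers would cover time and datetime intervals without a separate parser per type.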

As we move to supporting generic intervals, the byte storage format will
need to be updated. Currently an interval is represented by: start (long),
end (long), type tag (byte). To support other types, the type tag should be
at the beginning of the byte sequence. This way the tag can be used to
determine the data length of each item in the interval.
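
The tag-first layout can be sketched as follows. This is not the AsterixDB implementation: the tag values and the per-type endpoint widths are assumptions chosen for illustration, and `encode_interval` / `decode_interval` are hypothetical names. The point is only that a reader can look at the first byte and know how many bytes each endpoint occupies.

```python
import struct

# Hypothetical type tags; the real tag values are not given in this thread.
TAG_DATE = 1
TAG_TIME = 2
TAG_DATETIME = 3

# Assumed endpoint encodings (big-endian): 4 bytes for date and time,
# 8 bytes for datetime. These widths are illustrative only.
POINT_FORMAT = {TAG_DATE: ">i", TAG_TIME: ">i", TAG_DATETIME: ">q"}


def encode_interval(tag, start, end):
    """Layout: type tag (1 byte), then start, then end."""
    fmt = POINT_FORMAT[tag]
    return struct.pack(">B", tag) + struct.pack(fmt, start) + struct.pack(fmt, end)


def decode_interval(buf):
    """Read the tag first, then use it to size the two endpoints."""
    tag = buf[0]
    fmt = POINT_FORMAT[tag]
    size = struct.calcsize(fmt)
    start = struct.unpack_from(fmt, buf, 1)[0]
    end = struct.unpack_from(fmt, buf, 1 + size)[0]
    total_length = 1 + 2 * size
    return tag, start, end, total_length


encoded = encode_interval(TAG_DATE, 15340, 15796)
print(len(encoded), decode_interval(encoded))
```

With the tag at the end, as in the current start/end/tag layout, a reader has to know the total length in advance before it can find the tag, which only works while every endpoint has the same fixed width.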

Should the changes to AQL and ADM include this interval storage change
(moving the type tag to the first byte of the interval storage format)?

On Tue, Jan 26, 2016 at 12:42 PM, Till Westmann <tillw@apache.org> wrote:

> That’s actually a nice and generic serialization.
> I think that we should do this similarly in ADM and AQL.
> I.e. instead of using
>
>   interval-from-date("2012-01-01", "2013-04-01")
>
> (note the two parameters) in AQL and
>
>   interval-date("2012-01-01, 2013-04-01")
>
> (note the single parameter) in ADM we should use
>
>   interval(date("2012-01-01"), date("2013-04-01"))
>
> for both. That would have a number of advantages:
>
> 1) It is consistent between AQL and ADM.
> 2) It is consistent with the JSON serialization.
> 3) It reduces the number of magic parsers.
> 4) It keeps the interval orthogonal to the type used in the interval.
>
> On 4): While we don’t support intervals of other types than date, time,
> and datetime so far, I think that we should change that and so this would
> be a good step in that direction as well.
>
> The disadvantages are
>
> 1) Incompatible AQL change
> 2) Incompatible ADM change
>
> Thoughts?
>
> Cheers,
> Till
>
>
> On 26 Jan 2016, at 11:46, Eldon Carman wrote:
>
>> I found that the lossless-JSON and clean-JSON printers were not being used.
>> After connecting them to the respective JSON printer, I ran the query
>> again.
>>
>> lossless-JSON result:
>> { "orderedlist": [ { "date-interval": { "interval": { "start": { "date":
>> "2012-01-01" }, "end": { "date": "2013-04-01" }}} }, { "time-interval": {
>> "interval": { "start": { "time": "12:23:34.456Z" }, "end": { "time":
>> "15:34:45.567Z" }}} }, { "datetime-interval": { "interval": { "start": {
>> "datetime": "2012-01-01T04:23:34.456Z" }, "end": { "datetime":
>> "2013-04-01T15:34:45.567Z" }}} } ] }
>>
>>
>> clean-JSON result:
>> [ { "date-interval": { "interval": { "start": "2012-01-01", "end":
>> "2013-04-01"}} }, { "time-interval": { "interval": { "start":
>> "12:23:34.456Z", "end": "15:34:45.567Z"}} }, { "datetime-interval": {
>> "interval": { "start": "2012-01-01T04:23:34.456Z", "end":
>> "2013-04-01T15:34:45.567Z"}} } ]
>>
>> Is this what you would have expected?
>>
>> On Mon, Jan 25, 2016 at 7:07 PM, Eldon Carman <ecarm002@ucr.edu> wrote:
>>
>>> Thanks Chris for adding a fourth option. This option would focus our
>>> updates on only the ADM output.
>>>
>>> Yes, both lossless-JSON and clean-JSON outputs would need to be checked
>>> also.
>>>
>>> On Mon, Jan 25, 2016 at 5:58 PM, Chris Hillery <chillery@hillery.land>
>>> wrote:
>>>
>>>> I would vote for:
>>>>
>>>> d. Update the serialized format to output "interval-from-date" and put
>>>> both dates in quotes.
>>>>
>>>> I like the function name interval-from-date() better, and I don't think
>>>> there's any need to maintain backwards compatibility with the old name
>>>> which clearly never worked.
>>>>
>>>> Couple thoughts, though: The serialized format really should be "ADM",
>>>> not "AQL". As such I don't think it should reference functions at all. We
>>>> already do this for many datatypes, such as uuid("...") and
>>>> datetime("..."). Are those truly "Functions"? Are they "constructors",
>>>> and is that different? In any case, the answer for interval types should
>>>> be consistent with that.
>>>>
>>>> Final note: quite possibly the lossless-JSON and clean-JSON outputs for
>>>> intervals are broken as well, and should be fixed.
>>>>
>>>> Ceej
>>>> aka Chris Hillery
>>>>
>>>> On Mon, Jan 25, 2016 at 5:36 PM, Till Westmann <tillw@apache.org>
>>>> wrote:
>>>>
>>>>> Voting for a. Seems to be the least redundant option.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>>
>>>>> On 25 Jan 2016, at 16:47, Eldon Carman wrote:
>>>>>
>>>>>> The interval field value printed in the ADM results cannot be used to
>>>>>> create an interval.
>>>>>>
>>>>>> Intervals have several functions that are used to construct an
>>>>>> interval: interval-from-date/time/datetime
>>>>>> and interval-start-from-date/time/datetime. It appears that this is
>>>>>> the only way to create an interval. Thus, a user must use one of these
>>>>>> functions to create an interval.
>>>>>>
>>>>>> The following query shows how to create three intervals.
>>>>>>
>>>>>> Query:
>>>>>> let $di := {"date-interval": interval-from-date("2012-01-01",
>>>>>> "2013-04-01")}
>>>>>> let $ti := {"time-interval": interval-from-time("12:23:34.456Z",
>>>>>> "233445567+0800")}
>>>>>> let $dti := {"datetime-interval":
>>>>>> interval-from-datetime("2012-01-01T12:23:34.456+08:00",
>>>>>> "20130401T153445567Z")}
>>>>>> return [$di, $ti, $dti];
>>>>>>
>>>>>> Result:
>>>>>> [ { "date-interval": interval-date("2012-01-01, 2013-04-01") }, {
>>>>>> "time-interval": interval-time("12:23:34.456Z, 15:34:45.567Z") }, {
>>>>>> "datetime-interval": interval-datetime("2012-01-01T04:23:34.456Z,
>>>>>> 2013-04-01T15:34:45.567Z") } ]
>>>>>>
>>>>>> Notice the results show interval-date("date, date"), which is different
>>>>>> from the functions that are used to create a date interval. Notice that
>>>>>> interval-date does not exist in AsterixDB and that the input is a
>>>>>> single string of dates separated by a comma. Below are some ideas on
>>>>>> how to create a round-trip for intervals.
>>>>>>
>>>>>> Options for round tripping:
>>>>>> a: Rename "interval-from-date" to "interval-date" and update the
>>>>>> output to put both dates in quotes.
>>>>>> b: Add an alias for "interval-from-date" to "interval-date" and update
>>>>>> the output to put both dates in quotes.
>>>>>> c: Create an interval date constructor (called interval-date) that can
>>>>>> parse the string "date, date".
>>>>>>
>>>>>> The same process should be used for intervals with time and datetime.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>>
>>>>>
>>>
>>>
