asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Carey <dtab...@gmail.com>
Subject Re: Round Tripping ADM Interval Data
Date Mon, 01 Feb 2016 00:41:19 GMT
Removal seems fine/cleaner to me...!

On 1/31/16 10:19 AM, Eldon Carman wrote:
> I submitted a code review for the external (phase 1) changes. The change
> includes a interval constructor that takes two arguments of the same type.
> For example:
>
> let $v1 := interval(date("2013-01-01"), date("20130505"))
> let $v2 := interval(time("00:01:01"), time("213901049+0800"))
> let $v3 := interval(datetime("2013-01-01T00:01:01"),
> datetime("20130505T213901049+0800"))
> return { "v1": $v1, "v2": $v2, "v3": $v3 }
>
>
> Do we want to keep the previous functions for declaring an interval? The
> these functions are unique for each interval type and support string
> arguments in addition to the specific type (date, time, datetime).
>
> let $v1 := interval-from-date(date("2013-01-01"), date("20130505"))
> let $v2 := interval-from-time(time("00:01:01"), time("213901049+0800"))
> let $v3 := interval-from-datetime(datetime("2013-01-01T00:01:01"),
> datetime("20130505T213901049+0800"))
> return { "v1": $v1, "v2": $v2, "v3": $v3 }
>
> They could be helpful, but I am don't think they are necessary. Removing
> them would consolidate the code path for intervals. Thoughts?
>
>
> On Fri, Jan 29, 2016 at 12:15 AM, Mike Carey <dtabass@gmail.com> wrote:
>> +1 on all points below!  :-)
>>
>>
>> On 1/28/16 11:00 PM, Till Westmann wrote:
>>> Sounds good to me. It will change the binary representation of
> intervals, so it’s not backwards compatible.
>>> But it seems that the tag-first representation is the way it should have
> been anyway and it’s a better way to go forward to support more generic
> intervals. (And I prefer layout changes rather now than later ..)
>>> Cheers,
>>> Till
>>>
>>> On 28 Jan 2016, at 21:46, Eldon Carman wrote:
>>>
>>>> Based on the discussion, I have suggestion for a two phase approach to
>>>> support generic intervals.
>>>>
>>>>
>>>> Phase 1: External
>>>> The first phase will focus on areas seen by a user, either though ADM or
>>>> AQL. The new format will also be open to supporting other interval
> types,
>>>> although currently will only support date, time and datetime. Here are a
>>>> few of the action items:
>>>>
>>>> - ADM printer and parser changed to support the new generic style format
>>>>   - interval(date("2012-01-01”), date(”2013-04-01”))
>>>> - Add an interval AQL constructor to support the above format
>>>> - Alter interval byte structure to support any interval type
>>>> - byte tag, T start, T end
>>>> - where T is currently only date, time and datetime
>>>>
>>>> I created a ticket to track status for Phase 1:
>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1281
>>>>
>>>>
>>>> Phase 2: Internal
>>>> Phase two will focus on items under the hood. These items are only seen
> by
>>>> a developer and need to be update to support new interval types. The
>>>> complete task list will require some investigation. In addition to
> picking
>>>> what types of intervals should be supported.
>>>>
>>>> Phase 1 can start immediately while Phase 2 can wait until needed.
>>>>
>>>> Thoughts?
>>>>
>>>>
>>>>
>>>> On Tue, Jan 26, 2016 at 2:14 PM, Eldon Carman <ecarm002@ucr.edu> wrote:
>>>>
>>>>> While its a little more work to implement up front, the following
> format
>>>>> would be generic and support alternate interval types in the future.
It
>>>>> would be nice to have consistency between AQL and ADM, although not
>>>>> required.
>>>>>
>>>>> interval(date("2012-01-01”), date(”2013-04-01”))
>>>>>
>>>>> As we move to supporting generic intervals, the byte storage format
> will
>>>>> need to be updated. Currently an interval is represented by: start
> (long),
>>>>> end (long), type tag (byte). To support other types, the type tag
> should be
>>>>> at the beginning of the byte sequence. This way the tag can be used to
>>>>> determine the data length of each item in the interval.
>>>>>
>>>>> Should the changes to AQL and ADM include this interval storage change
>>>>> (moving the type tag to the first byte of the interval storage format)?
>>>>>
>>>>> On Tue, Jan 26, 2016 at 12:42 PM, Till Westmann <tillw@apache.org>
> wrote:
>>>>>> That’s actually a nice and generic serialization.
>>>>>> I think that we should do this similarly in ADM and AQL.
>>>>>> I.e. instead of using
>>>>>>
>>>>>> interval-from-date("2012-01-01”, ”2013-04-01”)
>>>>>>
>>>>>> (note the two parameters) in AQL and
>>>>>>
>>>>>> interval-date("2012-01-01, 2013-04-01")
>>>>>>
>>>>>> (not the single parameter) in ADM we should use
>>>>>>
>>>>>> interval(date("2012-01-01”), date(”2013-04-01”))
>>>>>>
>>>>>> for both. That would have a number of advantages:
>>>>>>
>>>>>> 1) It is consistent between AQL and ADM.
>>>>>> 2) It is consistent with the JSON serialization.
>>>>>> 3) It reduces the number of magic parsers.
>>>>>> 4) It keeps the interval orthogonal to the type used in the interval.
>>>>>>
>>>>>> On 4): While we don’t support intervals of other types than date,
> time,
>>>>>> and datetime so far, I think that we should change that and so this
> would
>>>>>> be a good step in that direction as well.
>>>>>>
>>>>>> The disadvantages are
>>>>>>
>>>>>> 1) Incompatible AQL change
>>>>>> 2) Incompatible ADM change
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Cheers,
>>>>>> Till
>>>>>>
>>>>>>
>>>>>> On 26 Jan 2016, at 11:46, Eldon Carman wrote:
>>>>>>
>>>>>> I found that the lossless-JSON and clean-JSON printers were not being
>>>>>>> used.
>>>>>>> After connecting them to the respective JSON printer, I ran the
query
>>>>>>> again.
>>>>>>>
>>>>>>> lossless-JSON result:
>>>>>>> { "orderedlist": [ { "date-interval": { "interval": { "start":
{
> "date":
>>>>>>> "2012-01-01" }, "end": { "date": "2013-04-01" }}} }, {
> "time-interval": {
>>>>>>> "interval": { "start": { "time": "12:23:34.456Z" }, "end": {
"time":
>>>>>>> "15:34:45.567Z" }}} }, { "datetime-interval": { "interval": {
> "start": {
>>>>>>> "datetime": "2012-01-01T04:23:34.456Z" }, "end": { "datetime":
>>>>>>> "2013-04-01T15:34:45.567Z" }}} } ] }
>>>>>>>
>>>>>>>
>>>>>>> clean-JSON result:
>>>>>>> [ { "date-interval": { "interval": { "start": "2012-01-01", "end":
>>>>>>> "2013-04-01"}} }, { "time-interval": { "interval": { "start":
>>>>>>> "12:23:34.456Z", "end": "15:34:45.567Z"}} }, { "datetime-interval":
{
>>>>>>> "interval": { "start": "2012-01-01T04:23:34.456Z", "end":
>>>>>>> "2013-04-01T15:34:45.567Z"}} } ]
>>>>>>>
>>>>>>> Is this what you would have expected?
>>>>>>>
>>>>>>> On Mon, Jan 25, 2016 at 7:07 PM, Eldon Carman <ecarm002@ucr.edu>
> wrote:
>>>>>>>> Thanks Chris for adding a fourth option. This option would
focus our
>>>>>>>>
>>>>>>> updates to only the ADM output.
>>>>>>>
>>>>>>>> Yes, both lossless-JSON and clean-JSON outputs would need
to be
> check
>>>>>>> also.
>>>>>>>
>>>>>>>> On Mon, Jan 25, 2016 at 5:58 PM, Chris Hillery
> <chillery@hillery.land>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>> I would vote for:
>>>>>>>>>
>>>>>>>>> d. Update the serialized format to output "interval-from-date"
and
> put
>>>>>>>> both
>>>>>>>
>>>>>>>> dates in quotes.
>>>>>>>>>
>>>>>>>>> I like the function name interval-from-date() better,
and I don't
> think
>>>>>>>>> there's any need to maintain backwards compatibility
with the old
> name
>>>>>>>>> which clearly never worked.
>>>>>>>>>
>>>>>>>>> Couple thoughts, though: The serialized format really
should be
> "ADM",
>>>>>>>> not
>>>>>>>
>>>>>>>> "AQL". As such I don't think it should reference functions
at all.
> We
>>>>>>>>> already do this for many datatypes, such as uuid("...")
and
>>>>>>>>> datetime("..."). Are those truly "Functions"? Are they
> "constructors",
>>>>>>>> and
>>>>>>>
>>>>>>>> is that different? In any case, the answer for interval types
> should be
>>>>>>>>> consistent with that.
>>>>>>>>>
>>>>>>>>> Final note: quite possibly the lossless-JSON and clean-JSON
> outputs for
>>>>>>>>> intervals are broken as well, and should be fixed.
>>>>>>>>>
>>>>>>>>> Ceej
>>>>>>>>> aka Chris Hillery
>>>>>>>>>
>>>>>>>>> On Mon, Jan 25, 2016 at 5:36 PM, Till Westmann <tillw@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Voting for a. Seems to be the least redundant option.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 25 Jan 2016, at 16:47, Eldon Carman wrote:
>>>>>>>>>>
>>>>>>>>>> The interval field value printed in the ADM results
can not be
> used to
>>>>>>>>>>> create an interval.
>>>>>>>>>>>
>>>>>>>>>>> Intervals have several functions that are used
to construct an
>>>>>>>>>>>
>>>>>>>>>> interval:
>>>>>>>
>>>>>>>> interval-from-date/time/datetime
>>>>>>>>>>> and interval-start-from-date/time/datetime. It
appears that this
> is
>>>>>>>>>> the
>>>>>>>
>>>>>>>> only way to create an interval. Thus, a user must use one
of these
>>>>>>>>>>> function
>>>>>>>>>>> to create an interval.
>>>>>>>>>>>
>>>>>>>>>>> The following query shows how to create three
intervals.
>>>>>>>>>>>
>>>>>>>>>>> Query:
>>>>>>>>>>> let $di := {"date-interval": interval-from-date("2012-01-01",
>>>>>>>>>>> "2013-04-01")}
>>>>>>>>>>> let $ti := {"time-interval": interval-from-time("12:23:34.456Z",
>>>>>>>>>>> "233445567+0800")}
>>>>>>>>>>> let $dti := {"datetime-interval":
>>>>>>>>>>> interval-from-datetime("2012-01-01T12:23:34.456+08:00",
>>>>>>>>>>> "20130401T153445567Z")}
>>>>>>>>>>> return [$di, $ti, $dti];
>>>>>>>>>>>
>>>>>>>>>>> Result:
>>>>>>>>>>> { "date-interval": interval-date("2012-01-01,
2013-04-01") }, {
>>>>>>>>>>> "time-interval": interval-time("12:23:34.456Z,
15:34:45.567Z")
> }, {
>>>>>>>>>>> "datetime-interval": interval-datetime("2012-01-01T04:23:34.456Z,
>>>>>>>>>>> 2013-04-01T15:34:45.567Z") } ]
>>>>>>>>>>>
>>>>>>>>>>> Notice the results show interval-date("date,
date") which is
>>>>>>>>>>> different
>>>>>>>>>>> than
>>>>>>>>>>> the functions that are used to create a date
interval. Notice
> that
>>>>>>>>>>> interval-date does not exists in AsterixDB and
that the input is
> a
>>>>>>>>>> single
>>>>>>>
>>>>>>>> string of dates separated by a comma. Below are some ideas
on how to
>>>>>>>>>>> create
>>>>>>>>>>> a round-trip for intervals.
>>>>>>>>>>>
>>>>>>>>>>> Options for round tripping:
>>>>>>>>>>> a: Rename "interval-from-date" to "interval-date"
and update the
>>>>>>>>>>>
>>>>>>>>>> output to
>>>>>>>
>>>>>>>> put both dates in quotes.
>>>>>>>>>>> b: Add alias for "interval-from-date" to "interval-date"
and
> update
>>>>>>>>>> the
>>>>>>>
>>>>>>>> output to put both dates in quotes.
>>>>>>>>>>> c: Create an interval date constructor (called
interval-date)
> that
>>>>>>>>>>> can
>>>>>>>>>>> parse the string "date, date".
>>>>>>>>>>>
>>>>>>>>>>> The same process should be used for intervals
with time and
> datetime.
>>>>>>>>>>> Thoughts?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message