asterixdb-dev mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: The "real" ADM format
Date Wed, 08 Jun 2016 20:35:43 GMT
I guess I don't view round-trippability the same way, then. All it means to
me is that I can scan/output the data, load it, and end up with the same
thing, not necessarily that I can load it without specifying the types and
still get them back because they're inlined in the data. If we want that, I
think the better approach would be something like mysqldump (i.e., it dumps
the metadata/types as an equivalent query, basically). Also, if we changed
the format to conflict with the existing output of SocialGen, we'd have
issues with current experiments and reproducing old results.

On Wed, Jun 8, 2016 at 1:17 PM, Chris Hillery <chillery@hillery.land> wrote:

> I think the answer there is "round-tripability", right? ADM is meant to
> exactly describe the data so that it can be reloaded in the same way it
> was. Someone correct me if that isn't a requirement of the format...
>
> Ceej
> On Jun 8, 2016 9:14 AM, "Ian Maxon" <imaxon@uci.edu> wrote:
>
> > Why should the type be intermingled with the data though when it isn't
> > strictly necessary? For example why do I care if someone used an int64 to
> > wrap something I know is actually a short integer, and so on. It also kind
> > of gets rid of the idea of ADM being a superset of JSON.
> >
> > On Tue, Jun 7, 2016 at 10:49 PM, Preston Carman <prestonc@apache.org>
> > wrote:
> >
> > > The interval type format has been finalized and is the same for AQL
> > > and ADM. Below is an example of the format:
> > >
> > > interval(date("01-01-2011"), date("02-02-2012"))
> > >
> > > The interval constructor now uses other data type constructors to
> > > recreate an interval. The type of interval is defined by the two
> > > matching arguments.
> > >
> > >
> > > On Tue, Jun 7, 2016 at 9:36 PM, Chris Hillery <chillery@hillery.land>
> > > wrote:
> > > > Ah, the other thing I forgot to mention is that I didn't include interval
> > > > types, because I'm not sure about their current status. There was some
> > > > discussion on the list in January (subject "Round Tripping ADM Interval
> > > > Data") but I'm not sure where it ended up as far as the form of the
> > > > constructors, and whether that was AQL or ADM or both.
> > > >
> > > > Ceej
> > > > aka Chris Hillery
> > > >
> > > > On Tue, Jun 7, 2016 at 9:34 PM, Chris Hillery <chillery@hillery.land>
> > > > wrote:
> > > >
> > > > >> I started to create the current inventory of types, with the forms
> > > > >> accepted / produced by the ADM parser, AQL parser, and ADM serialization.
> > > > >> (I think we all agree that ADM parser and ADM serializer should be 100%
> > > > >> compatible.) Here it is:
> > > >>
> > > >>
> > > >>
> > >
> >
> https://docs.google.com/spreadsheets/d/1-11a9ETV1Bdh_bUm9_CszY4hEGJGbEBaVKUWrzeS-As/edit?usp=sharing
> > > >>
> > > > >> I know this is not comprehensive (for instance, I'm pretty sure that a
> > > > >> naked integer will be parsed by both ADM and AQL as an int64, so that form
> > > > >> should be listed as an alternative) and I haven't verified that the AQL
> > > > >> parser forms in particular are accurate, but I think it's close. I've set
> > > > >> it so anyone can edit that document, so please fill in the gaps if you know
> > > > >> of any.
> > > >>
> > > > >> We should also fill in the exact accepted forms for the various derived
> > > > >> types like the datetime, spatial, hex, and UUID types - e.g., the valid
> > > > >> forms of the double-quoted string in the duration() constructor are as
> > > > >> specified by XML Schema, and so on.
> > > >>
> > > >> Ceej
> > > >> aka Chris Hillery
> > > >>
> > > > >> On Tue, Jun 7, 2016 at 8:53 PM, Chris Hillery <chillery@hillery.land>
> > > > >> wrote:
> > > >>
> > > > >>> If it's possible, I think it would be least confusing if the serialized
> > > > >>> ADM format was identical to the corresponding data constructors in AQL. It
> > > > >>> should be a goal IMHO that you can cut-and-paste an ADM file into the query
> > > > >>> box in the web UI and the result would be the same as loading the .adm.
> > > >>>
> > > > >>> For more specifics, I think we need to write out for each data type what
> > > > >>> the current ADM and AQL formats are, and then pick a final answer for the
> > > > >>> type (which may possibly be different from either of the current forms,
> > > > >>> although I suspect not). That will be the spec, and we can update the two
> > > > >>> parsers (and all the test cases) accordingly.
> > > >>>
> > > > >>> I started an email thread sometime last year about something similar; I
> > > > >>> think it was about JSON serialization, but it at least had the AQL side of
> > > > >>> this story for all simple types, I believe.
> > > >>>
> > > >>> Ceej
> > > >>> aka Chris Hillery
> > > >>> On Jun 7, 2016 8:17 PM, "Ian Maxon" <imaxon@uci.edu> wrote:
> > > >>>
> > > >>>> Hi all,
> > > > >>>> After my experience with having to fix a rather large ADM file dump from
> > > > >>>> a query to make it load back into the system I was compelled to try my
> > > > >>>> hand at making that not happen again. The first thing I tried my hand at
> > > > >>>> was basically what I did to make the file loadable but inside the type
> > > > >>>> printers; just remove all of the 'i32' and so on suffixes, as well as
> > > > >>>> making decimals not formatted in scientific notation. This is pretty easy
> > > > >>>> to do as well, not a huge change code-wise (but obviously I'll have to
> > > > >>>> fix all of the tests).
> > > >>>>
> > > > >>>> This got me to think though, which is the format that we actually want?
> > > > >>>> The current format that is output, or the format that we accept in the
> > > > >>>> loader? Since this is actually perhaps a language level change either way
> > > > >>>> I figured I should find consensus before spending more time on it.
> > > >>>>
> > > >>>> Thoughts/comments are appreciated.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> - Ian
> > > >>>>
> > > >>>
> > > >>
> > >
> >
>
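
For readers who want to see the kind of dump cleanup described at the start
of this thread (dropping the 'i32'-style suffixes and expanding scientific
notation so a dump loads again), here is a minimal Python sketch. It is not
the type-printer change itself. The suffix set (i8, i16, i32, i64), the
function name strip_type_decorations, and the sample record are assumptions
made for illustration only; double-quoted strings are deliberately left
untouched.

```python
import re
from decimal import Decimal

# Hypothetical token pattern (an assumption, not the ADM grammar):
#   1. a double-quoted string, which is passed through unchanged;
#   2. an integer followed by an assumed width suffix (i8/i16/i32/i64);
#   3. a number written in scientific notation.
_TOKEN = re.compile(
    r'(?P<string>"(?:[^"\\]|\\.)*")'
    r'|(?P<int>(?<![\w.])-?\d+)i(?:8|16|32|64)\b'
    r'|(?P<sci>(?<![\w.])-?\d+(?:\.\d+)?[eE][+-]?\d+)\b'
)


def strip_type_decorations(adm_text: str) -> str:
    """Remove integer width suffixes and expand scientific notation."""

    def replace(match: re.Match) -> str:
        if match.group("string") is not None:
            # Never rewrite anything inside a string literal.
            return match.group("string")
        if match.group("int") is not None:
            # Keep the digits, drop the suffix.
            return match.group("int")
        # Expand e.g. 1.5E3 into plain positional notation.
        return format(Decimal(match.group("sci")), "f")

    return _TOKEN.sub(replace, adm_text)


# Made-up sample record, for illustration only.
sample = '{ "id": 7i32, "name": "x9i32", "score": 1.5E3, "big": 12i64 }'
print(strip_type_decorations(sample))
```

Note that expanding 1.5E3 to 1500 drops the decimal point, so whether the
loader would then read the value as an integer or a double is exactly the
kind of question the thread is debating. The sketch takes no position on it.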
