asterixdb-dev mailing list archives

From "Cameron Samak" <csa...@apache.org>
Subject RE: ADM round-trip no longer possible?
Date Thu, 12 Nov 2015 05:28:41 GMT
While in the general case having the JSON list wrapper is much more useful, the client I'm
working on has to unbox the result before returning the value when it knows the query must
return 0 or 1 value.

The generated queries that fit this category are usually aggregates or limit 1 queries, and
since the client is type safe, it cannot return a collection. In my opinion limit 1 behavior
shouldn't be different than a limit 5 query that has one result, but it might be worth thinking
about different behavior for aggregates. Different behavior for an end user of other clients
is common in this scenario (clients with some type safety tend to handle it transparently,
while many others have both "executeQuery" and "executeScalar" equivalents). I haven't looked
into whether this is typically handled on the database side or in the client libraries...
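To make the two client entry points concrete, here is a minimal sketch of how a client might unbox the list wrapper. The names execute_query and execute_scalar are hypothetical, modeled on the "executeQuery"/"executeScalar" pattern above, and are not part of any AsterixDB API; the response body is assumed to be a JSON array.

```python
import json
from typing import Any, List, Optional


def execute_query(raw_response: str) -> List[Any]:
    """Return every result instance (assumes the body is a JSON array)."""
    results = json.loads(raw_response)
    if not isinstance(results, list):
        raise ValueError("expected the list wrapper around the results")
    return results


def execute_scalar(raw_response: str) -> Optional[Any]:
    """Unbox a result that must hold zero or one value."""
    results = execute_query(raw_response)
    if len(results) > 1:
        raise ValueError(f"expected at most one value, got {len(results)}")
    # An empty wrapper maps to None; a single element is returned unboxed.
    return results[0] if results else None
```

With this split, a limit 1 query and an aggregate both go through execute_scalar, and the wrapper never reaches type-safe calling code.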

If I understand the streaming concern properly, I think that's equally easy to do with the
list wrapper in JSON. Some libraries return some form of iterable type which does not require
holding the entire array in memory.
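As a rough illustration of that point, the sketch below yields the elements of a wrapped JSON array one at a time while reading the input in small chunks. It uses only Python's standard json module; iter_array_items is a hypothetical helper name, and the parser is deliberately permissive about separators, so treat it as a sketch rather than a validating parser.

```python
import io
import json
from typing import Any, Iterator, TextIO


def iter_array_items(stream: TextIO, chunk_size: int = 4096) -> Iterator[Any]:
    """Yield the elements of one top-level JSON array without loading it whole."""
    decoder = json.JSONDecoder()
    buf = ""
    eof = False

    def fill() -> None:
        nonlocal buf, eof
        chunk = stream.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            eof = True

    # Find the opening bracket of the wrapper.
    while True:
        buf = buf.lstrip()
        if buf:
            break
        if eof:
            raise ValueError("empty input")
        fill()
    if buf[0] != "[":
        raise ValueError("expected '[' at start of result")
    buf = buf[1:]

    while True:
        # Permissive: skip whitespace and element separators.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return
        if buf:
            try:
                item, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                if eof:
                    raise
            else:
                rest = buf[end:].lstrip()
                # Accept the item only once its delimiter has arrived, so a
                # number split across two chunks is not yielded too early.
                if rest[:1] in (",", "]"):
                    yield item
                    buf = rest
                    continue
                if eof:
                    raise ValueError("unexpected data after array element")
        elif eof:
            raise ValueError("unterminated array")
        fill()
```

Only the current element plus at most one partially read chunk is buffered, so memory use depends on the largest single element and not on the size of the whole result.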

Cameron

-----Original Message-----
From: Mike Carey [mailto:dtabass@gmail.com] 
Sent: Thursday, August 20, 2015 6:52 AM
To: dev@asterixdb.incubator.apache.org
Subject: Re: ADM round-trip no longer possible?

Ah, interesting.  I was envisioning result consumption as a loop on an input character stream
... While not empty (input) consume next JSON object.  This could then swallow - in a streaming
way - an arbitrary size result set - without a gargantuan result object having to be closed
before consumption.  Would that not be cleaner, or are there software library and layering
issues that would interfere with that simplistic view?
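
The loop described above ("while not empty(input), consume next JSON object") could look roughly like the following sketch, which reads unwrapped, back-to-back JSON values from a character stream. It relies only on Python's standard json module; iter_json_values is a hypothetical name, and the chunked reading is an assumption about how the client receives the response.

```python
import io
import json
from typing import Any, Iterator, TextIO


def iter_json_values(stream: TextIO, chunk_size: int = 4096) -> Iterator[Any]:
    """Yield consecutive top-level JSON values from a stream, one at a time."""
    decoder = json.JSONDecoder()
    buf = ""
    eof = False
    while True:
        buf = buf.lstrip()
        if buf:
            try:
                item, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                # Probably a value cut off at the chunk boundary; at end of
                # input it is a genuine syntax error.
                if eof:
                    raise
            else:
                # Objects, arrays and strings end with their own closing
                # character; a bare number or literal is only known to be
                # complete once whitespace or end of input follows it.
                closed = buf[end - 1] in '}]"'
                if closed or eof or buf[end:end + 1].isspace():
                    yield item
                    buf = buf[end:]
                    continue
        elif eof:
            return
        chunk = stream.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            eof = True
```

Each instance can be handed to the consumer as soon as it is complete, so no enclosing result object has to be closed first.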
On Aug 20, 2015 1:00 AM, "Chris Hillery" <chillery@hillery.land> wrote:

> The list wrapper was primarily introduced to make the JSON output more 
> sensible. I think it's important in the JSON case, because otherwise a 
> user would need to do some pre-parsing of the output to handle query 
> results with more than one instance. If that's not an issue for ADM 
> parsing, then I believe we could omit it when outputting ADM.
>
> Ceej
> aka Chris Hillery
>
> On Thu, Aug 20, 2015 at 12:51 AM, Mike Carey <dtabass@gmail.com> wrote:
>
> > I was going to also say that other thing about the "i64" case.  :-)
> >
> > W.r.t. the list wrapper, I prefer (a) because I don't think we should have
> > the notion that the query output is a single data model instance - I kinda
> > think we should have a model where the inputs and outputs can be multiple
> > instances.  Otherwise we'll create a world that's potentially very
> > unfriendly to streaming, etc.  If someone wants to deal with single lists,
> > they can wrap the result into one.  (But they shouldn't do that if they
> > want to create a load-friendly result.)  Note that this is all consistent
> > with the persistence model we have, too - datasets are not really in the
> > data model of instances - they are containers for (often large) collections
> > of data model instances.
> >
> > Cheers,
> > Mike
> >
> >
> > On 8/19/15 6:47 PM, Till Westmann wrote:
> >
> >> I agree, that we don't need to print the 'd', but we probably also should
> >> accept (and ignore) it when parsing.
> >>
> >> Wrt the list wrapper, I think that we've added that to the HTTP API to
> >> ensure that we get a valid instance of the data model. So this would be an
> >> artifact of the HTTP interface and not part of the result. And I think that
> >> we shouldn't simply ignore it on bulkload, as somebody might actually want
> >> to load lists (even though I don't know what to load them into right now
> >> ...).
> >> I think that the 2 options on this are
> >> a) add an option to the HTTP interface to not wrap the output sequence in
> >>    a list and
> >> b) add an option to the bulkload to ignore an enclosing list.
> >> Right now I prefer b).
> >>
> >> Thoughts?
> >>
> >> Cheers,
> >> Till
> >>
> >>
> >>
> >> On Aug 19, 2015, at 18:14, Ian Maxon <imaxon@uci.edu> wrote:
> >>>
> >>> Interesting. I'm not sure I see the value in the suffix, as the 
> >>> default is double.
> >>>
> >>> While we are at it in fixing this, we should remove the outer list 
> >>> wrapper (or fix the ADM bulkload to accept that as input). 
> >>> Otherwise again, one can't roundtrip.
> >>>
> >>> - Ian
> >>>
> >>> On Wed, Aug 19, 2015 at 5:53 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
> >>>>
> >>>> In adm.grammar file, DOUBLE_LITERAL doesn't have information about "d"
> >>>> suffix unlike the other numeric cases. This means that, for the real
> >>>> DOUBLE values, the output of ADM shouldn't have "d" in it? The
> >>>> ADoublePrinter class is printing "d" as suffix when it generates a
> >>>> double output:
> >>>>
> >>>> ps.print(ADoubleSerializerDeserializer.getDouble(b, s + 1) + "d");
> >>>>
> >>>> Which way is the correct way? We need to remove "d" suffix? The
> >>>> adm.grammar file assumes so.
> >>>>
> >>>>
> >>>> INT_LITERAL          = signOrNothing(), digitSequence()
> >>>>
> >>>> INT8_LITERAL         = token(INT_LITERAL), string(i8)
> >>>>
> >>>> INT16_LITERAL        = token(INT_LITERAL), string(i16)
> >>>>
> >>>> INT32_LITERAL        = token(INT_LITERAL), string(i32)
> >>>>
> >>>> INT64_LITERAL        = token(INT_LITERAL), string(i64)
> >>>>
> >>>>
> >>>> @EXPONENT            = caseInsensitiveChar(e), signOrNothing(), digitSequence()
> >>>>
> >>>>
> >>>> DOUBLE_LITERAL = signOrNothing(), char(.), digitSequence()
> >>>>
> >>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.), digitSequence()
> >>>>
> >>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.), digitSequence(), token(@EXPONENT)
> >>>>
> >>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), token(@EXPONENT)
> >>>>
> >>>>
> >>>> FLOAT_LITERAL = token(DOUBLE_LITERAL), caseInsensitiveChar(f)
> >>>>
> >>>>
> >>>>
> >>>> Best,
> >>>> Taewoo
> >>>>
> >>>> On Wed, Aug 19, 2015 at 3:15 PM, Mike Carey <dtabass@gmail.com> wrote:
> >>>>>
> >>>>> We definitely need this to work.  😃
> >>>>>
> >>>>> On Aug 19, 2015 9:36 AM, "Ian Maxon" <imaxon@uci.edu> wrote:
> >>>>>>
> >>>>>> Yes I am just trying to bulk load from an ADM file that's the result
> >>>>>> of querying. It's annoying to have to mangle it with sed or similar,
> >>>>>> as it's about 4GB.
> >>>>>>
> >>>>>> On Aug 19, 2015 2:41 AM, "Chris Hillery" <chillery@hillery.land> wrote:
> >>>>>>>
> >>>>>>> I noticed that the output wasn't re-parseable as input last week
> >>>>>>> when coming up with the proposed JSON serialization. In particular,
> >>>>>>> the numeric suffixes like 0.0d, 15i8, 333333i32, and so on didn't
> >>>>>>> parse as AQL (although interestingly 32.5f does). I wasn't aware
> >>>>>>> that it used to work, though. It's an odd disconnect between ADM
> >>>>>>> and AQL, at the least. It sounds like you're seeing the same issue
> >>>>>>> when parsing as actual ADM?
> >>>>>>>
> >>>>>>> Ceej
> >>>>>>> aka Chris Hillery
> >>>>>>>
> >>>>>>> On Wed, Aug 19, 2015 at 1:43 AM, Ian Maxon <imaxon@uci.edu> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>> Has ADM output from Asterix stopped being round-trippable? I'm
> >>>>>>>> trying to load a record I got from dumping from another instance,
> >>>>>>>> via the REST API, requesting 'application-x-adm' in the accept
> >>>>>>>> header. First I had to remove the outer wrapper list that was
> >>>>>>>> added a while back, which I suppose isn't awful, but now it seems
> >>>>>>>> like I'm getting more subtle errors. For example, trying to load
> >>>>>>>> a record, and the parse fails once it sees a field like this in a
> >>>>>>>> record :
> >>>>>>>>
> >>>>>>>> { ... ,  "Rank": 0.0d, .... }
> >>>>>>>>
> >>>>>>>> With:
> >>>>>>>>
> >>>>>>>> Parse error at (1, 4421) expecting: <DOUBLE_CONS> <DATE_CONS>
> >>>>>>>> <DATETIME_CONS> <DURATION_CONS> <DAY_TIME_DURATION_CONS>
> >>>>>>>> [AdmLexerException]
> >>>>>>>>
> >>>>>>>> Any ideas?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> - Ian
> >>>>>>>>
> >
>

