asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hillery <chill...@hillery.land>
Subject Re: ADM round-trip no longer possible?
Date Thu, 20 Aug 2015 07:59:29 GMT
The list wrapper was primarily introduced to make the JSON output more
sensible. I think it's important in the JSON case, because otherwise a user
would need to do some pre-parsing of the output to handle query results
with more than one instance. If that's not an issue for ADM parsing, then I
believe we could omit it when outputting ADM.

Ceej
aka Chris Hillery

On Thu, Aug 20, 2015 at 12:51 AM, Mike Carey <dtabass@gmail.com> wrote:

> I was going to also say that other thing about the "i64" case.  :-)
>
> W.r.t. the list wrapper, I prefer (a) because I don't think we should have
> the notion that the query output is a single data model instance - I kinda
> think we should have a model where the inputs and outputs can br multiple
> instances.  Otherwise we'll create a world that's potentially very
> unfriendly to streaming, etc.  If someone wants to deal with single lists,
> they can wrap the result into one.  (But they shouldn't do that if they
> want to create a load-friendly result.)  Note that this is all consistent
> with the persistence model we have, too - datasets are not really in the
> data model of instances - they are containers for (often large) collections
> of data model instances.
>
> Cheers,
> Mike
>
>
> On 8/19/15 6:47 PM, Till Westmann wrote:
>
>> I agree, that we don't need to print the 'd', but we probably also should
>> accept (and ignore) it when parsing.
>>
>> Wrt the list wrapper, I think that we've added that to the HTTP API to
>> ensure that we get a valid instance of the data model. So this would be an
>> artifact of the HTTP interface and not part of the result. And I think that
>> we shouldn't simply ignore it on bulkload, as somebody might actually want
>> to load lists (even though I don't know what to load them into right now
>> ...).
>> I think that the 2 options on this are
>> a) add an option to the HTTP interface to not wrap the output sequence in
>> a list and
>> b) add an option to the bulkload to ignore an enclosing list.
>> Right now I prefer b).
>>
>> Thoughts?
>>
>> Cheers,
>> Till
>>
>>
>>
>> On Aug 19, 2015, at 18:14, Ian Maxon <imaxon@uci.edu> wrote:
>>>
>>> Interesting. I'm not sure I see the value in the suffix, as the default
>>> is
>>> double.
>>>
>>> While we are at it in fixing this, we should remove the outer list
>>> wrapper
>>> (or fix the ADM bulkload to accept that as input). Otherwise again, one
>>> can't roundtrip.
>>>
>>> - Ian
>>>
>>> On Wed, Aug 19, 2015 at 5:53 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>>>>
>>>> In adm.grammar file, DOUBLE_LITERAL doesn't have information about "d"
>>>> suffix unlike the other numeric cases. This means that, for the real
>>>> DOUBLE
>>>> values, the output of ADM shouldn't have "d" in it? The ADoublePrinter
>>>> class is printing "d" as suffix when it generates a double output: ps
>>>> .print(ADoubleSerializerDeserializer.getDouble(b, s + 1) + "d"); Which
>>>> way
>>>> is the correct way? We need to remove "d" suffix? The adm.grammar file
>>>> assumes so.
>>>>
>>>>
>>>> INT_LITERAL          = signOrNothing(), digitSequence()
>>>>
>>>> INT8_LITERAL         = token(INT_LITERAL), string(i8)
>>>>
>>>> INT16_LITERAL        = token(INT_LITERAL), string(i16)
>>>>
>>>> INT32_LITERAL        = token(INT_LITERAL), string(i32)
>>>>
>>>> INT64_LITERAL        = token(INT_LITERAL), string(i64)
>>>>
>>>>
>>>> @EXPONENT            = caseInsensitiveChar(e), signOrNothing(),
>>>> digitSequence()
>>>>
>>>>
>>>> DOUBLE_LITERAL = signOrNothing(), char(.), digitSequence()
>>>>
>>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.),
>>>> digitSequence()
>>>>
>>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.),
>>>> digitSequence(), token(@EXPONENT)
>>>>
>>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), token(@EXPONENT)
>>>>
>>>>
>>>> FLOAT_LITERAL = token(DOUBLE_LITERAL), caseInsensitiveChar(f)
>>>>
>>>>
>>>>
>>>> Best,
>>>> Taewoo
>>>>
>>>> On Wed, Aug 19, 2015 at 3:15 PM, Mike Carey <dtabass@gmail.com> wrote:
>>>>>
>>>>> We definitely need this to work.  😃
>>>>>
>>>>>> On Aug 19, 2015 9:36 AM, "Ian Maxon" <imaxon@uci.edu> wrote:
>>>>>>
>>>>>> Yes I am just trying to bulk load from an ADM file that's the result
>>>>>> of
>>>>>> querying. It's annoying to have to mangle it with sed or similar,
as
>>>>>>
>>>>> its
>>>>
>>>>> about 4GB.
>>>>>> On Aug 19, 2015 2:41 AM, "Chris Hillery" <chillery@hillery.land>
>>>>>>
>>>>> wrote:
>>>>
>>>>> I noticed that the output wasn't re-parseable as input last week when
>>>>>>> coming up with the proposed JSON serialization. In particular,
the
>>>>>>>
>>>>>> numeric
>>>>>>
>>>>>>> suffixes like 0.0d, 15i8, 333333i32, and so on didn't parse as
AQL
>>>>>>> (although interestingly 32.5f does). I wasn't aware that it used
to
>>>>>>>
>>>>>> work,
>>>>>
>>>>>> though. It's an odd disconnect between ADM and AQL, at the least.
It
>>>>>>>
>>>>>> sounds
>>>>>>
>>>>>>> like you're seeing the same issue when parsing as actual ADM?
>>>>>>>
>>>>>>> Ceej
>>>>>>> aka Chris Hillery
>>>>>>>
>>>>>>> On Wed, Aug 19, 2015 at 1:43 AM, Ian Maxon <imaxon@uci.edu>
wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> Has ADM output from Asterix stopped being round-trippable?
I'm
>>>>>>>>
>>>>>>> trying
>>>>
>>>>> to
>>>>>>
>>>>>>> load a record I got from dumping from another instance, via the
>>>>>>>>
>>>>>>> REST
>>>>
>>>>> API,
>>>>>>
>>>>>>> requesting 'application-x-adm' in the accept header. First I
had to
>>>>>>>>
>>>>>>> remove
>>>>>>>
>>>>>>>> the outer wrapper list that was added a while back, which
I suppose
>>>>>>>>
>>>>>>> isn't
>>>>>>
>>>>>>> awful, but now it seems like I'm getting more subtle errors.
For
>>>>>>>>
>>>>>>> example,
>>>>>>
>>>>>>> trying to load a record, and the parse fails once it sees a field
>>>>>>>>
>>>>>>> like
>>>>>
>>>>>> this
>>>>>>>
>>>>>>>> in a record :
>>>>>>>>
>>>>>>>> { ... ,  "Rank": 0.0d, .... }
>>>>>>>>
>>>>>>>> With:
>>>>>>>>
>>>>>>>> Parse error at (1, 4421) expecting: <DOUBLE_CONS> <DATE_CONS>
>>>>>>>> <DATETIME_CONS> <DURATION_CONS> <DAY_TIME_DURATION_CONS>
>>>>>>>> [AdmLexerException]
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> - Ian
>>>>>>>>
>>>>>>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message