Mailing-List: contact dev-help@asterixdb.incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@asterixdb.incubator.apache.org
Message-ID: <55D58710.50407@gmail.com>
Date: Thu, 20 Aug 2015 00:51:44 -0700
From: Mike Carey <dtabass@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: dev@asterixdb.incubator.apache.org
Subject: Re: ADM round-trip no longer possible?
References: 
 <CAN_YF5w5axozN0XBmEMVnx8-nHwfSV22P7ieaxgqJNS4h+Jswg@mail.gmail.com>
 <CAHWyyRzBHqACRzBTxoe9YJ7_+Y2nHF-Ft9q28TMEM9i4QzK7xA@mail.gmail.com>
 <CAN_YF5zn21X6t7OsYkBv2CjUkt1jmvD1dAPpiNSbQ=9PbM+5aw@mail.gmail.com>
 <CAEEsEsTdwsgCuvkS-2wGaRyz9wVr__OHL+dS5MvY3Qn3cjUerg@mail.gmail.com>
 <CABytAGp3U=7gJWWTsy3hn61ZACahyOL-aX6-OCPPqQ9-jDLnyQ@mail.gmail.com>
 <CAN_YF5zt0oYt0MtZfrwoKXHGX--aN8Zo+cQTZy3-SLdfLayw_g@mail.gmail.com>
 <D0EF2293-916E-4033-9A93-5C81D9E9D42F@apache.org>
In-Reply-To: <D0EF2293-916E-4033-9A93-5C81D9E9D42F@apache.org>
Content-Type: multipart/alternative;
 boundary="------------020208050408000405060403"

--------------020208050408000405060403
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

I was going to also say that other thing about the "i64" case.  :-)

W.r.t. the list wrapper, I prefer (a) because I don't think we should 
have the notion that the query output is a single data model instance - 
I kinda think we should have a model where the inputs and outputs can br 
multiple instances.  Otherwise we'll create a world that's potentially 
very unfriendly to streaming, etc.  If someone wants to deal with single 
lists, they can wrap the result into one.  (But they shouldn't do that 
if they want to create a load-friendly result.)  Note that this is all 
consistent with the persistence model we have, too - datasets are not 
really in the data model of instances - they are containers for (often 
large) collections of data model instances.

Cheers,
Mike

On 8/19/15 6:47 PM, Till Westmann wrote:
> I agree, that we don't need to print the 'd', but we probably also should accept (and ignore) it when parsing.
>
> Wrt the list wrapper, I think that we've added that to the HTTP API to ensure that we get a valid instance of the data model. So this would be an artifact of the HTTP interface and not part of the result. And I think that we shouldn't simply ignore it on bulkload, as somebody might actually want to load lists (even though I don't know what to load them into right now ...).
> I think that the 2 options on this are
> a) add an option to the HTTP interface to not wrap the output sequence in a list and
> b) add an option to the bulkload to ignore an enclosing list.
> Right now I prefer b).
>
> Thoughts?
>
> Cheers,
> Till
>
>
>
>> On Aug 19, 2015, at 18:14, Ian Maxon <imaxon@uci.edu> wrote:
>>
>> Interesting. I'm not sure I see the value in the suffix, as the default is
>> double.
>>
>> While we are at it in fixing this, we should remove the outer list wrapper
>> (or fix the ADM bulkload to accept that as input). Otherwise again, one
>> can't roundtrip.
>>
>> - Ian
>>
>>> On Wed, Aug 19, 2015 at 5:53 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>>>
>>> In adm.grammar file, DOUBLE_LITERAL doesn't have information about "d"
>>> suffix unlike the other numeric cases. This means that, for the real DOUBLE
>>> values, the output of ADM shouldn't have "d" in it? The ADoublePrinter
>>> class is printing "d" as suffix when it generates a double output: ps
>>> .print(ADoubleSerializerDeserializer.getDouble(b, s + 1) + "d"); Which way
>>> is the correct way? We need to remove "d" suffix? The adm.grammar file
>>> assumes so.
>>>
>>>
>>> INT_LITERAL          = signOrNothing(), digitSequence()
>>>
>>> INT8_LITERAL         = token(INT_LITERAL), string(i8)
>>>
>>> INT16_LITERAL        = token(INT_LITERAL), string(i16)
>>>
>>> INT32_LITERAL        = token(INT_LITERAL), string(i32)
>>>
>>> INT64_LITERAL        = token(INT_LITERAL), string(i64)
>>>
>>>
>>> @EXPONENT            = caseInsensitiveChar(e), signOrNothing(),
>>> digitSequence()
>>>
>>>
>>> DOUBLE_LITERAL = signOrNothing(), char(.), digitSequence()
>>>
>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.), digitSequence()
>>>
>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.),
>>> digitSequence(), token(@EXPONENT)
>>>
>>> DOUBLE_LITERAL = signOrNothing(), digitSequence(), token(@EXPONENT)
>>>
>>>
>>> FLOAT_LITERAL = token(DOUBLE_LITERAL), caseInsensitiveChar(f)
>>>
>>>
>>>
>>> Best,
>>> Taewoo
>>>
>>>> On Wed, Aug 19, 2015 at 3:15 PM, Mike Carey <dtabass@gmail.com> wrote:
>>>>
>>>> We definitely need this to work.  😃
>>>>> On Aug 19, 2015 9:36 AM, "Ian Maxon" <imaxon@uci.edu> wrote:
>>>>>
>>>>> Yes I am just trying to bulk load from an ADM file that's the result of
>>>>> querying. It's annoying to have to mangle it with sed or similar, as
>>> its
>>>>> about 4GB.
>>>>> On Aug 19, 2015 2:41 AM, "Chris Hillery" <chillery@hillery.land>
>>> wrote:
>>>>>> I noticed that the output wasn't re-parseable as input last week when
>>>>>> coming up with the proposed JSON serialization. In particular, the
>>>>> numeric
>>>>>> suffixes like 0.0d, 15i8, 333333i32, and so on didn't parse as AQL
>>>>>> (although interestingly 32.5f does). I wasn't aware that it used to
>>>> work,
>>>>>> though. It's an odd disconnect between ADM and AQL, at the least. It
>>>>> sounds
>>>>>> like you're seeing the same issue when parsing as actual ADM?
>>>>>>
>>>>>> Ceej
>>>>>> aka Chris Hillery
>>>>>>
>>>>>>> On Wed, Aug 19, 2015 at 1:43 AM, Ian Maxon <imaxon@uci.edu> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>> Has ADM output from Asterix stopped being round-trippable? I'm
>>> trying
>>>>> to
>>>>>>> load a record I got from dumping from another instance, via the
>>> REST
>>>>> API,
>>>>>>> requesting 'application-x-adm' in the accept header. First I had to
>>>>>> remove
>>>>>>> the outer wrapper list that was added a while back, which I suppose
>>>>> isn't
>>>>>>> awful, but now it seems like I'm getting more subtle errors. For
>>>>> example,
>>>>>>> trying to load a record, and the parse fails once it sees a field
>>>> like
>>>>>> this
>>>>>>> in a record :
>>>>>>>
>>>>>>> { ... ,  "Rank": 0.0d, .... }
>>>>>>>
>>>>>>> With:
>>>>>>>
>>>>>>> Parse error at (1, 4421) expecting: <DOUBLE_CONS> <DATE_CONS>
>>>>>>> <DATETIME_CONS> <DURATION_CONS> <DAY_TIME_DURATION_CONS>
>>>>>>> [AdmLexerException]
>>>>>>>
>>>>>>> Any ideas?
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> - Ian


--------------020208050408000405060403--