asterixdb-dev mailing list archives

From "Till Westmann" <ti...@apache.org>
Subject Re: New Asterix REST API design
Date Sat, 16 Apr 2016 03:25:20 GMT
I think that it’s a trade-off. Either we do the work when the job is
evaluated or when the result is picked up. If we did it on pick-up, we
could pick it up more than once in different formats, but I don't think
that many applications would need that (the web console might, as
somebody sitting in front of it might want to look at the result in
different formats). The nice thing about the current solution is that
we can do the serialization easily in parallel, the pickup can happen
sequentially, and we don't have to interleave that with more
computation.
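To illustrate that trade-off, here is a minimal sketch (not AsterixDB code; all names are made up): with the current solution, each result partition is serialized at evaluation time, in parallel, and pickup only concatenates ready bytes.

```python
from concurrent.futures import ThreadPoolExecutor
import json

def serialize_partition(records, fmt):
    """Serialize one partition's records into the requested format."""
    if fmt == "json":
        return "\n".join(json.dumps(r) for r in records).encode()
    if fmt == "csv":
        return "\n".join(",".join(str(v) for v in r.values())
                         for r in records).encode()
    raise ValueError("unsupported format: " + fmt)

def evaluate_job(partitions, fmt):
    # Serialization happens once, at evaluation time, in parallel.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda p: serialize_partition(p, fmt),
                             partitions))

def pick_up(serialized_partitions):
    # Pickup is then a cheap, sequential concatenation of ready bytes.
    return b"\n".join(serialized_partitions)
```

Serializing on pick-up instead would keep the binary result around and run the serializer once per pickup, which allows multiple formats at the cost of doing the work on the sequential pickup path.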

On 15 Apr 2016, at 17:39, Mike Carey wrote:

> In a more perfect world, the query results would perhaps be persisted 
> in binary ADM form still, and would be just-in-time reformatted when 
> they are picked up for delivery back to the requester.  At least that 
> seems like it would be better... No?
>
> On 4/15/16 5:22 PM, Ildar Absalyamov wrote:
>> I agree that the example where CSV is embedded into the returned JSON
>> looks quirky (and I am not a big fan of it either).
>> I believe the trade-off here is the following: do we want to keep the
>> number of API calls needed just to get the data to a minimum, or
>> logically separate metadata (like plans, execution time metrics, etc.)
>> from the data at the endpoint level?
>> I have tried to address the former case, but left an option to make
>> this logical separation if the user is willing to do that (via the
>> include-results parameter). There is no real way to do it the other
>> way around, since the plans etc. are generated before the query is
>> scheduled and before any results can be returned.
>>
>>> On Apr 15, 2016, at 17:13, Till Westmann <tillw@apache.org> wrote:
>>>
>>> Yes, this API is not ideal for "just getting the data". However, 
>>> Ildar’s
>>> goal was to separate the data from the HTML and to build an API that 
>>> can be
>>> the basis for the Web-interface - and I think that the API looks 
>>> good for
>>> that :)
>>>
>>> I'm wondering if an endpoint to get the data should be an option on 
>>> this one
>>> or a different endpoint. The reason is that all of the additional
>>> request metadata that we can ask for (plan, metrics, warnings, ...)
>>> cannot easily be returned with such an API. An API that plays well
>>> with curl might even put the format into the URI, e.g.:
>>>
>>> curl "http://host:19100/query/csv?statement=select+element+1+as+one" > one.csv
>>>
>>> Thoughts? Trade-offs?
>>>
>>> Cheers,
>>> Till
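A minimal sketch of the format-in-the-URI idea from the curl example above (endpoint layout and names are hypothetical, not the actual API): the last path segment selects the serializer.

```python
import json

# Hypothetical format-in-the-URI dispatch: /query/csv, /query/json, ...
# Endpoint layout and names are illustrative only.
SERIALIZERS = {
    "csv": lambda rows: "\n".join(",".join(map(str, r)) for r in rows),
    "json": lambda rows: json.dumps([list(r) for r in rows]),
}

def handle_query(path, rows):
    """Pick the serializer named by the last path segment."""
    fmt = path.rstrip("/").rsplit("/", 1)[-1]
    if fmt not in SERIALIZERS:
        raise ValueError("unknown format: " + fmt)
    return SERIALIZERS[fmt](rows)
```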
>>>
>>> On 15 Apr 2016, at 16:48, Cameron Samak wrote:
>>>
>>>> That hop is exactly what I think should be (optionally) avoidable 
>>>> though
>>>> because
>>>>
>>>>
>>>>    1. The user still needs to parse both JSON (to get the URL) and
>>>>    the other format (i.e. CSV).
>>>>
>>>>    Consider curl {myquery} > myoutput.csv. That's harder with the
>>>>    proposed API.
>>>>
>>>>    2. It's an unnecessary round trip back to the server (which,
>>>>    depending on the environment, can be significant, especially
>>>>    with quick queries).
>>>>
>>>>
>>>> Understood for the result distribution + serialization.
>>>>
>>>>
>>>> Cameron
>>>>
>>>> On Fri, Apr 15, 2016 at 4:24 PM, Till Westmann <tillw@apache.org> 
>>>> wrote:
>>>>
>>>>> I had a misunderstanding that I think I clarified now. I believed 
>>>>> that we
>>>>> don’t have the separation into tuples anymore after result 
>>>>> distribution and
>>>>> that we only have bytes that we pass to the client. In that case 
>>>>> limiting
>>>>> in
>>>>> the HTTP server would have had to choose between
>>>>> a) limiting based on the number of bytes or
>>>>> b) re-establishing tuple boundaries.
>>>>> However, even though result distribution has serialized the tuples 
>>>>> to
>>>>> whatever format (ADM, JSON, CSV), we still send frames and so we 
>>>>> should be
>>>>> able to separate the tuples (and limit the number that we return).
>>>>>
>>>>> So I think that it should be feasible to add that (feature creep 
>>>>> is coming
>>>>> ... :) )
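A minimal sketch of that frame-based limiting (assuming frames arrive as lists of already-serialized tuples; names are made up):

```python
def limited_tuples(frames, limit):
    """Yield at most `limit` serialized tuples from a stream of frames.

    Each frame is a list of byte strings, one per tuple, already
    serialized to whatever format (ADM, JSON, CSV) was requested, so
    limiting needs no re-parsing: tuple boundaries come from the frame.
    """
    remaining = limit
    for frame in frames:
        if remaining <= 0:
            break
        take = frame[:remaining]
        yield from take
        remaining -= len(take)
```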
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>>
>>>>> On 15 Apr 2016, at 14:55, Mike Carey wrote:
>>>>>
>>>>>> I read this much more simply: Can we enhance the API, in the case
>>>>>> where you start with a handle and know that the results are ready
>>>>>> now, to fetch the results in blocks instead of as one giant
>>>>>> result? So still computing the giant result - just not pushing it
>>>>>> all back at once - seems like it might help?
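Block-wise pickup from a handle could look roughly like this (a sketch under assumed names; handle, offset, and count are illustrative, not the actual API):

```python
RESULT_STORE = {}  # handle -> list of serialized tuples (sketch only)

def store_result(handle, tuples):
    # The full result is still computed and stored up front.
    RESULT_STORE[handle] = list(tuples)

def fetch_block(handle, offset, count):
    """Return (tuples, next_offset); next_offset is None when done."""
    tuples = RESULT_STORE[handle]
    block = tuples[offset:offset + count]
    end = offset + len(block)
    return block, (end if end < len(tuples) else None)
```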
>>>>>>
>>>>>>
>>>>>> On 4/15/16 2:48 PM, Till Westmann wrote:
>>>>>>
>>>>>>> Hi Wail,
>>>>>>>
>>>>>>> I’m not completely sure that I understand how to implement the
>>>>>>> idea. If we do this only in the API, it might be tricky to get
>>>>>>> the boundaries between records right (e.g. if we do indentation
>>>>>>> on the server). However, if we want to push this into the query
>>>>>>> engine, we need to understand enough of the query/statements to
>>>>>>> put the limit clause in.
>>>>>>> Neither approach looks great to me.
>>>>>>>
>>>>>>> What did you have in mind?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On 15 Apr 2016, at 13:19, Wail Alkowaileet wrote:
>>>>>>>
>>>>>>>> Hi Ildar,
>>>>>>>>
>>>>>>>> If there's one thing I would love to have, it's getting partial
>>>>>>>> results instead of all results at once. This can be beneficial
>>>>>>>> for result pagination. When I use the AsterixDB UI, 50% of the
>>>>>>>> time my tab crashes (I forget to limit the result).
>>>>>>>>
>>>>>>>> Thanks...
>>>>>>>>
>>>>>>>> On Fri, Apr 15, 2016 at 1:23 AM, Ildar Absalyamov <
>>>>>>>> ildar.absalyamov@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Devs,
>>>>>>>>>
>>>>>>>>> Recently there have been a number of conversations about the
>>>>>>>>> future of our REST (aka HTTP) API. I summarized these
>>>>>>>>> discussions in an outline of the new API design:
>>>>>>>>>
>>>>>>>>> https://cwiki.apache.org/confluence/display/ASTERIXDB/New+HTTP+API+Design
>>>>>>>>>
>>>>>>>>> The need to refactor the existing API came from different
>>>>>>>>> directions (and from different people), and is explained in the
>>>>>>>>> motivation section. Thus I believe it’s about time to make an
>>>>>>>>> effort and improve the existing API, so that it will not drag
>>>>>>>>> us down in the future. However, during the transition I believe
>>>>>>>>> it would be better to keep the existing API endpoints, so that
>>>>>>>>> we would not break people’s current experimental setups.
>>>>>>>>>
>>>>>>>>> It would be good to get feedback from the folks who have been
>>>>>>>>> contributing to that part of the system recently.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Ildar
>>>>>>>>>
>>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> *Regards,*
>>>>>>>> Wail Alkowaileet
>>>>>>>>
>> Best regards,
>> Ildar
>>
>>
