Subject: Re: New Asterix REST API design
To: dev@asterixdb.incubator.apache.org
References: <21E0A234-8BED-4DB5-BCFA-C84A3A5C4B34@gmail.com>
 <7CCC0E40-4B18-4521-9630-3978D46CF809@apache.org>
 <57116351.5060102@gmail.com>
 <995D89E6-829F-46A9-A64E-9F74D9C2946F@apache.org>
 <73E84B26-785A-4737-B108-76EBF83FE856@gmail.com>
 <571189D1.3070804@gmail.com>
From: Mike Carey
Message-ID: <5711CEE0.3000107@gmail.com>
Date: Fri, 15 Apr 2016 22:34:24 -0700
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
MIME-Version: 1.0
In-Reply-To:
Content-Type: multipart/alternative; boundary="------------040003090706050202020403"

--------------040003090706050202020403
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

True. I was mainly thinking of the web console; usually one will want one
format for an app, I totally agree. For the console it might be fun to be
able to switch around.

I was thinking that either way the total effort put into computing the final
(formatted) result would be the same, and also that the per-node investment
in computing it would be the same - so that the time complexity would be the
same. However, I agree that this would potentially increase the overall
retrieval latency since we'd be doing serial just-in-time formatting as the
pickups occur.

I'm fine either way - I was just thinking maybe things would be easier (from
a boundary-finding standpoint) the binary way. (Not sure!)

Cheers,
Mike

On 4/15/16 8:25 PM, Till Westmann wrote:
> I think that it’s a trade-off. Either we do the work when the job is
> evaluated or when the job is picked up. If we did it on pick-up, we could
> pick it up more than once in different formats, but I don't think that many
> applications would need that (the web console might, as somebody sitting in
> front of it might want to look at it). The nice thing about the current
> solution is that we can do the serialization easily in parallel and the
> pickup can happen sequentially and we don't have to interleave that with
> more computation.
>
> On 15 Apr 2016, at 17:39, Mike Carey wrote:
>
>> In a more perfect world, the query results would perhaps be persisted
>> in binary ADM form still, and would be just-in-time reformatted when
>> they are picked up for delivery back to the requester. At least that
>> seems like it would be better... No?
>>
>> On 4/15/16 5:22 PM, Ildar Absalyamov wrote:
>>> I agree that the example where CSV is embedded into the returned JSON
>>> looks quirky (and I am not a big fan of it either).
>>> I believe the trade-off here is the following: do we want to keep the
>>> number of API calls needed just to get the data to a minimum, or
>>> logically separate metadata (like plans, execution time metrics, etc.)
>>> from the data at the endpoint level?
>>> I have tried to address the former case, but left an option to make
>>> this logical separation if the user is willing to do that (via the
>>> include-results parameter). There is no real way to do it the other way
>>> around, since the plans, etc. are generated before the query is
>>> scheduled and before any results can be returned.
>>>
>>>> On Apr 15, 2016, at 17:13, Till Westmann wrote:
>>>>
>>>> Yes, this API is not ideal for "just getting the data". However, Ildar’s
>>>> goal was to separate the data from the HTML and to build an API that can
>>>> be the basis for the Web-interface - and I think that the API looks good
>>>> for that :)
>>>>
>>>> I'm wondering if an endpoint to get the data should be an option on this
>>>> one or a different endpoint. The reason is that all of the additional
>>>> request metadata that we can ask for (plan, metrics, warnings, ..) cannot
>>>> easily be returned with such an API. An API that plays well with curl
>>>> might even put the format into the URI, e.g.:
>>>>
>>>> curl 'http://host:19100/query/csv?statement=select+element+1+as+one;' > one.csv
>>>>
>>>> Thoughts? Trade-offs?
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On 15 Apr 2016, at 16:48, Cameron Samak wrote:
>>>>
>>>>> That hop is exactly what I think should be (optionally) avoidable
>>>>> though, because
>>>>>
>>>>> 1. The user still needs to parse JSON (to get the URL) as well as
>>>>>    the other format (i.e. CSV).
>>>>>
>>>>>    Consider curl {myquery} > myoutput.csv. That's harder with the
>>>>>    proposed API.
>>>>>
>>>>> 2. It's an unnecessary round trip back to the server (which, depending
>>>>>    on the environment, can be significant esp. with quick queries).
>>>>>
>>>>> Understood for the result distribution + serialization.
>>>>>
>>>>> Cameron
>>>>>
>>>>> On Fri, Apr 15, 2016 at 4:24 PM, Till Westmann wrote:
>>>>>
>>>>>> I had a misunderstanding that I think I clarified now. I believed that
>>>>>> we don’t have the separation into tuples anymore after result
>>>>>> distribution and that we only have bytes that we pass to the client. In
>>>>>> that case limiting in the HTTP server would have had to choose between
>>>>>> a) limiting based on the number of bytes or
>>>>>> b) re-establishing tuple boundaries.
>>>>>> However, even though result distribution has serialized the tuples to
>>>>>> whatever format (ADM, JSON, CSV), we still send frames and so we should
>>>>>> be able to separate the tuples (and limit the number that we return).
>>>>>>
>>>>>> So I think that it should be feasible to add that (feature creep is
>>>>>> coming ... :) )
>>>>>>
>>>>>> Cheers,
>>>>>> Till
>>>>>>
>>>>>>
>>>>>> On 15 Apr 2016, at 14:55, Mike Carey wrote:
>>>>>>
>>>>>>> I read this much more simply: Can we enhance the API, in the case where
>>>>>>> you start with a handle and know that the results are ready now, to fetch
>>>>>>> the results in blocks instead of as one giant result?
>>>>>>> So still computing the giant result - just not pushing it all back at
>>>>>>> once - seems like it might help?
>>>>>>>
>>>>>>>
>>>>>>> On 4/15/16 2:48 PM, Till Westmann wrote:
>>>>>>>
>>>>>>>> Hi Wail,
>>>>>>>>
>>>>>>>> I’m not completely sure that I understand how to implement the idea.
>>>>>>>> If we do this only in the API, it might be tricky to get the
>>>>>>>> boundaries between records right (e.g. if we do indentation on the
>>>>>>>> server). However, if we want to push this into the query engine, we
>>>>>>>> need to understand enough of the query/statements to put the limit
>>>>>>>> clause in.
>>>>>>>> Neither approach looks great to me.
>>>>>>>>
>>>>>>>> What did you have in mind?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Till
>>>>>>>>
>>>>>>>> On 15 Apr 2016, at 13:19, Wail Alkowaileet wrote:
>>>>>>>>
>>>>>>>>> Hi Ildar,
>>>>>>>>> I think one thing I would love to have is getting partial results
>>>>>>>>> instead of all results at once. This can be beneficial for result
>>>>>>>>> pagination. When I use the AsterixDB UI, 50% of the time my tab
>>>>>>>>> crashes (I forget to limit the result).
>>>>>>>>>
>>>>>>>>> Thanks...
>>>>>>>>>
>>>>>>>>> On Fri, Apr 15, 2016 at 1:23 AM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Devs,
>>>>>>>>>> Recently there have been a number of conversations about the future
>>>>>>>>>> of our REST (aka HTTP) API. I summarized these discussions in an
>>>>>>>>>> outline of the new API design:
>>>>>>>>>> https://cwiki.apache.org/confluence/display/ASTERIXDB/New+HTTP+API+Design
>>>>>>>>>>
>>>>>>>>>> The need to refactor the existing API came from different directions
>>>>>>>>>> (and from different people), and is explained in the motivation
>>>>>>>>>> section. Thus I believe it’s about time to make an effort and improve
>>>>>>>>>> the existing API, so that it will not drag us down in the future.
>>>>>>>>>> However, during the transition step I believe it would be better to
>>>>>>>>>> keep the existing API endpoints, so that we would not break people’s
>>>>>>>>>> current experimental setup.
>>>>>>>>>>
>>>>>>>>>> It would be good to get feedback from the folks who have been
>>>>>>>>>> contributing to that part of the system recently.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Ildar
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> *Regards,*
>>>>>>>>> Wail Alkowaileet
>>>>>>>>>
>>> Best regards,
>>> Ildar
>>>
>>>

--------------040003090706050202020403--
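
A minimal sketch of the block-wise pickup idea raised in the thread: fetch a finished result in blocks rather than as one giant result, relying on frames to keep tuple boundaries intact. It assumes each frame is simply a list of already-serialized tuples. The names `iter_tuples` and `fetch_block` and the frame layout are hypothetical illustrations, not AsterixDB or Hyracks APIs.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_tuples(frames: Iterable[List[bytes]]) -> Iterator[bytes]:
    """Yield serialized tuples one by one, in frame order.

    Assumption: a frame is a list of complete serialized tuples, so tuple
    boundaries never have to be re-established from a raw byte stream.
    """
    for frame in frames:
        yield from frame


def fetch_block(frames: Iterable[List[bytes]], offset: int, limit: int) -> List[bytes]:
    """Return at most `limit` tuples, skipping the first `offset` tuples."""
    return list(islice(iter_tuples(frames), offset, offset + limit))


# Hypothetical result: five JSON-serialized tuples spread over three frames.
frames = [
    [b'{"id": 1}', b'{"id": 2}'],
    [b'{"id": 3}'],
    [b'{"id": 4}', b'{"id": 5}'],
]

first_block = fetch_block(frames, offset=0, limit=2)
second_block = fetch_block(frames, offset=2, limit=2)
last_block = fetch_block(frames, offset=4, limit=2)
```

Because the limit is applied per tuple rather than per byte, a block can span frame boundaries without ever cutting a record in half, which is the property the byte-based alternative (option a in the thread) would not give.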