Return-Path: X-Original-To: apmail-asterixdb-users-archive@minotaur.apache.org Delivered-To: apmail-asterixdb-users-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E00E719744 for ; Thu, 7 Apr 2016 16:40:31 +0000 (UTC) Received: (qmail 50589 invoked by uid 500); 7 Apr 2016 16:40:31 -0000 Delivered-To: apmail-asterixdb-users-archive@asterixdb.apache.org Received: (qmail 50552 invoked by uid 500); 7 Apr 2016 16:40:31 -0000 Mailing-List: contact users-help@asterixdb.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: users@asterixdb.incubator.apache.org Delivered-To: mailing list users@asterixdb.incubator.apache.org Received: (qmail 50542 invoked by uid 99); 7 Apr 2016 16:40:31 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 07 Apr 2016 16:40:31 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 0AD641A0777 for ; Thu, 7 Apr 2016 16:40:31 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.179 X-Spam-Level: * X-Spam-Status: No, score=1.179 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id RUW5fVOwN67B for ; Thu, 7 Apr 2016 16:40:29 +0000 (UTC) Received: from mail-ob0-f177.google.com (mail-ob0-f177.google.com [209.85.214.177]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with ESMTPS id B9EF75FB15 for ; Thu, 7 Apr 2016 16:40:28 +0000 (UTC) Received: by mail-ob0-f177.google.com with SMTP id j9so56377615obd.3 for ; Thu, 07 Apr 2016 09:40:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=DdMMmqdRx3SNBhC1yuqA6uHGRsaaUUbQiuNUbJR9AOQ=; b=jNLIIyxX5AnMlRHrUNlAG3jzmWiAqjHLYKzA3c9LGfTPaIYG5NREAZer4TMaC82tUU cK3pDldm9B0oTvUBjxhKgzbAfOrp/vbMxZJngddjbuU7AhUo0hgAZUBSczQHDVi8kHvR 7QNCCrnkobB8Km6vQoYYnpSkygzcNoZCkt3Dp5pJ8KUqXEp6AKv1MG5Mm0iofVemizc5 t1ZnlIRkD0RNezdtO8iJnsfzrFWqT29cUO8KkTO8dA0uvsLbrUoq7vbGiKtDZ8+8Bk8w WnA9A1mPnAc75BsaM6RKqJiXtRQVS/n4Cq68PXyMy5wkjY71IMpde7nMWz3olTifA5ab XCOQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=DdMMmqdRx3SNBhC1yuqA6uHGRsaaUUbQiuNUbJR9AOQ=; b=CIxF9YWVhhdXDAhZAXFqSecdqcVEnWNG3Q7dFuLwEyeecpd/JNN/+Cxs5cIGQQJOSh x7WIFsnYkH/0FXDvdRb7yMgLVHiKm5f52wcMZM+GVTww07hZ4ID3Rx0yCmDldqFh9i1S dfis6gkUyo5XI937NNV9DIa9PyGDM0+sB0xlBAxwF5VCXRi7Nzs1GZRsk/RD2Ke8PDOm 6sM1dNv5k9xk4L3B7GC7bllh0Y/nXX5hgbmfFvAd/ugYFoiymLv0y+ZJdUKHMc9uXeBv Km+GJUQzM1E9KOSMeY1STNvM8VBpngVepYEwN2Cirjt3ncgxvC0mVonMlPXCdDgKnny9 3dRw== X-Gm-Message-State: AD7BkJJvAjnP29q8UC/4nq6HBq+YRlxzDyOV6wJ4/r1H+CBD1+tvtjwP0Eq9MHrwD6+/ZI5aM1HeqWHQOiUeZA== X-Received: by 10.60.84.98 with SMTP id x2mr2075330oey.23.1460047222649; Thu, 07 Apr 2016 09:40:22 -0700 (PDT) MIME-Version: 1.0 Received: by 10.157.18.168 with HTTP; Thu, 7 Apr 2016 09:40:03 -0700 (PDT) In-Reply-To: References: <57056A4B.9040502@gmail.com> From: Taewoo Kim Date: Thu, 7 Apr 2016 09:40:03 -0700 Message-ID: Subject: Re: Limit clause To: users@asterixdb.incubator.apache.org Content-Type: multipart/alternative; boundary=089e0111b9c20fb853052fe7bcb3 --089e0111b9c20fb853052fe7bcb3 Content-Type: text/plain; charset=UTF-8 What we usually do is wrapping a query using count( ) like the following. use dataverse XXX; count( from $obj in dataset YYY where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1412726400 limit 2 return $obj); Best, Taewoo On Wed, Apr 6, 2016 at 11:54 PM, Magnus Kongshem wrote: > Thanks for the detailed reply, I recommend documenting this! > > What I'm doing is testing different queries to find their execution time, > and if the results from my query contains 12 million objects, the browser > is having trouble displaying the entire list(it is chopped of some where > along the way). So I tested with the limit clause and got the same > execution time (give or take a few seconds). So, is there another way of > performing a query to get the execution time without having to print the > entire result? Or is limit the best workaround for this use case? Recall my > example query below where I use "return $obj;". > > -- > > Mvh > > Magnus Alderslyst Kongshem > +47 415 65 906 > > On Wed, Apr 6, 2016 at 9:58 PM, Mike Carey wrote: > >> (And clearly both optimizations are *really* important for performance - >> both in real use cases and also in benchmarks.) >> >> >> On 4/6/16 8:37 AM, Taewoo Kim wrote: >> >> Hello Magnus, >> >> Right now, the search does not stop until it finds all tuples that >> satisfies the given predicate. However, if you have an ORDER BY clause >> before LIMIT, it can stop sorting process in ORDER BY early. That is, it >> picks top K (in your example, K is 2) and sorting process stops early. >> >> Actually, if the plan qualifies as an index-only plan (the given >> predicate can be covered by a secondary index search and you only return >> secondary key field and/or primary key field), the LIMIT can be applied to >> an index-search so that an index-search can be stopped earlier after >> finding K tuples. This feature is already implemented but in the >> code-review now. It would be applied soon. >> >> Best, >> Taewoo >> >> On Wed, Apr 6, 2016 at 7:22 AM, Magnus Kongshem < >> kongshem@online.ntnu.no> wrote: >> >>> Hey, >>> >>> Performing a query with the limit clause, does this cause the query to >>> stop searching when it reaches the value of the limit parameter(like in >>> SQL), or does it keep searching for every matching object and then return >>> the number of objects specified by the limit? Example below: >>> >>> use dataverse XXX; >>> from $obj in dataset YYY >>> where $obj.timestamp >= 1412121600 >>> and $obj.timestamp <= 1412726400 >>> limit 2 >>> return $obj; >>> >>> Followup questions may occur. >>> >>> -- >>> >>> Mvh >>> >>> Magnus Alderslyst Kongshem >>> +47 415 65 906 <%2B47%20415%2065%20906> >>> >> --089e0111b9c20fb853052fe7bcb3 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
What we usually do is wrapping a query using count( ) like= the following. =C2=A0

use= dataverse XXX;
count(
from $obj in dataset YYY
=C2=A0 =C2=A0 where $obj.timestamp >=3D 1412121600=C2=A0
=C2=A0 =C2=A0 and $obj.timestamp <=3D 141272640= 0
=C2=A0 =C2=A0 limit 2=C2=A0
return $obj);

Best,
=
Taewoo

On Wed, Apr 6, 2016 at 11:54 PM, Magnus Kong= shem <kongshem@stud.ntnu.no> wrote:
Thanks for the detailed reply, I recommend = documenting this!

What I'm doing is testing different queries to= find their execution time, and if the results from my query contains 12 mi= llion objects, the browser is having trouble displaying the entire list(it = is chopped of some where along the way). So I tested with the limit clause = and got the same execution time (give or take a few seconds). So, is there = another way of performing a query to get the execution time without having = to print the entire result? Or is limit the best workaround for this use ca= se? Recall my example query below where I use "return $obj;".

--=C2=A0

Mvh

Magnus Alderslyst Kongshem
+47 415 65 906


On Wed= , Apr 6, 2016 at 9:58 PM, Mike Carey <dtabass@gmail.com> wro= te:
=20 =20 =20
(And clearly both optimizations are *really* importan= t for performance - both in real use cases and also in benchmarks.)


On 4/6/16 8:37 AM, Taewoo Kim wrote:
Hello Magnus,

Right now, the search does not stop until it finds all tuples that satisfies the given predicate. However, if you have an ORDER BY clause before LIMIT, it can stop sorting process in ORDER BY early. That is, it picks top K (in your example, K is 2) and sorting process stops early.=C2=A0

Actually, if the plan qualifies as an index-only plan (the given predicate can be covered by a secondary index search and you only return secondary key field and/or primary key field), the LIMIT can be applied to an index-search so that an index-search can be stopped earlier after finding K tuples. This feature is already implemented but in the code-review now. It would be applied soon. =C2=A0=C2=A0

Best,
Taewoo

On Wed, Apr 6, 2016 at 7:22 AM, Magnus Kongshem <kongshem@online.ntnu.no> wrote:
Hey,

Performing a query with the limit clause, does this cause the query to stop searching when it reaches the value of the limit parameter(like in SQL), or does it keep searching for every matching object and then return the number of objects specified by the limit? Example below:

use dataverse XXX;
from $obj in dataset YYY
=C2=A0 =C2=A0 where $obj.timestamp >=3D 141212160= 0=C2=A0
=C2=A0 =C2=A0 and $obj.timestamp <=3D 1412726400<= /div>
=C2=A0 =C2=A0 limit 2=C2=A0
return $obj;

Followup questions may occur.

--

Mvh

Magnus Alderslyst Kongshem
+47 415 65 906


--089e0111b9c20fb853052fe7bcb3--