Return-Path: X-Original-To: apmail-asterixdb-dev-archive@minotaur.apache.org Delivered-To: apmail-asterixdb-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id C9C8D18371 for ; Tue, 10 Nov 2015 22:50:13 +0000 (UTC) Received: (qmail 79973 invoked by uid 500); 10 Nov 2015 22:50:13 -0000 Delivered-To: apmail-asterixdb-dev-archive@asterixdb.apache.org Received: (qmail 79927 invoked by uid 500); 10 Nov 2015 22:50:13 -0000 Mailing-List: contact dev-help@asterixdb.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@asterixdb.incubator.apache.org Delivered-To: mailing list dev@asterixdb.incubator.apache.org Received: (qmail 79799 invoked by uid 99); 10 Nov 2015 22:50:13 -0000 Received: from Unknown (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 10 Nov 2015 22:50:13 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id C0E1BCCA42; Tue, 10 Nov 2015 22:50:12 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.899 X-Spam-Level: ** X-Spam-Status: No, score=2.899 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=3, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-us-west.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id R_bsjUK52xFS; Tue, 10 Nov 2015 22:50:06 +0000 (UTC) Received: from mail-yk0-f175.google.com (mail-yk0-f175.google.com [209.85.160.175]) by mx1-us-west.apache.org (ASF Mail Server at mx1-us-west.apache.org) with ESMTPS id D879F20EA4; Tue, 10 Nov 2015 22:50:05 +0000 (UTC) Received: by ykfs79 with SMTP id s79so21232205ykf.1; Tue, 10 Nov 2015 14:50:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Q0VxKxupBTRFRZ7HWo2j/ay+Vjlh5iJzhnfcX5xzeyk=; b=QAENBR2r+aHqgUmmWtJ71xMK557XQE5s7alp93uCJjFSSaVrYswPqVdeBQReN6Lib8 DvK96OYMt/Ov0iaVStotlgOAho/pAiOxnR7k4ot24TgUMBpCurhb2tTthNUerKXVEyjA RknLE3ae8YdvAgpyYUadx4TfZY60cfWyndsy7bZJdlgGG6Gp5uAgYUUCfPVMqNgCUpkq XPmzJuKDkSTYxaSVKW9kILNidX37HekW3JlgP2sElkfdBkKKBciWFtt8q/gZqhUHiDU8 rCMJxFQPtFhzJsU9SYtSD1blJgAtch1hhtqfdAiPg2hB2D+njt9f4LaJqcvf2b1DPZOf SeiQ== MIME-Version: 1.0 X-Received: by 10.13.233.130 with SMTP id s124mr6313226ywe.18.1447195805107; Tue, 10 Nov 2015 14:50:05 -0800 (PST) Received: by 10.37.25.3 with HTTP; Tue, 10 Nov 2015 14:50:05 -0800 (PST) In-Reply-To: References: Date: Tue, 10 Nov 2015 14:50:05 -0800 Message-ID: Subject: Re: [jira] [Created] (ASTERIXDB-1168) Should not sort&group after an OrderedList left-join with a dataset From: Yingyi Bu To: dev@asterixdb.incubator.apache.org Cc: notifications@asterixdb.incubator.apache.org Content-Type: multipart/alternative; boundary=94eb2c072010e24e8505243787cf --94eb2c072010e24e8505243787cf Content-Type: text/plain; charset=UTF-8 Jianfeng, The results of the query is correct. The cardinality of returned results should be the same as the number of input binding tuples for $p. Best, Yingyi On Tue, Nov 10, 2015 at 2:34 PM, Jianfeng Jia (JIRA) wrote: > Jianfeng Jia created ASTERIXDB-1168: > --------------------------------------- > > Summary: Should not sort&group after an OrderedList left-join > with a dataset > Key: ASTERIXDB-1168 > URL: https://issues.apache.org/jira/browse/ASTERIXDB-1168 > Project: Apache AsterixDB > Issue Type: Bug > Components: Optimizer > Reporter: Jianfeng Jia > > > Hi, > Here is the context for this issue, I wanted to lookup some records in > the DB through REST API, and I wanted to lookup in a batch way. Then I > packaged the "keys" into an OrderdList and expected a left-out join would > give me all matching records that consistent with query order. However, the > result was re-sorted and grouped, which confused the client side response > handler. > > Here is the synthetic query that emulates the similar use case: > --------------------------------------------------------------------------- > drop dataverse test if exists; > create dataverse test; > > use dataverse test; > > create type TType as closed { > id: int64, > content: string > } > > create dataset TData (TType) primary key id; > > insert into dataset TData ( [ {"id":1, "content":"a"}, {"id":2, "content": > "b"}, {"id":3, "content":"c"}]) > > // now let's query on > let $ps := ["b","a", "b","c","c"] > > for $p in $ps > return { "p":$p, > "match": for $x in dataset TData where $x.content = $p return $x.id > } > --------------------------------------------------------------------------- > > What I expected is following: > --------------------------------------------------------------------------- > [ { "p": "b", "match": [ 2 ] } > , { "p": "a", "match": [ 1 ] } > , { "p": "b", "match": [ 2 ] } > , { "p": "c", "match": [ 3 ] } > , { "p": "c", "match": [ 3 ] } > ] > --------------------------------------------------------------------------- > > The returned result is following, which is aggregated and re-sorted. > --------------------------------------------------------------------------- > [ { "p": "a", "match": [ 1 ] } > , { "p": "b", "match": [ 2, 2 ] } > , { "p": "c", "match": [ 3, 3 ] } > ] > --------------------------------------------------------------------------- > > The optimized logical plan is following: > --------------------------------------------------------------------------- > distribute result [%0->$$4] > -- DISTRIBUTE_RESULT |PARTITIONED| > exchange > -- ONE_TO_ONE_EXCHANGE |PARTITIONED| > project ([$$4]) > -- STREAM_PROJECT |PARTITIONED| > assign [$$4] <- [function-call: asterix:closed-record-constructor, > Args:[AString: {p}, %0->$$1, AString: {match}, %0->$$9]] > -- ASSIGN |PARTITIONED| > project ([$$1, $$9]) > -- STREAM_PROJECT |PARTITIONED| > exchange > -- ONE_TO_ONE_EXCHANGE |PARTITIONED| > group by ([$$0 := %0->$$12; $$1 := %0->$$13]) decor ([]) { > aggregate [$$9] <- [function-call: asterix:listify, > Args:[%0->$$10]] > -- AGGREGATE |LOCAL| > select (function-call: algebricks:not, > Args:[function-call: algebricks:is-null, Args:[%0->$$11]]) > -- STREAM_SELECT |LOCAL| > nested tuple source > -- NESTED_TUPLE_SOURCE |LOCAL| > } > -- PRE_CLUSTERED_GROUP_BY[$$12, $$13] |PARTITIONED| > exchange > -- ONE_TO_ONE_EXCHANGE |PARTITIONED| > order (ASC, %0->$$12) (ASC, %0->$$13) > -- STABLE_SORT [$$12(ASC), $$13(ASC)] |PARTITIONED| > exchange > -- ONE_TO_ONE_EXCHANGE |PARTITIONED| > project ([$$10, $$11, $$12, $$13]) > -- STREAM_PROJECT |PARTITIONED| > exchange > -- ONE_TO_ONE_EXCHANGE |PARTITIONED| > left outer join (function-call: algebricks:eq, > Args:[%0->$$14, %0->$$13]) > -- HYBRID_HASH_JOIN [$$13][$$14] |PARTITIONED| > exchange > -- HASH_PARTITION_EXCHANGE [$$13] |PARTITIONED| > unnest $$13 <- function-call: > asterix:scan-collection, Args:[%0->$$12] > -- UNNEST |UNPARTITIONED| > assign [$$12] <- [AOrderedList: [ AString: > {b}, AString: {a}, AString: {b}, AString: {c}, AString: {c} ]] > -- ASSIGN |UNPARTITIONED| > empty-tuple-source > -- EMPTY_TUPLE_SOURCE |UNPARTITIONED| > exchange > -- HASH_PARTITION_EXCHANGE [$$14] |PARTITIONED| > project ([$$10, $$11, $$14]) > -- STREAM_PROJECT |PARTITIONED| > assign [$$11, $$14] <- [TRUE, function-call: > asterix:field-access-by-index, Args:[%0->$$2, AInt32: {1}]] > -- ASSIGN |PARTITIONED| > exchange > -- ONE_TO_ONE_EXCHANGE |PARTITIONED| > data-scan []<-[$$10, $$2] <- test:TData > -- DATASOURCE_SCAN |PARTITIONED| > exchange > -- ONE_TO_ONE_EXCHANGE |PARTITIONED| > empty-tuple-source > -- EMPTY_TUPLE_SOURCE > > --------------------------------------------------------------------------------- > > Why there is an STABLE_SORT + PRE_CLUSTERED_GROUP_BY after the left out > join? > We can close this issue if this is an intended design. > > > > > -- > This message was sent by Atlassian JIRA > (v6.3.4#6332) > --94eb2c072010e24e8505243787cf--