asterixdb-notifications mailing list archives

From "Till Westmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ASTERIXDB-1418) Doesn't support a Nested Aggregation Query
Date Thu, 28 Apr 2016 00:20:12 GMT

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261277#comment-15261277 ]

Till Westmann commented on ASTERIXDB-1418:
------------------------------------------

Sorry, I didn't mean to suggest that something might have changed in the last 3 days.
I was just wondering whether this was on master or possibly on a development branch that has diverged further.


> Doesn't support a Nested Aggregation Query
> ------------------------------------------
>
>                 Key: ASTERIXDB-1418
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1418
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: AsterixDB, Optimizer
>            Reporter: Jianfeng Jia
>            Assignee: Yingyi Bu
>
> When I ran the following query
> {code}
> use dataverse twitter
> for $t in dataset ds_tweet_trump
> group by
>   $county := $t.geo_tag.countyID,
>   $timebin := interval-bin($t.create_at, date("2012-01-01"), day-time-duration("P1D")) with $t
> return {
>   "county": $county,
>   "time": $timebin,
>   "count": count($t),
>   "users": count( for $tt in $t distinct by $tt.user.id return $tt.user.id)
>   }
> {code}
> One exception appears:
> {code}
> Attempting to construct a nested plan with 3 operator descriptors. Currently, nested plans can only consist in linear pipelines of Asterix micro operators. [AlgebricksException]
> {code}
> The DDL:
> {code}
> create dataverse twitter if not exists;
> use dataverse twitter
> create type typeUser if not exists as open {
>     id: int64,
>     name: string,
>     screen_name : string,
>     lang : string,
>     location: string,
>     create_at: date,
>     description: string,
>     followers_count: int32,
>     friends_count: int32,
>     statues_count: int64
> }
> create type typePlace if not exists as open{
>     country : string,
>     country_code : string,
>     full_name : string,
>     id : string,
>     name : string,
>     place_type : string,
>     bounding_box : rectangle
> }
> create type typeGeoTag if not exists as open {
>     stateID: int32,
>     stateName: string,
>     countyID: int32,
>     countyName: string,
>     cityID: int32?,
>     cityName: string?
> }
> create type typeTweet if not exists as open{
>     create_at : datetime,
>     id: int64,
>     "text": string,
>     in_reply_to_status : int64,
>     in_reply_to_user : int64,
>     favorite_count : int64,
>     coordinate: point?,
>     retweet_count : int64,
>     lang : string,
>     is_retweet: boolean,
>     hashtags : {{ string }} ?,
>     user_mentions : {{ int64 }} ? ,
>     user : typeUser,
>     place : typePlace?,
>     geo_tag: typeGeoTag
> }
> create dataset ds_tweet(typeTweet) if not exists primary key id;
> //with filter on create_at;
> {code}
> The logical plan is generated successfully:
> {code}
> distribute result [%0->$$13]
> -- DISTRIBUTE_RESULT  |PARTITIONED|
>   exchange 
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>     project ([$$13])
>     -- STREAM_PROJECT  |PARTITIONED|
>       assign [$$13] <- [function-call: asterix:closed-record-constructor, Args:[AString: {county}, %0->$$1, AString: {time}, %0->$$2, AString: {count}, %0->$$25, AString: {users}, %0->$$26]]
>       -- ASSIGN  |PARTITIONED|
>         exchange 
>         -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>           group by ([$$1 := %0->$$32; $$2 := %0->$$33]) decor ([]) {
>                     aggregate [$$25] <- [function-call: asterix:agg-sum, Args:[%0->$$30]]
>                     -- AGGREGATE  |LOCAL|
>                       nested tuple source
>                       -- NESTED_TUPLE_SOURCE  |LOCAL|
>                  }
>                  {
>                     aggregate [$$26] <- [function-call: asterix:agg-sum, Args:[%0->$$31]]
>                     -- AGGREGATE  |LOCAL|
>                       nested tuple source
>                       -- NESTED_TUPLE_SOURCE  |LOCAL|
>                  }
>           -- PRE_CLUSTERED_GROUP_BY[$$32, $$33]  |PARTITIONED|
>             exchange 
>             -- HASH_PARTITION_MERGE_EXCHANGE MERGE:[$$32(ASC), $$33(ASC)] HASH:[$$32, $$33]  |PARTITIONED|
>               group by ([$$32 := %0->$$21; $$33 := %0->$$22]) decor ([]) {
>                         aggregate [$$30] <- [function-call: asterix:agg-count, Args:[%0->$$3]]
>                         -- AGGREGATE  |LOCAL|
>                           nested tuple source
>                           -- NESTED_TUPLE_SOURCE  |LOCAL|
>                      }
>                      {
>                         aggregate [$$31] <- [function-call: asterix:agg-count, Args:[%0->$$23]]
>                         -- AGGREGATE  |LOCAL|
>                           exchange 
>                           -- ONE_TO_ONE_EXCHANGE  |LOCAL|
>                             distinct ([%0->$$23])
>                             -- PRE_SORTED_DISTINCT_BY  |LOCAL|
>                               exchange 
>                               -- ONE_TO_ONE_EXCHANGE  |LOCAL|
>                                 order (ASC, %0->$$23) 
>                                 -- IN_MEMORY_STABLE_SORT [$$23(ASC)]  |LOCAL|
>                                   nested tuple source
>                                   -- NESTED_TUPLE_SOURCE  |LOCAL|
>                      }
>               -- PRE_CLUSTERED_GROUP_BY[$$21, $$22]  |PARTITIONED|
>                 exchange 
>                 -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                   order (ASC, %0->$$21) (ASC, %0->$$22) 
>                   -- STABLE_SORT [$$21(ASC), $$22(ASC)]  |PARTITIONED|
>                     exchange 
>                     -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                       assign [$$22, $$21, $$23] <- [function-call: asterix:interval-bin, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$3, AInt32: {0}], ADate: { 2012-01-01 }, org.apache.asterix.om.base.ADayTimeDuration@5265c00], function-call: asterix:field-access-by-index, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$3, AInt32: {14}], AInt32: {2}], function-call: asterix:field-access-by-index, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$3, AInt32: {12}], AInt32: {0}]]
>                       -- ASSIGN  |PARTITIONED|
>                         project ([$$3])
>                         -- STREAM_PROJECT  |PARTITIONED|
>                           exchange 
>                           -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                             data-scan []<-[$$24, $$3] <- twitter:ds_tweet
>                             -- DATASOURCE_SCAN  |PARTITIONED|
>                               exchange 
>                               -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                                 empty-tuple-source
> {code}
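
For reference, the result the quoted query is meant to produce can be sketched outside AsterixDB. The following is a minimal Python sketch, not AsterixDB code and not part of the original report. The function name nested_aggregate, the dict-shaped records, and the use of the calendar day as a stand-in for the interval returned by interval-bin are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def nested_aggregate(tweets):
    """Group tweets by (countyID, day); count tweets and distinct user ids.

    Hypothetical helper: `tweets` is an iterable of dicts shaped like the
    typeTweet records in the DDL (only the fields used here are required).
    """
    counts = defaultdict(int)   # (county, day) -> number of tweets
    users = defaultdict(set)    # (county, day) -> distinct user ids
    for t in tweets:
        # Assumption: one-day bins anchored at 2012-01-01 line up with
        # calendar days, so the date stands in for the interval-bin result.
        key = (t["geo_tag"]["countyID"], t["create_at"].date())
        counts[key] += 1
        users[key].add(t["user"]["id"])
    return [
        {"county": c, "time": d, "count": counts[(c, d)], "users": len(users[(c, d)])}
        for (c, d) in sorted(counts)
    ]


if __name__ == "__main__":
    sample = [
        {"geo_tag": {"countyID": 1}, "create_at": datetime(2016, 4, 1, 10), "user": {"id": 7}},
        {"geo_tag": {"countyID": 1}, "create_at": datetime(2016, 4, 1, 23), "user": {"id": 7}},
    ]
    for row in nested_aggregate(sample):
        print(row)
```

The sketch keeps one set of user ids per group, which is the single-node analogue of the nested "distinct by" subquery inside count(...). It says nothing about how the optimizer should plan that subquery.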



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
