asterixdb-dev mailing list archives

From Yingyi Bu <buyin...@gmail.com>
Subject Re: Does Projection affect count() performance?
Date Sat, 01 Oct 2016 17:14:18 GMT
PS: if you still have the instance that hit the OOM, can you take a YourKit memory profile?
Thanks!

Best,
Yingyi

On Sat, Oct 1, 2016 at 9:43 AM, Yingyi Bu <buyingyi@gmail.com> wrote:

> Wail,
>
>       Can you attach the query plan for query 1?
>       I tried
>        count( for $x in dataset beers
>          return $x
>        )
>
>       and got the following plan, which seems OK:
> -- DISTRIBUTE_RESULT  |UNPARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |UNPARTITIONED|
>     aggregate [$$5] <- [function-call: asterix:agg-sum, Args:[%0->$$8]]
>     -- AGGREGATE  |UNPARTITIONED|
>       exchange
>       -- RANDOM_MERGE_EXCHANGE  |PARTITIONED|
>         aggregate [$$8] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
>         -- AGGREGATE  |PARTITIONED|
>           project ([$$0])
>           -- STREAM_PROJECT  |PARTITIONED|
>             exchange
>             -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>               data-scan []<-[$$6, $$0, $$7] <- Default:beers
>               -- DATASOURCE_SCAN  |PARTITIONED|
>                 exchange
>                 -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                   empty-tuple-source
>                   -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
>
>
> Best,
> Yingyi
>
> On Sat, Oct 1, 2016 at 9:25 AM, Mike Carey <dtabass@gmail.com> wrote:
>
>> Sounds like there is a new materialization bug there. Please file a
>> JIRA issue (and we'll need a query plan test case to keep it from breaking
>> again).
>>
>>
>>
>> On 10/1/16 2:01 AM, Wail Alkowaileet wrote:
>>
>>> Hi,
>>>
>>> I know that early projections will enhance the performance.
>>> I just noticed something:
>>>
>>> 1- returning the whole tuple
>>> count( for $x in dataset Tweets
>>> return $x
>>> )
>>>
>>> => Throws a "Java heap exceeded" exception. (The heap size is less than the
>>> sum of AsterixDB configured memory ... so it's not a problem).
>>>
>>> 2- However, returning one field
>>> count( for $x in dataset Tweets
>>> return $x.id
>>> )
>>>
>>> => Worked just fine.
>>>
>>> I'm just wondering: does the projection in count() affect its
>>> performance?
>>>
>>
>>
>
