asterixdb-dev mailing list archives

From Wail Alkowaileet <wael....@gmail.com>
Subject Re: Does Projection affect count() performance?
Date Mon, 03 Oct 2016 09:23:41 GMT
It's fine...
I will reproduce the problem, take a YourKit memory snapshot, and post it.

Thanks!



On Mon, Oct 3, 2016 at 11:47 AM, Khurram Faraaz <khfaraaz82@gmail.com> wrote:

> Wail - I went ahead and filed ASTERIXDB-1670
> <https://issues.apache.org/jira/browse/ASTERIXDB-1670> for you. I tried to
> change the reporter to your name, but I don't have permission to edit the
> reporter field.
>
> Thanks,
> Khurram
>
> On Sat, Oct 1, 2016 at 10:44 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
>
> > PS, if you still have the OOM instance, can you take a YourKit memory
> > profile?
> > Thanks!
> >
> > Best,
> > Yingyi
> >
> > On Sat, Oct 1, 2016 at 9:43 AM, Yingyi Bu <buyingyi@gmail.com> wrote:
> >
> > > Wail,
> > >
> > >       Can you attach the query plan for query 1?
> > >       I tried
> > >        count( for $x in dataset beers
> > >          return $x
> > >        )
> > >
> > >       and got the following plan, which seems OK:
> > > -- DISTRIBUTE_RESULT  |UNPARTITIONED|
> > >   exchange
> > >   -- ONE_TO_ONE_EXCHANGE  |UNPARTITIONED|
> > >     aggregate [$$5] <- [function-call: asterix:agg-sum, Args:[%0->$$8]]
> > >     -- AGGREGATE  |UNPARTITIONED|
> > >       exchange
> > >       -- RANDOM_MERGE_EXCHANGE  |PARTITIONED|
> > > >         aggregate [$$8] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
> > >         -- AGGREGATE  |PARTITIONED|
> > >           project ([$$0])
> > >           -- STREAM_PROJECT  |PARTITIONED|
> > >             exchange
> > >             -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> > >               data-scan []<-[$$6, $$0, $$7] <- Default:beers
> > >               -- DATASOURCE_SCAN  |PARTITIONED|
> > >                 exchange
> > >                 -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> > >                   empty-tuple-source
> > >                   -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
> > >
> > >
> > > Best,
> > > Yingyi
> > >
> > > On Sat, Oct 1, 2016 at 9:25 AM, Mike Carey <dtabass@gmail.com> wrote:
> > >
> > >> Sounds like there is a new materialization bug there.....  Please file a
> > >> JIRA issue (and we'll need a query plan test case to keep it from
> > >> breaking again).
> > >>
> > >>
> > >>
> > >> On 10/1/16 2:01 AM, Wail Alkowaileet wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I know that early projections will enhance the performance.
> > >>> I just noticed something:
> > >>>
> > >>> 1- returning the whole tuple
> > >>> count( for $x in dataset Tweets
> > >>> return $x
> > >>> )
> > >>>
> > >>> => Throws an exception Java heap exceeded. (The heap-size is less than
> > >>> the sum of AsterixDB configured memory ... so it's not a problem).
> > >>>
> > >>> 2- However, returning one field
> > >>> count( for $x in dataset Tweets
> > >>> return $x.id
> > >>> )
> > >>>
> > >>> => Worked just fine.
> > >>>
> > >>> I'm just wondering, does the projection in count() affect its
> > >>> performance?
> > >>>
> > >>
> > >>
> > >
> >
>



-- 

*Regards,*
Wail Alkowaileet
