asterixdb-dev mailing list archives

From Yingyi Bu <buyin...@gmail.com>
Subject Re: Creating aggregate functions
Date Sun, 23 Jul 2017 17:29:24 GMT
>> I see AVG, LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG.

AVG:  that's the local function in the local plan.
LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG:   think about distributed
computation of average.  LOCAL_AVG aggregates the sum/count at the local
data source, INTERMEDIATE_AVG aggregates the sum/count over partially
aggregated sums/counts, and GLOBAL_AVG computes the final average value
from intermediate sums/counts.
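
To make the three phases concrete, here is a minimal Python sketch of the idea. It is not AsterixDB code: the names Partial, local_avg, intermediate_avg and global_avg are made up for illustration, and returning None for empty input is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Partial:
    """Hypothetical partial aggregate state: a running sum and count."""
    total: float
    count: int


def local_avg(values: Iterable[float]) -> Partial:
    """Local step: fold raw values at one data source into (sum, count)."""
    total, count = 0.0, 0
    for v in values:
        total += v
        count += 1
    return Partial(total, count)


def intermediate_avg(partials: Iterable[Partial]) -> Partial:
    """Intermediate step: merge partial (sum, count) pairs into one pair."""
    total, count = 0.0, 0
    for p in partials:
        total += p.total
        count += p.count
    return Partial(total, count)


def global_avg(partials: Iterable[Partial]) -> Optional[float]:
    """Global step: merge the remaining partials and divide sum by count."""
    merged = intermediate_avg(partials)
    # Assumption of this sketch: an empty input yields None.
    return merged.total / merged.count if merged.count else None


# Three partitions, each aggregated locally first.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
local_states = [local_avg(p) for p in partitions]

# Two intermediate merges over the partially aggregated sums/counts.
intermediate_states = [
    intermediate_avg(local_states[:2]),
    intermediate_avg(local_states[2:]),
]

# The final average is computed only once, from the intermediate states.
result = global_avg(intermediate_states)
print(result)
```

The point of carrying (sum, count) instead of a per-partition average is that partitions can have different sizes, so averaging the averages would give the wrong answer.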

Best,
Yingyi


On Sat, Jul 22, 2017 at 9:43 PM, Riyafa Abdul Hameed <
riyafa.12@cse.mrt.ac.lk> wrote:

> Hi,
>
> Thanks for the explanation.
> But there are still many things I don't understand. One of them is that,
> for the avg function itself, there are several FunctionIdentifiers. What do
> they all mean?
>
> I see AVG, LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG.
>
> What do they all mean?
> Please help
>
> On 19 July 2017 at 21:56, Yingyi Bu <buyingyi@gmail.com> wrote:
>
> > Hi Riyafa,
> >
> >    >> ScalarCountAggregateDescriptor
> >   It's used for counting a scalar array that appears inside a tuple.
> >   For example:
> >   SELECT u.id, array_count(u.friends)
> >   FROM users u;
> >
> >    >> SerializableCountAggregateDescriptor
> >    Serialized aggregation descriptor implementations are only used in
> > hash-based group-by.
> >    For example:
> >    SELECT u.city, count(*)
> >    FROM users u
> >    /*+ hash */
> >    GROUP BY u.city;
> >
> >   If your aggregation function doesn't have a fixed-byte-sized state, you
> > don't need to worry about that or implement that.
> >
> >    >> CountAggregateDescriptor
> >    This is used in group-by or global aggregate:
> >    For example:
> >    SELECT u.city, count(*)
> >    FROM users u
> >    GROUP BY u.city;
> >
> >    SELECT count(*) FROM users;
> >
> >
> > Best,
> > Yingyi
> >
> >
> > On Wed, Jul 19, 2017 at 7:55 AM, Riyafa Abdul Hameed <riyafa@apache.org>
> > wrote:
> >
> > > Hi again,
> > >
> > > Any suggestions on this? Or is there anyone I can reach out to who is
> > > not on this list or not active on the list?
> > >
> > > Thank you.
> > >
> > > On 17 July 2017 at 17:18, Riyafa Abdul Hameed <riyafa@apache.org> wrote:
> > >
> > > > Hi again,
> > > >
> > > > I think I can understand how to write the descriptor in the packages:
> > > > org.apache.asterix.runtime.aggregates.std and
> > > > org.apache.asterix.runtime.aggregates.scalar.
> > > > But I am not sure I understand how to write the descriptor in the
> > > > package org.apache.asterix.runtime.aggregates.serializable.std, because
> > > > it requires setting a state in the init function that doesn't seem to
> > > > have a pattern in the other descriptors.
> > > > Also, I don't seem to understand the reasons for implementing each of
> > > > these descriptors for the aggregate functions.
> > > >
> > > > On 17 July 2017 at 16:56, Riyafa Abdul Hameed
> > > > <riyafa.12@cse.mrt.ac.lk> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > > >> I meant any explanation on the implementation of aggregate
> > > > >> functions in AsterixDB would be highly appreciated.
> > > >>
> > > >> Thank you.
> > > >> Yours sincerely,
> > > >> Riyafa
> > > >>
> > > > >> On 16 July 2017 at 08:01, Riyafa Abdul Hameed <riyafa@apache.org>
> > > > >> wrote:
> > > >>
> > > >>> Dear all,
> > > >>>
> > > > >>> I am trying to create aggregate functions and I see there is more
> > > > >>> than one function descriptor for a single function.
> > > > >>> For example, the function array_count(collection) has the following
> > > > >>> descriptors:
> > > >>>
> > > >>>
> > > >>>    - ScalarCountAggregateDescriptor
> > > >>>    - SerializableCountAggregateDescriptor
> > > >>>    - CountAggregateDescriptor
> > > >>>
> > > > >>> I am not sure I understand the difference between each of these.
> > > > >>> Can you please provide an example or point me to a documentation
> > > > >>> entry to learn how to properly implement aggregate functions?
> > > >>>
> > > >>> The function I am trying to implement is ST_Extent.
> > > >>> <https://postgis.net/docs/manual-1.4/ST_Extent.html>
> > > >>>
> > > >>> Thank you.
> > > >>>
> > > >>> Yours sincerely,
> > > >>>
> > > >>> Riyafa
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> Riyafa Abdul Hameed
> > > >> Undergraduate, University of Moratuwa
> > > >>
> > > >> Email: riyafa.12@cse.mrt.ac.lk
> > > > >> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
> > > > >> <http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
> > > > >> <http://twitter.com/Riyafa1>
> > > >>
> > > >
> > > >
> > >
> >
>
>
>
> --
> Riyafa Abdul Hameed
> Undergraduate, University of Moratuwa
>
> Email: riyafa.12@cse.mrt.ac.lk
> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
> <http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
> <http://twitter.com/Riyafa1>
>
