asterixdb-dev mailing list archives

From Yingyi Bu <buyin...@gmail.com>
Subject Re: Creating aggregate functions
Date Sun, 23 Jul 2017 17:29:43 GMT
Sorry, a typo:

AVG:  that's the logical function in the logical plan.
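
To make the LOCAL_AVG / INTERMEDIATE_AVG / GLOBAL_AVG split from the quoted message below concrete, here is a minimal sketch in plain Python. It is not AsterixDB code: the function names local_avg, intermediate_avg and global_avg, the (sum, count) tuple used as the partial state, and returning None for empty input are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

# Partial aggregation state: (running sum, running count).
Partial = Tuple[float, int]


def local_avg(values: Iterable[float]) -> Partial:
    """LOCAL_AVG step: fold the raw values of one partition into (sum, count)."""
    total, count = 0.0, 0
    for v in values:
        total += v
        count += 1
    return total, count


def intermediate_avg(partials: Iterable[Partial]) -> Partial:
    """INTERMEDIATE_AVG step: merge partial (sum, count) pairs into one pair."""
    total, count = 0.0, 0
    for s, c in partials:
        total += s
        count += c
    return total, count


def global_avg(partials: Iterable[Partial]) -> Optional[float]:
    """GLOBAL_AVG step: merge the remaining partials and emit the average."""
    total, count = intermediate_avg(partials)
    # Returning None for empty input is an assumption of this sketch.
    return total / count if count else None


# Three hypothetical partitions of one dataset.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
local_states = [local_avg(p) for p in partitions]
# Two intermediate nodes each merge a subset of the local states.
merged = [intermediate_avg(local_states[:2]), intermediate_avg(local_states[2:])]
result = global_avg(merged)
print(result)  # 3.5
```

The point of the split is that only small (sum, count) pairs travel between stages, and the division happens exactly once at the end.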

On Sun, Jul 23, 2017 at 10:29 AM, Yingyi Bu <buyingyi@gmail.com> wrote:

> >> I see AVG, LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG.
>
> AVG:  that's the local function in the local plan.
> LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG:   think about distributed
> computation of average.  LOCAL_AVG aggregates the sum/count at the local
> data source, INTERMEDIATE_AVG aggregates the sum/count over partially
> aggregated sums/counts, and GLOBAL_AVG computes the final average value
> from intermediate sums/counts.
>
> Best,
> Yingyi
>
>
> On Sat, Jul 22, 2017 at 9:43 PM, Riyafa Abdul Hameed <
> riyafa.12@cse.mrt.ac.lk> wrote:
>
>> Hi,
>>
>> Thanks for the explanation.
>> But there are still many things I don't understand. One of them is that
>> the avg function itself has several FunctionIdentifiers. What do they
>> all mean?
>>
>> I see AVG, LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG.
>>
>> What do they all mean?
>> Please help
>>
>> On 19 July 2017 at 21:56, Yingyi Bu <buyingyi@gmail.com> wrote:
>>
>> > Hi Riyafa,
>> >
>> >    >> ScalarCountAggregateDescriptor
>> >   It's used for counting a scalar array that appears inside a tuple.
>> >   For example:
>> >   SELECT u.id, array_count(u.friends)
>> >   FROM users u;
>> >
>> >    >> SerializableCountAggregateDescriptor
>> >    Serialized aggregation descriptor implementations are only used in
>> > hash-based group-by.
>> >    For example:
>> >    SELECT u.city, count(*)
>> >    FROM users u
>> >    /*+ hash */
>> >    GROUP BY u.city;
>> >
>> >   If your aggregation function doesn't have a fixed-byte-sized state,
>> > you don't need to worry about that or implement that.
>> >
>> >    >> CountAggregateDescriptor
>> >    This is used in group-by or global aggregate:
>> >    For example:
>> >    SELECT u.city, count(*)
>> >    FROM users u
>> >    GROUP BY u.city;
>> >
>> >    SELECT count(*) FROM users;
>> >
>> >
>> > Best,
>> > Yingyi
>> >
>> >
>> > On Wed, Jul 19, 2017 at 7:55 AM, Riyafa Abdul Hameed <riyafa@apache.org>
>> > wrote:
>> >
>> > > Hi again,
>> > >
>> > > Any suggestions on this? Or is there anyone I can reach out to who is
>> > > not on this list or not active on it?
>> > >
>> > > Thank you.
>> > >
>> > > On 17 July 2017 at 17:18, Riyafa Abdul Hameed <riyafa@apache.org> wrote:
>> > >
>> > > > Hi again,
>> > > >
>> > > > I think I understand how to write the descriptors in the packages
>> > > > org.apache.asterix.runtime.aggregates.std and
>> > > > org.apache.asterix.runtime.aggregates.scalar.
>> > > > But I am not sure I understand how to write the descriptor in the
>> > > > package org.apache.asterix.runtime.aggregates.serializable.std,
>> > > > because it requires setting a state in the init function, which
>> > > > doesn't seem to follow a pattern found in the other descriptors.
>> > > > Also, I don't understand the reasons for implementing each of these
>> > > > descriptors for the aggregate functions.
>> > > >
>> > > > On 17 July 2017 at 16:56, Riyafa Abdul Hameed <riyafa.12@cse.mrt.ac.lk>
>> > > > wrote:
>> > > >
>> > > >> Hi all,
>> > > >>
>> > > >> I meant any explanation on the implementation of aggregate functions
>> > > >> in AsterixDB would be highly appreciated.
>> > > >>
>> > > >> Thank you.
>> > > >> Yours sincerely,
>> > > >> Riyafa
>> > > >>
>> > > >> On 16 July 2017 at 08:01, Riyafa Abdul Hameed <riyafa@apache.org> wrote:
>> > > >>
>> > > >>> Dear all,
>> > > >>>
>> > > >>> I am trying to create aggregate functions, and I see there is more
>> > > >>> than one function descriptor for a single function.
>> > > >>> For example, the function array_count(collection) has the following
>> > > >>> descriptors:
>> > > >>>
>> > > >>>
>> > > >>>    - ScalarCountAggregateDescriptor
>> > > >>>    - SerializableCountAggregateDescriptor
>> > > >>>    - CountAggregateDescriptor
>> > > >>>
>> > > >>> I am not sure I understand the difference between these. Can you
>> > > >>> please provide an example or point me to a documentation entry to
>> > > >>> learn how to properly implement aggregate functions?
>> > > >>>
>> > > >>> The function I am trying to implement is ST_Extent.
>> > > >>> <https://postgis.net/docs/manual-1.4/ST_Extent.html>
>> > > >>>
>> > > >>> Thank you.
>> > > >>>
>> > > >>> Yours sincerely,
>> > > >>>
>> > > >>> Riyafa
>> > > >>>
>> > > >>
>> > > >>
>> > > >>
>> > > >> --
>> > > >> Riyafa Abdul Hameed
>> > > >> Undergraduate, University of Moratuwa
>> > > >>
>> > > >> Email: riyafa.12@cse.mrt.ac.lk
>> > > >> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
>> > > >> <http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
>> > > >> <http://twitter.com/Riyafa1>
>> > > >>
>> > > >
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> Riyafa Abdul Hameed
>> Undergraduate, University of Moratuwa
>>
>> Email: riyafa.12@cse.mrt.ac.lk
>> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
>> <http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
>> <http://twitter.com/Riyafa1>
>>
>
>
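
The remark in the thread above that serializable descriptors matter only for aggregates with a fixed-byte-sized state, and are used in hash-based group-by, can be illustrated with a small sketch. This is plain Python, not AsterixDB's API: init, step, finish, hash_group_count and the 8-byte big-endian counter layout are all hypothetical names and choices. It shows one plausible reading of the idea, in which each group's state lives at a fixed-size slot inside a shared byte buffer instead of in a per-group object.

```python
import struct

# Assumed state layout for a count aggregate: one signed 64-bit integer.
STATE_FMT = ">q"
STATE_SIZE = struct.calcsize(STATE_FMT)  # 8 bytes, identical for every group


def init(buf: bytearray, offset: int) -> None:
    """Write the initial state (count = 0) into the slot at `offset`."""
    struct.pack_into(STATE_FMT, buf, offset, 0)


def step(buf: bytearray, offset: int, row: dict) -> None:
    """Read the state, count one more row, and write it back in place."""
    (count,) = struct.unpack_from(STATE_FMT, buf, offset)
    struct.pack_into(STATE_FMT, buf, offset, count + 1)


def finish(buf: bytearray, offset: int) -> int:
    """Decode the final count from the slot."""
    return struct.unpack_from(STATE_FMT, buf, offset)[0]


def hash_group_count(rows, key):
    """Hash-based group-by: one fixed-size state slot per distinct key."""
    table = bytearray()  # all group states packed back to back
    slots = {}           # group key -> byte offset of its state
    for row in rows:
        k = row[key]
        if k not in slots:
            slots[k] = len(table)
            table.extend(bytes(STATE_SIZE))  # reserve a new slot
            init(table, slots[k])
        step(table, slots[k], row)
    return {k: finish(table, off) for k, off in slots.items()}


# Hypothetical rows standing in for the users dataset from the queries above.
users = [
    {"id": 1, "city": "Irvine"},
    {"id": 2, "city": "Colombo"},
    {"id": 3, "city": "Irvine"},
]
counts = hash_group_count(users, "city")
print(counts)  # {'Irvine': 2, 'Colombo': 1}
```

Because every slot has the same size, a state can be updated in place without moving its neighbours. An aggregate whose state grows with the input would not fit this layout, which is consistent with the advice above that such functions can skip the serializable variant.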
