asterixdb-dev mailing list archives

From Riyafa Abdul Hameed <riyafa...@cse.mrt.ac.lk>
Subject Re: Creating aggregate functions
Date Mon, 24 Jul 2017 14:57:08 GMT
Hi,

Is the creation of aggregate functions in AsterixDB based on some
programming model like MapReduce? If so, can you please suggest links for
learning it so that I can understand it better? I still do not get the
overall picture of how aggregate functions are created (perhaps because
creating normal functions is pretty straightforward as far as I am
concerned).

I started on the implementation here[1] and am stuck there. I will try
again and update this commit.

[1]
https://github.com/riyafa/asterixdb/commit/dc437ddcc0ac175b20120047facca337e431fa92

On 23 July 2017 at 22:59, Yingyi Bu <buyingyi@gmail.com> wrote:

> Sorry, a typo:
>
> AVG:  that's the logical function in the logical plan.
>
> On Sun, Jul 23, 2017 at 10:29 AM, Yingyi Bu <buyingyi@gmail.com> wrote:
>
> > >> I see AVG, LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG.
> >
> > AVG:  that's the local function in the local plan.
> > LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG:   think about distributed
> > computation of average.  LOCAL_AVG aggregates the sum/count at the local
> > data source, INTERMEDIATE_AVG aggregates the sum/count over partially
> > aggregated sums/counts, and GLOBAL_AVG computes the final average value
> > from intermediate sums/counts.
> >
> > Best,
> > Yingyi
> >
> >
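
The LOCAL_AVG / INTERMEDIATE_AVG / GLOBAL_AVG split described above can be tried out in isolation. What follows is a minimal Python sketch, not AsterixDB code: the names Partial, local_avg, intermediate_avg and global_avg are hypothetical stand-ins for the three function identifiers, and the partition data is made up.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Partial:
    """Partially aggregated state: a running sum and a running count."""
    total: float
    count: int


def local_avg(values: Iterable[float]) -> Partial:
    """Aggregate raw values at one data source into a sum/count pair."""
    total, count = 0.0, 0
    for v in values:
        total += v
        count += 1
    return Partial(total, count)


def intermediate_avg(partials: Iterable[Partial]) -> Partial:
    """Combine partial sums/counts into another partial sum/count."""
    total, count = 0.0, 0
    for p in partials:
        total += p.total
        count += p.count
    return Partial(total, count)


def global_avg(partials: Iterable[Partial]) -> Optional[float]:
    """Compute the final average from partial sums/counts."""
    merged = intermediate_avg(partials)
    return merged.total / merged.count if merged.count else None


# Three hypothetical partitions of the same dataset.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
local_states = [local_avg(p) for p in partitions]

# Two of the local states are pre-merged before the final step.
pre_merged = intermediate_avg(local_states[:2])
result = global_avg([pre_merged, local_states[2]])
print(result)
```

The point of the sketch is that only the last step divides. The local and intermediate steps both emit a sum/count pair, so they can be chained any number of times before the global step.
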
> > On Sat, Jul 22, 2017 at 9:43 PM, Riyafa Abdul Hameed <
> > riyafa.12@cse.mrt.ac.lk> wrote:
> >
> >> Hi,
> >>
> >> Thanks for the explanation.
> >> But there are so many things I still don't understand. One of them is
> >> that the avg function itself has several FunctionIdentifiers. What do
> >> they all mean?
> >>
> >> I see AVG, LOCAL_AVG, INTERMEDIATE_AVG and GLOBAL_AVG.
> >>
> >> What do they all mean?
> >> Please help
> >>
> >> On 19 July 2017 at 21:56, Yingyi Bu <buyingyi@gmail.com> wrote:
> >>
> >> > Hi Riyafa,
> >> >
> >> >    >> ScalarCountAggregateDescriptor
> >> >   It's used for counting a scalar array that appears inside a tuple.
> >> >   For example:
> >> >   SELECT u.id, array_count(u.friends)
> >> >   FROM users u;
> >> >
> >> >    >> SerializableCountAggregateDescriptor
> >> >    Serialized aggregation descriptor implementations are only used in
> >> > hash-based group-by.
> >> >    For example:
> >> >    SELECT u.city, count(*)
> >> >    FROM users u
> >> >    /*+ hash */
> >> >    GROUP BY u.city;
> >> >
> >> >   If your aggregation function doesn't have a fixed-byte-sized state,
> >> > you don't need to worry about that or implement that.
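
One way to picture why a fixed-byte-sized state matters for the hash-based group-by case above: if every group's state has the same size, all states can be packed into one byte buffer and updated in place. The Python sketch below illustrates that idea only. It is an assumption about the motivation, not a description of AsterixDB's serializable descriptors, and HashGroupCount and COUNT_STATE are hypothetical names.

```python
import struct

# Assumption: each group's state is one signed 64-bit counter (8 bytes).
COUNT_STATE = struct.Struct("<q")


class HashGroupCount:
    """Hash group-by COUNT whose per-group states live in one byte buffer."""

    def __init__(self):
        self._states = bytearray()  # all group states, back to back
        self._offsets = {}          # group key -> byte offset of its state

    def step(self, key):
        offset = self._offsets.get(key)
        if offset is None:
            # init: reserve a fixed-size slot and write the initial state
            offset = len(self._states)
            self._offsets[key] = offset
            self._states.extend(COUNT_STATE.pack(0))
        # step: read the state, update it, and write it back in place
        (count,) = COUNT_STATE.unpack_from(self._states, offset)
        COUNT_STATE.pack_into(self._states, offset, count + 1)

    def finish(self):
        return {
            key: COUNT_STATE.unpack_from(self._states, offset)[0]
            for key, offset in self._offsets.items()
        }

    def state_bytes(self):
        return len(self._states)


agg = HashGroupCount()
for city in ["Irvine", "Colombo", "Irvine", "Riverside", "Irvine"]:
    agg.step(city)
counts = agg.finish()
print(counts, agg.state_bytes())
```

A state whose size grows with the input (for example, a collected list) cannot be updated in place like this, which is consistent with the advice above that such functions can skip the serializable variant.
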
> >> >
> >> >    >> CountAggregateDescriptor
> >> >    This is used in group-by or global aggregate:
> >> >    For example:
> >> >    SELECT u.city, count(*)
> >> >    FROM users u
> >> >    GROUP BY u.city;
> >> >
> >> >    SELECT count(*) FROM users;
> >> >
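
The difference between the scalar use and the group-by/global use described above can be sketched with a single counting routine applied at different levels. This is a conceptual Python illustration with made-up data. The helper names (count_init, count_step, count_finish, run_count) are hypothetical and do not correspond to AsterixDB classes.

```python
from collections import defaultdict


def count_init():
    return 0


def count_step(state, _item):
    return state + 1


def count_finish(state):
    return state


def run_count(items):
    """Drive init/step/finish over any iterable of items."""
    state = count_init()
    for item in items:
        state = count_step(state, item)
    return count_finish(state)


users = [
    {"id": 1, "city": "Irvine", "friends": [2, 3]},
    {"id": 2, "city": "Irvine", "friends": [1]},
    {"id": 3, "city": "Colombo", "friends": []},
]

# Scalar use: count the array inside each tuple, one result per tuple.
scalar_counts = [(u["id"], run_count(u["friends"])) for u in users]

# Group-by use: count the tuples in each group, one result per group.
groups = defaultdict(list)
for u in users:
    groups[u["city"]].append(u)
group_counts = {city: run_count(members) for city, members in groups.items()}

# Global use: count all tuples, one result overall.
total = run_count(users)

print(scalar_counts, group_counts, total)
```
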
> >> >
> >> > Best,
> >> > Yingyi
> >> >
> >> >
> >> > On Wed, Jul 19, 2017 at 7:55 AM, Riyafa Abdul Hameed <riyafa@apache.org>
> >> > wrote:
> >> >
> >> > > Hi again,
> >> > >
> >> > > Any suggestions on this? Or is there anyone I can reach out to who is
> >> > > not on this list or not active on it?
> >> > >
> >> > > Thank you.
> >> > >
> >> > > On 17 July 2017 at 17:18, Riyafa Abdul Hameed <riyafa@apache.org>
> >> > > wrote:
> >> > >
> >> > > > Hi again,
> >> > > >
> >> > > > I think I understand how to write the descriptors in the packages
> >> > > > org.apache.asterix.runtime.aggregates.std and
> >> > > > org.apache.asterix.runtime.aggregates.scalar.
> >> > > > But I am not sure I understand how to write the descriptor in the
> >> > > > package org.apache.asterix.runtime.aggregates.serializable.std,
> >> > > > because it requires setting a state in the init function that
> >> > > > doesn't seem to follow a pattern found in the other descriptors.
> >> > > > Also, I don't understand the reasons for implementing each of
> >> > > > these descriptors for the aggregate functions.
> >> > > >
> >> > > > On 17 July 2017 at 16:56, Riyafa Abdul Hameed <riyafa.12@cse.mrt.ac.lk>
> >> > > > wrote:
> >> > > >
> >> > > >> Hi all,
> >> > > >>
> >> > > >> I meant any explanation on the implementation of aggregate
> >> > > >> functions in AsterixDB would be highly appreciated.
> >> > > >>
> >> > > >> Thank you.
> >> > > >> Yours sincerely,
> >> > > >> Riyafa
> >> > > >>
> >> > > >> On 16 July 2017 at 08:01, Riyafa Abdul Hameed <riyafa@apache.org>
> >> > > >> wrote:
> >> > > >>
> >> > > >>> Dear all,
> >> > > >>>
> >> > > >>> I am trying to create aggregate functions, and I see that there
> >> > > >>> is more than one function descriptor for a single function.
> >> > > >>> For example, the function array_count(collection) has the
> >> > > >>> following descriptors:
> >> > > >>>
> >> > > >>>
> >> > > >>>    - ScalarCountAggregateDescriptor
> >> > > >>>    - SerializableCountAggregateDescriptor
> >> > > >>>    - CountAggregateDescriptor
> >> > > >>>
> >> > > >>> I am not sure I understand the difference between each of these.
> >> > > >>> Can you please provide an example or point me to a documentation
> >> > > >>> entry to learn how to properly implement aggregate functions?
> >> > > >>>
> >> > > >>> The function I am trying to implement is ST_Extent.
> >> > > >>> <https://postgis.net/docs/manual-1.4/ST_Extent.html>
> >> > > >>>
> >> > > >>> Thank you.
> >> > > >>>
> >> > > >>> Yours sincerely,
> >> > > >>>
> >> > > >>> Riyafa
> >> > > >>>
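
ST_Extent, as documented for PostGIS, returns the bounding box of a set of geometries. A bounding-box aggregate fits the same local/global shape that Yingyi describes for AVG elsewhere in this thread, because merging two boxes yields another box. The Python sketch below is restricted to 2D points and is only an illustration. extent_local and extent_merge are hypothetical names, and nothing here is AsterixDB or PostGIS code.

```python
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def extent_local(points: Iterable[Point]) -> Optional[BBox]:
    """Bounding box of the points in one partition (None if it is empty)."""
    box = None
    for x, y in points:
        if box is None:
            box = (x, y, x, y)
        else:
            box = (min(box[0], x), min(box[1], y),
                   max(box[2], x), max(box[3], y))
    return box


def extent_merge(boxes: Iterable[Optional[BBox]]) -> Optional[BBox]:
    """Merge partial bounding boxes; empty partitions are skipped."""
    merged = None
    for box in boxes:
        if box is None:
            continue
        if merged is None:
            merged = box
        else:
            merged = (min(merged[0], box[0]), min(merged[1], box[1]),
                      max(merged[2], box[2]), max(merged[3], box[3]))
    return merged


# Three hypothetical partitions, one of them empty.
partitions = [[(0.0, 1.0), (2.0, -1.0)], [], [(5.0, 3.0)]]
extent = extent_merge(extent_local(p) for p in partitions)
print(extent)
```

Under these assumptions the state is four numbers, so the same merge function can serve as both the intermediate and the global step.
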
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >> --
> >> > > >> Riyafa Abdul Hameed
> >> > > >> Undergraduate, University of Moratuwa
> >> > > >>
> >> > > >> Email: riyafa.12@cse.mrt.ac.lk
> >> > > >> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
> >> > > >> <http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
> >> > > >> <http://twitter.com/Riyafa1>
> >> > > >>
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >>
> >>
> >>
> >
> >
>



-- 
Riyafa Abdul Hameed
Undergraduate, University of Moratuwa

Email: riyafa.12@cse.mrt.ac.lk
Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
<http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
<http://twitter.com/Riyafa1>
