phoenix-dev mailing list archives

From Swapna Swapna <talktoswa...@gmail.com>
Subject Re: Question on aggregate function
Date Wed, 04 May 2016 07:00:53 GMT
Sure, James. I will take a look at the process.

Regards
Swapna

On Tue, May 3, 2016 at 11:39 PM, James Taylor <jamestaylor@apache.org>
wrote:

> Hi Swapna,
> All our aggregate functions allow expressions as arguments and it wouldn't
> make sense to have these new ones be different. A reference to a column is
> also an expression. It doesn't change the HBase data model being sparse.
>
> I think the next step should be for you to submit a patch that the
> community can take a look at, as it's too difficult to discuss this without
> that.
>
> Thanks,
> James
>
> On Tuesday, May 3, 2016, Swapna Swapna <talktoswapna@gmail.com> wrote:
>
> > Hi James,
> >
> > Thanks for your swift response.
> >
> > I wouldn't be able to use an expression in the query below; rather, I would
> > have to provide the columns I'm interested in as arguments, so that the
> > aggregation is performed on each of the provided columns.
> >
> > myaggFunc(col1,col2)
> >
> > The reason is that the HBase data is sparse and I would not know the column
> > values. Data is fetched by row key.
> >
> > expression example:
> >
> > ID=1 OR NAME='Hi'
> >
> > Regards
> >
> > Swapna
> >
> >
> >
> > On Tue, May 3, 2016 at 7:17 PM, James Taylor <jamestaylor@apache.org> wrote:
> >
> > > Hi Swapna,
> > > The return type is typically derived from looking at the return types of
> > > each of the input arguments and choosing what'll work without losing
> > > precision. For example, take a look at this loop in ExpressionCompiler
> > > that determines this for expressions that are added together:
> > >
> > >         new ArithmeticExpressionFactory() {
> > >             @Override
> > >             public Expression create(ArithmeticParseNode node, List<Expression> children) throws SQLException {
> > >                 boolean foundDate = false;
> > >                 Determinism determinism = Determinism.ALWAYS;
> > >                 PDataType theType = null;
> > >                 for(int i = 0; i < children.size(); i++) {
> > >
> > > You're probably already doing this, but make sure you don't assume the
> > > arguments are column references; allow them to be any expression.
> > >
> > > Also, it'd be great to see what you've got so far, without handling
> > > multiple arguments to your function, in the form of a pull request so
> > > folks can give you feedback on your work so far.
> > >
> > > Thanks, and we appreciate the contributions!
> > >
> > > James
> > >
> > > On Tue, May 3, 2016 at 12:59 PM, Swapna Swapna <talktoswapna@gmail.com>
> > > wrote:
> > >
> > > > Sure,
> > > >
> > > > The HBase data that I have is:
> > > >
> > > > rowkey                us         uk
> > > > 20161001           3            4
> > > > 20161002           1            2
> > > >
> > > >
> > > > select myaggFunc(us) from table;    // returns 4
> > > > select myaggFunc(uk) from table;    // returns 6
> > > >
> > > > Similarly, I'm envisioning a query with multiple columns, like:
> > > > select myaggFunc1(us,uk) from table;
> > > >
> > > > Expected output (this depends on the aggregation logic; the results
> > > > below are for a sum aggregation):
> > > > us   4
> > > > uk   6
> > > >
> > > >
> > > >
> > > > On Tue, May 3, 2016 at 11:33 AM, James Taylor <jamestaylor@apache.org>
> > > > wrote:
> > > >
> > > > > Removing user list (please don't cross post)
> > > > >
> > > > > Can you give us a full example of the query you have in mind?
> > > > >
> > > > > Thanks,
> > > > > James
> > > > >
> > > > > On Tue, May 3, 2016 at 11:14 AM, Swapna Swapna <talktoswapna@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm trying to implement an aggregate function that takes multiple
> > > > > > columns as arguments, like:
> > > > > >
> > > > > > myaggFunc(col1,col2)
> > > > > >
> > > > > > I want to return the result for each column after applying the
> > > > > > aggregate operation.
> > > > > >
> > > > > > The output would be something like:
> > > > > >
> > > > > > col1, count ( aggregate of all records for col1)
> > > > > > col2, count
> > > > > >
> > > > > > In order to return the results in the above format, what return
> > > > > > data type should I choose for the method?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > >
> > > >
> > >
> >
>
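
The per-column result discussed in this thread can be sketched outside Phoenix. The following is a minimal, hypothetical illustration in plain Java, not Phoenix's aggregate-function API: the class MultiSumSketch and the method sumEach are invented names, rows are modeled as maps so that an absent cell in a sparse row simply reads as null, and each argument is an arbitrary expression over the row rather than a column name. It uses the us/uk sample data from the thread with a sum aggregation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: one running total per argument expression.
final class MultiSumSketch {

    // Evaluates every argument expression against every row and keeps a
    // separate sum per argument. A null result (absent cell) is skipped.
    static List<Long> sumEach(List<Map<String, Long>> rows,
                              List<Function<Map<String, Long>, Long>> exprs) {
        long[] totals = new long[exprs.size()];
        for (Map<String, Long> row : rows) {
            for (int i = 0; i < exprs.size(); i++) {
                Long value = exprs.get(i).apply(row);
                if (value != null) {
                    totals[i] += value;
                }
            }
        }
        List<Long> result = new ArrayList<>();
        for (long total : totals) {
            result.add(total);
        }
        return result;
    }

    public static void main(String[] unused) {
        // Sample rows from the thread: row keys 20161001 and 20161002.
        Map<String, Long> first = new HashMap<>();
        first.put("us", 3L);
        first.put("uk", 4L);
        Map<String, Long> second = new HashMap<>();
        second.put("us", 1L);
        second.put("uk", 2L);
        List<Map<String, Long>> rows = new ArrayList<>();
        rows.add(first);
        rows.add(second);

        // A plain column reference is just the simplest kind of expression.
        List<Function<Map<String, Long>, Long>> exprs = new ArrayList<>();
        exprs.add(r -> r.get("us"));
        exprs.add(r -> r.get("uk"));

        System.out.println(sumEach(rows, exprs)); // prints [4, 6]
    }
}
```

Because each argument is evaluated per row, the caller never needs to know the stored values in advance, and a row that lacks a column contributes nothing to that column's total.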
