Mailing-List: contact hive-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hive-dev@hadoop.apache.org
Message-ID: <4974307.16571274727986105.JavaMail.jira@thor>
Date: Mon, 24 May 2010 15:06:26 -0400 (EDT)
From: "Namit Jain (JIRA)" <jira@apache.org>
To: hive-dev@hadoop.apache.org
Subject: [jira] Commented: (HIVE-287) count distinct on multiple columns
 does not work
In-Reply-To: <886414654.1234467182236.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870789#action_12870789 ] 

Namit Jain commented on HIVE-287:
---------------------------------

Overall looks good, some minor comments:

1. This should be independent of COUNT, so basically all aggregation functions should be supported with DISTINCT.
    For example: select avg(distinct c1, c2) from T

 and so on.

2. It would be a good idea to maintain compatibility with the existing interface. Can we add another method to UDAFResolver that
    exposes the new API, plus a common class that invokes the default implementation? That would be better.

3. Following from 1, more tests are needed.
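
To make point 2 concrete, here is a rough sketch of the shape I have in mind. All names below (LegacyResolver, ExtendedResolver, AbstractResolver, and the String-based "evaluator") are made up for illustration and are not the actual Hive interfaces or signatures; the patch would use the real resolver and evaluator types instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the existing resolver interface.
// The String return value is a placeholder for a real evaluator object.
interface LegacyResolver {
    String getEvaluator(List<String> argTypes);
}

// Hypothetical extended interface: one extra method that also carries the DISTINCT flag.
interface ExtendedResolver extends LegacyResolver {
    String getEvaluator(List<String> argTypes, boolean isDistinct);
}

// Common base class: by default the new method ignores the flag and
// falls back to the old one, so existing resolvers keep working unchanged.
abstract class AbstractResolver implements ExtendedResolver {
    @Override
    public String getEvaluator(List<String> argTypes, boolean isDistinct) {
        return getEvaluator(argTypes);
    }
}

// An existing aggregation that only knows the old API (assumed example).
class SumResolver extends AbstractResolver {
    @Override
    public String getEvaluator(List<String> argTypes) {
        return "sum(" + String.join(",", argTypes) + ")";
    }
}

// An aggregation that opts in to the new API to see the DISTINCT flag (assumed example).
class CountResolver extends AbstractResolver {
    @Override
    public String getEvaluator(List<String> argTypes) {
        return getEvaluator(argTypes, false);
    }

    @Override
    public String getEvaluator(List<String> argTypes, boolean isDistinct) {
        String prefix = isDistinct ? "distinct " : "";
        return "count(" + prefix + String.join(",", argTypes) + ")";
    }
}

class ResolverDemo {
    public static void main(String[] args) {
        List<String> cols = Arrays.asList("int", "string");
        // Old-style resolver: the new entry point falls back to the legacy method.
        System.out.println(new SumResolver().getEvaluator(cols, true));
        // New-style resolver: the DISTINCT flag reaches the implementation.
        System.out.println(new CountResolver().getEvaluator(cols, true));
    }
}
```

With this arrangement the caller can always use the two-argument method, and only the resolvers that care about DISTINCT need to override it. Whether the default should silently ignore the flag, as in this sketch, or reject DISTINCT for resolvers that have not opted in is a separate decision.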

> count distinct on multiple columns does not work
> ------------------------------------------------
>
>                 Key: HIVE-287
>                 URL: https://issues.apache.org/jira/browse/HIVE-287
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Arvind Prabhakar
>             Fix For: 0.6.0
>
>         Attachments: HIVE-287-1.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl
