Mailing-List: contact hive-dev-help@hadoop.apache.org; run by ezmlm
Message-ID: <353463620.1248850095201.JavaMail.jira@brutus>
Date: Tue, 28 Jul 2009 23:48:15 -0700 (PDT)
From: "Min Zhou (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: hive-dev@hadoop.apache.org
Subject: [jira] Commented: (HIVE-607) Create statistical UDFs.
In-Reply-To: <1995427556.1246581467171.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HIVE-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736473#action_12736473 ]

Min Zhou commented on HIVE-607:
-------------------------------

@Namit
I implemented group_cat() in a rush and ran into some problems that are difficult to solve:
1. group_cat() has an internal ORDER BY clause, and Hive currently cannot do that kind of aggregation.
2. When the string to be group-concatenated is too large (in other words, when there is data skew), there is often not enough memory to store such a big string.

> Create statistical UDFs.
> ------------------------
>
>                 Key: HIVE-607
>                 URL: https://issues.apache.org/jira/browse/HIVE-607
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: S. Alex Smith
>            Assignee: Emil Ibrishimov
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: HIVE-607.1.patch, UDAFStddev.java
>
>
> Create UDFs replicating:
> STD()                  Return the population standard deviation
> STDDEV_POP()(v5.0.3)   Return the population standard deviation
> STDDEV_SAMP()(v5.0.3)  Return the sample standard deviation
> STDDEV()               Return the population standard deviation
> SUM()                  Return the sum
> VAR_POP()(v5.0.3)      Return the population standard variance
> VAR_SAMP()(v5.0.3)     Return the sample variance
> VARIANCE()(v4.1)       Return the population standard variance
> as found at http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
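
For readers following the issue, the variance and standard deviation aggregates it lists (VAR_POP, VAR_SAMP, STDDEV_POP, STDDEV_SAMP) can be computed from a small partial state that is updated per row and merged across tasks. The Java sketch below illustrates that idea only. It is not the code in HIVE-607.1.patch or UDAFStddev.java and does not use Hive's UDAF interfaces; the class name RunningVariance and all of its methods are hypothetical, and returning NaN where MySQL would return NULL is an assumption made to keep the example short.

```java
// Hypothetical sketch of a mergeable variance state (Welford's update plus
// the pairwise merge formula). Not taken from the HIVE-607 attachments.
final class RunningVariance {
    private long count;   // number of values folded in so far
    private double mean;  // running mean of those values
    private double m2;    // running sum of squared deviations from the mean

    /** Folds one input value into the state. */
    public void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    /** Folds another partial state into this one, e.g. one built by a different task. */
    public void merge(RunningVariance other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = count + other.count;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) count * other.count / total);
        mean += delta * other.count / total;
        count = total;
    }

    /** Population variance; NaN on empty input (assumption, MySQL returns NULL). */
    public double varPop() {
        return count == 0 ? Double.NaN : m2 / count;
    }

    /** Sample variance; NaN when fewer than two values were seen (assumption). */
    public double varSamp() {
        return count < 2 ? Double.NaN : m2 / (count - 1);
    }

    public double stddevPop() {
        return Math.sqrt(varPop());
    }

    public double stddevSamp() {
        return Math.sqrt(varSamp());
    }

    public static void main(String[] args) {
        // Two partial states, as if built by two separate tasks.
        RunningVariance left = new RunningVariance();
        for (double x : new double[] {2, 4, 4, 4}) {
            left.add(x);
        }
        RunningVariance right = new RunningVariance();
        for (double x : new double[] {5, 5, 7, 9}) {
            right.add(x);
        }
        left.merge(right);
        System.out.println("VAR_POP     = " + left.varPop());
        System.out.println("VAR_SAMP    = " + left.varSamp());
        System.out.println("STDDEV_POP  = " + left.stddevPop());
        System.out.println("STDDEV_SAMP = " + left.stddevSamp());
    }
}
```

For the eight values in main, the mean is 5 and the squared deviations sum to 32, so the population variance should come out at about 4 and the sample variance at about 32/7, up to floating-point rounding. The state is three numbers regardless of group size, which is the opposite of the group_cat() situation described in the comment, where the partial result grows with the group.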