From: Harsh J
Date: Mon, 13 Dec 2010 21:56:48 +0530
Subject: Re: How can i realize the “count(distinct )” function in hive ?
To: common-user@hadoop.apache.org

You don't really need to store all incoming keys. If the input arrives
sorted, you can compare each value with the one before it and increment
the count only when the value changes. If you do this on the reduce side,
the input already arrives sorted and grouped by key, so a non-distinct key
simply shows up as more than one value within a single reduce call. All you
need to do is count the reduce calls, since the grouping does the rest.

Just a suggestion to avoid possible memory issues. Please correct me if I
am wrong.

On Mon, Dec 13, 2010 at 5:36 PM, 1983 ddi wrote:
> by I am confused about how can I write the UDAF class, is there anybody
> who can give me a favor and thanks a lot if there is an example .

About UDFs, read this developer article at Bizo that covers it well enough:
http://dev.bizo.com/2009/06/custom-udfs-and-hive.html

-- 
Harsh J
www.harshj.com
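The sorted-input counting idea from the message can be sketched in a few lines. This is only an illustration in plain Python, not Hive or Hadoop code: the function names count_distinct_sorted and count_distinct_grouped are made up for this example, and the input is assumed to be already sorted, as it would be when it reaches a reducer.

```python
from itertools import groupby


def count_distinct_sorted(sorted_keys):
    """Count distinct keys in sorted input, remembering only the previous key."""
    count = 0
    seen_any = False
    previous = None
    for key in sorted_keys:
        # A new distinct key starts whenever the current key differs
        # from the previous one (or when this is the very first key).
        if not seen_any or key != previous:
            count += 1
            previous = key
            seen_any = True
    return count


def count_distinct_grouped(sorted_keys):
    """Same result, modeled on the reduce side: one group per distinct key.

    groupby() stands in for the framework's grouping step, so counting
    the groups is analogous to counting reduce calls.
    """
    return sum(1 for _key, _values in groupby(sorted_keys))


if __name__ == "__main__":
    keys = sorted(["b", "a", "c", "a", "c", "c"])
    print(count_distinct_sorted(keys))
    print(count_distinct_grouped(keys))
```

Either version keeps only a constant amount of state no matter how many keys pass through, which is the memory saving the suggestion is after. Note that with more than one reducer each one sees only its own partition of the keys, so the per-reducer counts would still have to be summed.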