From: Afflictedd2
To: core-user@hadoop.apache.org
Date: Sun, 10 Apr 2011 08:29:27 -0700 (PDT)
Subject: Filtering while mapping or reducing?
Message-ID: <31364272.post@talk.nabble.com>

Hi everyone,

I need to add up the frequencies of elements that share the same last letter, but I don't want single (unpaired) elements to appear in the output of this MapReduce job. I'm not sure whether those elements should be filtered out in the map function or in the reduce function.

For example, if I have the following pairs:

{"f2B"=>4, "f1A"=>3, "f2C"=>5, "f1B"=>4, "f2D"=>7, "f1C"=>5, "f2E"=>8, "f1D"=>7, "f1F"=>8, "f2A"=>3}

I want the following pairs out of the map:

{"B"=>4, "A"=>3, "C"=>5, "B"=>4, "D"=>7, "C"=>5, "E"=>8, "D"=>7, "F"=>8, "A"=>3}

but I don't want E and F in my final result of reduced pairs:

{"B"=>8, "A"=>6, "C"=>10, "D"=>14}
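
To make the example concrete, here is a minimal sketch of the job as an in-memory simulation in plain Python, not actual Hadoop code. The function names (map_pair, shuffle, reduce_group, run) are hypothetical and chosen only for illustration. It assumes the filter goes in the reduce step, since a mapper handles one record at a time and cannot know whether another element with the same last letter exists, whereas the reducer receives every value for a key.

```python
from collections import defaultdict

def map_pair(key, count):
    # Emit (last letter, count). The mapper sees a single record,
    # so it cannot tell whether this letter will end up unpaired.
    return key[-1], count

def shuffle(pairs):
    # Stand-in for the framework's shuffle: group values by key.
    groups = defaultdict(list)
    for letter, count in pairs:
        groups[letter].append(count)
    return groups

def reduce_group(letter, counts):
    # Filter here: skip letters contributed by only one element.
    if len(counts) < 2:
        return None
    return letter, sum(counts)

def run(records):
    mapped = [map_pair(k, v) for k, v in records.items()]
    result = {}
    for letter, counts in shuffle(mapped).items():
        out = reduce_group(letter, counts)
        if out is not None:
            result[out[0]] = out[1]
    return result

records = {"f2B": 4, "f1A": 3, "f2C": 5, "f1B": 4, "f2D": 7,
           "f1C": 5, "f2E": 8, "f1D": 7, "f1F": 8, "f2A": 3}
result = run(records)
print(result)  # {'B': 8, 'A': 6, 'C': 10, 'D': 14}
```

One caveat with this sketch: it counts values in the reducer, which only works if no combiner pre-sums them. If a combiner were added, the number of contributing elements would have to be carried alongside the sum.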