Message-ID: <5065B1AC.7070009@ccri.com>
Date: Fri, 28 Sep 2012 10:18:20 -0400
From: John Armstrong
Reply-To: john.armstrong@ccri.com
To: user@hadoop.apache.org
Subject: Re: Usefulness of ChainMapper/ChainReducer

On Fri 28 Sep 2012 09:39:13 AM EDT, Harsh J wrote:
> Modularity!

Exactly! Write a mapper that acts as a filter on some property of your keys, then reuse it in whatever jobs you want. Your job needs to operate on data subset A? Chain it with the filter mapper that picks out A. Your next one needs to operate on subset B? Chain it with the filter that picks out B!
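
To make the shape of that concrete, here is a rough plain-Java sketch of the idea. It does not use the Hadoop classes at all: ChainSketch, Stage, keyFilter, doubleValue and runChain are names I made up for illustration, and the "A:" / "B:" key prefixes are just an assumed way of marking the two subsets. In a real job the wiring would go through ChainMapper rather than a hand-rolled loop like this one.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Plain-Java model of mapper chaining. None of these names are Hadoop classes.
class ChainSketch {

    /** A map stage: takes one (key, value) record and emits zero or more records. */
    interface Stage extends Function<Map.Entry<String, Integer>, List<Map.Entry<String, Integer>>> {}

    /** Reusable filter stage: passes a record through only if its key matches. */
    static Stage keyFilter(Predicate<String> keep) {
        return rec -> {
            List<Map.Entry<String, Integer>> out = new ArrayList<>();
            if (keep.test(rec.getKey())) {
                out.add(rec);
            }
            return out;
        };
    }

    /** Stand-in for the job-specific mapper (hypothetical): doubles each value. */
    static Stage doubleValue() {
        return rec -> {
            List<Map.Entry<String, Integer>> out = new ArrayList<>();
            out.add(new SimpleEntry<String, Integer>(rec.getKey(), rec.getValue() * 2));
            return out;
        };
    }

    /** Runs the records through the stages in order, feeding each stage's output to the next. */
    static List<Map.Entry<String, Integer>> runChain(
            List<Map.Entry<String, Integer>> input, List<Stage> stages) {
        List<Map.Entry<String, Integer>> current = input;
        for (Stage stage : stages) {
            List<Map.Entry<String, Integer>> next = new ArrayList<>();
            for (Map.Entry<String, Integer> rec : current) {
                next.addAll(stage.apply(rec));
            }
            current = next;
        }
        return current;
    }

    /** Same job-specific stage each time; only the filter chained in front of it changes. */
    static String demo(String prefix) {
        List<Map.Entry<String, Integer>> input = new ArrayList<>();
        input.add(new SimpleEntry<String, Integer>("A:x", 1));
        input.add(new SimpleEntry<String, Integer>("B:y", 2));
        input.add(new SimpleEntry<String, Integer>("A:z", 3));

        List<Stage> chain = new ArrayList<>();
        chain.add(keyFilter(key -> key.startsWith(prefix)));
        chain.add(doubleValue());
        return runChain(input, chain).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo("A:")); // subset A
        System.out.println(demo("B:")); // subset B
    }
}
```

The point is that doubleValue never needs to know which subset it is looking at. Swapping the filter in front of it is the only change between the "A" job and the "B" job.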