From: Jason Yang <lin.yang.jason@gmail.com>
Date: Wed, 12 Sep 2012 17:06:09 +0800
Subject: Re: How to make different mappers execute different processing on a same data ?
To: user@hadoop.apache.org

All right, I got it. Thank you very much.

2012/9/11 Harsh J <harsh@cloudera.com>
Hey Jason,

While I am not sure what the best way is to automatically "evaluate"
during the execution of a job, the MultipleInputs class offers a way
to run different map implementations within a single job for different
input paths. You could perhaps leverage that with duplicated (or
symlinked?) input paths.
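
To illustrate the idea of binding a different map implementation to each input path, here is a minimal sketch in plain Java with no Hadoop dependency. It only models the routing; in an actual job the binding would be done with `MultipleInputs.addInputPath(job, path, inputFormatClass, mapperClass)` from `org.apache.hadoop.mapreduce.lib.input`. The class name, the two mapper functions, and the paths `/in/copy-a` and `/in/copy-b` are hypothetical and stand in for duplicated copies of the same data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PerPathMapperSketch {
    // Hypothetical stand-ins for two candidate processing strategies.
    static String upperCaseMapper(String record) {
        return "A:" + record.toUpperCase();
    }

    static String lengthMapper(String record) {
        return "B:" + record.length();
    }

    // Routes every record of every registered path to the mapper bound to that path.
    static List<String> run(Map<String, Function<String, String>> mapperByPath,
                            Map<String, List<String>> recordsByPath) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Function<String, String>> e : mapperByPath.entrySet()) {
            for (String record : recordsByPath.get(e.getKey())) {
                out.add(e.getValue().apply(record));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The same records are exposed under two paths (assumed duplicated input).
        List<String> data = Arrays.asList("foo", "quux");
        Map<String, List<String>> recordsByPath = new LinkedHashMap<>();
        recordsByPath.put("/in/copy-a", data);
        recordsByPath.put("/in/copy-b", data);

        Map<String, Function<String, String>> mapperByPath = new LinkedHashMap<>();
        mapperByPath.put("/in/copy-a", PerPathMapperSketch::upperCaseMapper);
        mapperByPath.put("/in/copy-b", PerPathMapperSketch::lengthMapper);

        System.out.println(run(mapperByPath, recordsByPath));
    }
}
```

Each strategy tags its output ("A:" or "B:" here), so a later step could tell the results apart and compare them.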

Otherwise, perhaps do all the N types of computation in a single map()
call, and judge the time inside it at the end of all, before emitting?
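
The single-map() alternative can be sketched roughly as follows, again in plain Java without Hadoop. Every candidate runs on the same record, a score is computed for each result, and only the best result is returned for emitting. The class name, the candidate functions, and the "shorter output is better" score are all hypothetical; elapsed time measured with `System.nanoTime()` around each candidate could serve as the score instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

public class BestOfNMapSketch {
    // Runs every candidate on the same record and returns the result
    // with the lowest score (ties keep the earliest candidate).
    static String mapBest(String record,
                          List<Function<String, String>> candidates,
                          ToDoubleFunction<String> score) {
        String best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (Function<String, String> candidate : candidates) {
            String result = candidate.apply(record);
            double s = score.applyAsDouble(result);
            if (s < bestScore) {
                bestScore = s;
                best = result;
            }
        }
        // In a real map() call, this is the point where the single emit would happen.
        return best;
    }

    public static void main(String[] args) {
        List<Function<String, String>> candidates = new ArrayList<>();
        candidates.add(r -> r.trim());           // hypothetical strategy 1
        candidates.add(r -> r.replace(" ", "")); // hypothetical strategy 2
        candidates.add(r -> r);                  // hypothetical strategy 3

        // Hypothetical evaluation function: shorter output scores better.
        System.out.println(mapBest(" a b ", candidates, out -> out.length()));
    }
}
```

Note that this picks a winner per record; choosing one strategy for the whole data set would need the per-record scores to be aggregated, for example in a reducer.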

On Tue, Sep 11, 2012 at 9:03 AM, Jason Yang <lin.yang.jason@gmail.com> wrote:
> Hi, all
>
> I've got a question about how to make different mappers execute different
> processing on the same data?
>
> Here is my scenario:
> I need to process some data; however, there are multiple ways to process it
> and I have no idea which one is better, so I was thinking that maybe I
> could execute multiple mappers, each applying a different processing approach,
> and eventually choose the best one according to some evaluation
> functions.
>
> But I'm not sure whether this could be done in MapReduce.
>
> Any help would be appreciated.
>
> --
> YANG, Lin
>



--
Harsh J



--
YANG, Lin
