Subject: Re: MapReduce on Local FileSystem
From: 王洪军 <wanghj966@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 31 May 2013 18:14:27 +0800

Ingesting data into HDFS is slow because it needs a JVM process. But if you
don't use HDFS, you can't benefit from its features: without HDFS, big data
will not be split and distributed. I think the JVM start-up time is
affordable if the data is big, and Hadoop is not a good choice if the data
is small.

file:// refers to local data, without distribution; other TaskTrackers
can't access it until you copy it to every node where a TaskTracker
resides.

2013/5/31 Harsh J <harsh@cloudera.com>
> Then why not simply run with Write Replication Factor set to 1?
>
> On Fri, May 31, 2013 at 12:54 PM, Agarwal, Nikhil
> <Nikhil.Agarwal@netapp.com> wrote:
> > Hi,
> >
> > Thank you for your reply. One simple answer can be to reduce the time
> > taken for ingesting the data in HDFS.
> >
> > Regards,
> > Nikhil
> >
> > From: Sanjay Subramanian [mailto:Sanjay.Subramanian@wizecommerce.com]
> > Sent: Friday, May 31, 2013 12:50 PM
> > To: <user@hadoop.apache.org>
> > Cc: user@hadoop.apache.org
> > Subject: Re: MapReduce on Local FileSystem
> >
> > Basic question. Why would you want to do that? Also, I think the MapR
> > Hadoop distribution has an NFS-mountable HDFS.
> >
> > Sanjay
> >
> > Sent from my iPhone
> >
> > On May 30, 2013, at 11:37 PM, "Agarwal, Nikhil"
> > <Nikhil.Agarwal@netapp.com> wrote:
> >
> > Hi,
> >
> > Is it possible to run MapReduce on multiple nodes using the local file
> > system (file:///)?
> >
> > I am able to run it in a single-node setup, but in a multi-node setup
> > the "slave" nodes are not able to access the "jobtoken" file which is
> > present in the hadoop.tmp.dir on the "master" node.
> >
> > Please let me know if it is possible to do this.
> >
> > Thanks & Regards,
> >
> > Nikhil
> >
> > CONFIDENTIALITY NOTICE
> > ======================
> > This email message and any attachments are for the exclusive use of the
> > intended recipient(s) and may contain confidential and privileged
> > information. Any unauthorized review, use, disclosure or distribution is
> > prohibited. If you are not the intended recipient, please contact the
> > sender by reply email and destroy all copies of the original message
> > along with any attachments, from your computer system. If you are the
> > intended recipient, please be advised that the content of this message
> > is subject to access, review and disclosure by the sender's Email System
> > Administrator.
>
> --
> Harsh J
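[Archive note, not part of the thread] Nikhil's question is about pointing Hadoop at the local filesystem instead of HDFS. A minimal sketch of what that configuration would look like, assuming a Hadoop 1.x-era setup (the exact property names and file layout are an assumption, not something stated in the thread):

```xml
<!-- core-site.xml: hypothetical sketch for running jobs against the local FS -->
<configuration>
  <property>
    <!-- 1.x-era key is fs.default.name; later Hadoop versions use fs.defaultFS -->
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```

As the replies point out, with file:/// every node must see the same paths, which in practice means a shared mount (e.g. NFS) on all TaskTracker nodes; this is why Sanjay mentions MapR's NFS-mountable filesystem.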
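[Archive note, not part of the thread] Harsh's alternative (keep HDFS but set the write replication factor to 1, so ingest does not pay for multiple replicas) can be sketched as a cluster-wide default; a hedged example, assuming a standard hdfs-site.xml:

```xml
<!-- hdfs-site.xml: hypothetical sketch of Harsh's replication-factor-1 suggestion -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

The same factor can also be applied after the fact to existing files with the standard shell command `hadoop fs -setrep 1 <path>`.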