Return-Path: X-Original-To: apmail-hadoop-common-user-archive@www.apache.org Delivered-To: apmail-hadoop-common-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 4E167F7A3 for ; Mon, 1 Apr 2013 06:54:27 +0000 (UTC) Received: (qmail 59504 invoked by uid 500); 1 Apr 2013 06:54:23 -0000 Delivered-To: apmail-hadoop-common-user-archive@hadoop.apache.org Received: (qmail 58037 invoked by uid 500); 1 Apr 2013 06:54:21 -0000 Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hadoop.apache.org Delivered-To: mailing list user@hadoop.apache.org Received: (qmail 58026 invoked by uid 99); 1 Apr 2013 06:54:21 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Apr 2013 06:54:21 +0000 X-ASF-Spam-Status: No, hits=3.2 required=5.0 tests=FREEMAIL_REPLY,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of yew.boulder@hotmail.com designates 65.55.111.90 as permitted sender) Received: from [65.55.111.90] (HELO blu0-omc2-s15.blu0.hotmail.com) (65.55.111.90) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Apr 2013 06:54:16 +0000 Received: from BLU162-DS20 ([65.55.111.71]) by blu0-omc2-s15.blu0.hotmail.com with Microsoft SMTPSVC(6.0.3790.4675); Sun, 31 Mar 2013 23:53:56 -0700 X-EIP: [+f5lufgzhwinuK8nGW3snvPZ1Fvz0+e7] X-Originating-Email: [yew.boulder@hotmail.com] Message-ID: From: Wenming Ye To: References: In-Reply-To: Subject: Re: Word count on cluster configuration Date: Sun, 31 Mar 2013 23:53:55 -0700 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_000E_01CE2E6B.00F1D470" X-Priority: 3 X-MSMail-Priority: Normal Importance: Normal X-Mailer: Microsoft Windows Live Mail 16.4.3505.912 X-MimeOLE: Produced By Microsoft MimeOLE 
V16.4.3505.912 X-OriginalArrivalTime: 01 Apr 2013 06:53:56.0240 (UTC) FILETIME=[ADE6F900:01CE2EA5] X-Virus-Checked: Checked by ClamAV on apache.org ------=_NextPart_000_000E_01CE2E6B.00F1D470 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable because many of the =E2=80=9Cwords=E2=80=9D are unicode, check the next = blog.=20 http://blogs.msdn.com/b/hpctrekker/archive/2013/04/01/make-another-small-= step-with-the-javascript-console-pig-in-hdinsight.aspx From: Varsha Raveendran=20 Sent: Sunday, March 31, 2013 11:43 PM To: user@hadoop.apache.org=20 Subject: Word count on cluster configuration Hello!=20 I did the setup for a cluster configuration of Hadoop. After running the = word count example the output shown in the part-r-00000 file is as shown = :=20 hduser@MT2012158:/usr/local/hadoop$ head = /tmp/gutenberg-output/gutenberg-output 40 2 4 =EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD@=EF=BF=BD=EF=BF=BD 2 =EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD@=EF=BF=BD@=EF=BF=BD=EF=BF=BD 1 =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD@=EF=BF=BD@=EF=BF=BD=EF=BF=BD 1 P=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= j l k m = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD = g=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BDEXTH =EF=BF=BD j 2004-01-01d Leonardo = 1 P=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= =EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BDEXTH =EF=BF=BD t 1 
=EF=BF=BDP=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= =EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BDEXTH =EF=BF=BD j 2004-01-01d Leonardo 1 =EF=BF=BDP=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= =EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BDEXTH =EF=BF=BD t = 1 Can you please tell me why this is happening? =20 --=20 -Varsha=20 ------=_NextPart_000_000E_01CE2E6B.00F1D470 Content-Type: text/html; charset="utf-8" Content-Transfer-Encoding: quoted-printable
because many of the =E2=80=9Cwords=E2=80=9D are unicode, check the = next blog.
http://bl= ogs.msdn.com/b/hpctrekker/archive/2013/04/01/make-another-small-step-with= -the-javascript-console-pig-in-hdinsight.aspx
 
Sent: Sunday, March 31, 2013 11:43 PM
Subject: Word count on cluster = configuration
 
Hello!

I did the setup for a cluster configuration of = Hadoop.=20 After running the word count example the output shown in the = part-r-00000 file=20 is as shown :

hduser@MT2012158:/usr/local/hadoop$ head=20 /tmp/gutenberg-output/gutenberg-output
   =20 40
    2
    = 4
=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD=20 =EF=BF=BD@=EF=BF=BD=EF=BF=BD    = 2
=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD@=EF=BF=BD@=EF=BF=BD=EF=BF=BD    = 1
=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD=20 =EF=BF=BD@=EF=BF=BD@=EF=BF=BD=EF=BF=BD    = 1
P=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD j l k m = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD = g=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BDEXTH =EF=BF=BD j 2004-01-01d = Leonardo   =20 1
P=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD =EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BDEXTH =EF=BF=BD = t   =20 1
=EF=BF=BDP=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BDEXTH =EF=BF=BD j 2004-01-01d=20 Leonardo    = 1
=EF=BF=BDP=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF= =BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD =EF=BF=BD =EF=BF=BD =EF=BF=BD = =EF=BF=BD =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BDEXTH=20 =EF=BF=BD t    1



Can you please tell = me why this is=20 happening?
  



--
-Varsha=20
------=_NextPart_000_000E_01CE2E6B.00F1D470--
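The byte sequence EF BF BD that fills the output above is the UTF-8 encoding of U+FFFD, the replacement character a decoder emits for bytes it cannot interpret as text. A minimal Python sketch follows, assuming the job input may contain files that are not plain UTF-8 text; it shows one way to screen a file before running word count on it. The function name `looks_like_text` and the 1% threshold are hypothetical choices for illustration and are not part of Hadoop or of this thread.

```python
def looks_like_text(data: bytes, max_bad_ratio: float = 0.01) -> bool:
    """Heuristic check (an assumption, not a Hadoop API): decode as UTF-8
    and measure how many characters are undecodable or control characters."""
    if not data:
        return True
    # Invalid byte sequences become U+FFFD instead of raising an error.
    decoded = data.decode("utf-8", errors="replace")
    bad = sum(
        1
        for ch in decoded
        # Count replacement characters and control characters other than
        # tab, newline, and carriage return.
        if ch == "\ufffd" or (ch < " " and ch not in "\t\n\r")
    )
    return bad / len(decoded) <= max_bad_ratio


if __name__ == "__main__":
    plain = b"The quick brown fox\njumps over the lazy dog\n"
    # Made-up binary sample for illustration only.
    binary = b"P\xff\xfe\x00\x01EXTH\x80\x81"
    print(looks_like_text(plain))
    print(looks_like_text(binary))
```

Running a check like this over the first few kilobytes of each input file, before copying it into HDFS, would flag files that word count would otherwise tokenize into strings of replacement characters.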