hadoop-common-user mailing list archives

From Wenming Ye <yew.boul...@hotmail.com>
Subject Re: Word count on cluster configuration
Date Mon, 01 Apr 2013 06:53:55 GMT
This happens because many of the “words” are Unicode. See the following blog post:
http://blogs.msdn.com/b/hpctrekker/archive/2013/04/01/make-another-small-step-with-the-javascript-console-pig-in-hdinsight.aspx
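
One way to check this on a sample of the input is sketched below. It is a rough illustration, not part of the Hadoop word count example: the function names `looks_like_text` and `non_text_tokens` are made up here, and it assumes the input is expected to be UTF-8 text. It splits raw bytes on whitespace, roughly as word count does, and reports tokens that contain NUL bytes or do not decode as UTF-8.

```python
def looks_like_text(data: bytes) -> bool:
    """Return True if data has no NUL byte and decodes as UTF-8.

    Assumption: the input is supposed to be UTF-8 text. Other encodings
    (for example UTF-16) would be flagged as non-text by this check.
    """
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def non_text_tokens(data: bytes) -> list:
    """Split on ASCII whitespace and return the tokens that fail the check."""
    return [token for token in data.split() if not looks_like_text(token)]


if __name__ == "__main__":
    # Hypothetical sample: two readable words around a run of raw bytes.
    sample = b"Leonardo \xff\xfe\x00EXTH notebooks"
    for token in non_text_tokens(sample):
        print(repr(token))
```

If valid Unicode words pass the check but many tokens are reported, the input probably contains non-text data, and it may be worth looking at which files were copied into the job's input directory.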

From: Varsha Raveendran 
Sent: Sunday, March 31, 2013 11:43 PM
To: user@hadoop.apache.org 
Subject: Word count on cluster configuration

Hello! 


I set up Hadoop in a cluster configuration. After running the word count example,
the part-r-00000 file contains the following output:

hduser@MT2012158:/usr/local/hadoop$ head /tmp/gutenberg-output/gutenberg-output
    40
    2
    4
��� � � � �@��    2
��� � � � �@�@��    1
���� � � � �@�@��    1
P�������� j l k m �������� g��������������������EXTH
� j 2004-01-01d Leonardo    1
P�������� � � � � �������� ���������������������EXTH
� t    1
�P�������� � � � ������������ � � �
���������EXTH � j 2004-01-01d Leonardo    1
�P�������� � � � ������������ � � �
� �����EXTH � t    1




Can you please tell me why this is happening?

   




-- 
-Varsha 
