hadoop-common-user mailing list archives

From Mikhail Yakshin <greycat.na....@gmail.com>
Subject Re: Question on UTF-8
Date Thu, 16 Dec 2010 23:47:10 GMT
On Fri, Dec 17, 2010 at 2:02 AM, Sheeba George wrote:
> This must be a simple question, but somehow I am not able to get it to
> work.
> I have a text file which has ISO Latin characters like "Cancún".
> The mapper is taking "Text" as the input value.
>
> public void map(LongWritable key, Text value,
>     OutputCollector<Text, IntWritable> output, Reporter reporter)
>     throws IOException
>
> But the Latin characters are not recognized correctly and it throws a
> MalformedInputException when I try
> Text.validateUTF8(value.getBytes());

Is recoding your text file as UTF-8 an option?
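
If recoding is an option, a minimal sketch of the conversion in plain Java could look like the following. It assumes the input really is ISO-8859-1 (Latin-1) and that java.nio.charset.StandardCharsets is available (Java 7 or later); the class name Latin1ToUtf8 and the helper recode are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

class Latin1ToUtf8 {
    // Hypothetical helper: treat the first `length` bytes as ISO-8859-1
    // (an assumption about the input) and re-encode them as UTF-8.
    static byte[] recode(byte[] latin1, int length) {
        String decoded = new String(latin1, 0, length, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Cancún" in ISO-8859-1: the ú is the single byte 0xFA,
        // which is not a valid UTF-8 sequence on its own.
        byte[] latin1 = {'C', 'a', 'n', 'c', (byte) 0xFA, 'n'};
        byte[] utf8 = recode(latin1, latin1.length);
        // In UTF-8 the ú becomes the two bytes 0xC3 0xBA.
        System.out.println(utf8.length);
    }
}
```

The same decoding step could probably be applied inside the mapper instead of recoding the file, e.g. new String(value.getBytes(), 0, value.getLength(), StandardCharsets.ISO_8859_1). Note that Text.getBytes() returns the backing array, which is only valid up to getLength(), so the length should be passed explicitly.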

-- 
WBR, Mikhail Yakshin
