hadoop-common-user mailing list archives

From Sheeba George <sheeba.geo...@gmail.com>
Subject Question on UTF-8
Date Thu, 16 Dec 2010 23:02:17 GMT
This must be a simple question, but somehow I am not able to get it to
work.
I have a text file which has ISO Latin characters like "Cancún".
The mapper is taking "Text" as the input value.

public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException

But the Latin characters are not recognized correctly, and a
MalformedInputException is thrown when I try
Text.validateUTF8(value.getBytes());

Any idea how to resolve this?
I'd appreciate any help.

Thanks
Sheeba
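
One likely cause of the failure described above is that the file is encoded as ISO-8859-1, so the bytes Hadoop hands to the mapper are not valid UTF-8. The sketch below reproduces that situation with the JDK only, as an illustration rather than a confirmed diagnosis. The class Latin1ToUtf8 and its two methods are hypothetical names, and the strict decoder stands in for Text.validateUTF8 instead of calling it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Latin1ToUtf8 {

    // Strict UTF-8 check using only the JDK (a stand-in for Text.validateUTF8).
    public static boolean isValidUtf8(byte[] bytes, int length) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes, 0, length));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Interpret the first `length` bytes as ISO-8859-1 and re-encode them as UTF-8.
    public static byte[] latin1ToUtf8(byte[] bytes, int length) {
        String decoded = new String(bytes, 0, length, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Cancún" as it would be stored in an ISO-8859-1 file: ú is the single byte 0xFA.
        byte[] raw = "Canc\u00fan".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("raw bytes valid UTF-8: " + isValidUtf8(raw, raw.length));

        byte[] fixed = latin1ToUtf8(raw, raw.length);
        System.out.println("converted bytes valid UTF-8: " + isValidUtf8(fixed, fixed.length));
        System.out.println(new String(fixed, StandardCharsets.UTF_8));
    }
}
```

Inside the mapper, the same idea would be to build the string with new String(value.getBytes(), 0, value.getLength(), StandardCharsets.ISO_8859_1), assuming the input really is ISO-8859-1. Passing value.getLength() matters because Text.getBytes() returns the backing array, which can be longer than the valid data.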
