hadoop-hdfs-user mailing list archives

From Ajay Srivastava <Ajay.Srivast...@guavus.com>
Subject Non utf-8 chars in input
Date Tue, 11 Sep 2012 05:54:06 GMT

I am using the default InputFormat class to read input from text files, but the input files
contain some non-UTF-8 characters.
I believe TextInputFormat is the default InputFormat class, and that it replaces these
non-UTF-8 characters with "\uFFFD". If I do not want this behavior and need the actual
characters in my mapper, which InputFormat class should I use?
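
To make the behavior concrete, here is a minimal JDK-only sketch of what seems to be happening. It assumes the input is really ISO-8859-1 (an assumption for illustration only; the actual encoding of the files is not known), and the class and method names are hypothetical. It shows that decoding a non-UTF-8 byte as UTF-8 yields "\uFFFD", while decoding the same bytes with the matching charset keeps the original character. In a mapper, the raw bytes would presumably come from Text.getBytes() together with Text.getLength() rather than from Text.toString().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Hadoop.
public class RawLineDecoding {

    // Decodes the first `length` bytes as UTF-8. Malformed input is
    // replaced with U+FFFD, which is the behavior described above.
    public static String decodeUtf8(byte[] raw, int length) {
        return new String(raw, 0, length, StandardCharsets.UTF_8);
    }

    // Decodes the first `length` bytes as ISO-8859-1 (assumed source encoding).
    // Every byte maps to exactly one character, so nothing is replaced.
    public static String decodeLatin1(byte[] raw, int length) {
        return new String(raw, 0, length, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "caf", then 0xE9 (e-acute in ISO-8859-1, invalid here as UTF-8), then "!"
        byte[] line = { 0x63, 0x61, 0x66, (byte) 0xE9, 0x21 };

        String asUtf8 = decodeUtf8(line, line.length);
        String asLatin1 = decodeLatin1(line, line.length);

        // Print code points instead of characters to avoid console encoding issues.
        System.out.println("UTF-8 decode, char 3:      U+"
                + Integer.toHexString(asUtf8.charAt(3)).toUpperCase());
        System.out.println("ISO-8859-1 decode, char 3: U+"
                + Integer.toHexString(asLatin1.charAt(3)).toUpperCase());
    }
}
```

The length argument matters because a reused byte buffer can be longer than the current line, so only the first `length` bytes should be decoded.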

Ajay Srivastava