hadoop-common-user mailing list archives

From burakkk <burak.isi...@gmail.com>
Subject DFSUtil UTF-8 Encoding Problem
Date Mon, 25 Mar 2013 23:15:10 GMT
Hi,
I tried to read a file from HDFS using the code below. I read the bytes,
convert them to a string, and then convert the string back to a byte array.
Even though I do nothing other than convert between bytes and string, the
file ends up corrupted: the special characters are replaced with other
characters. Why do you think this happens?


for (int cbread; (cbread = in.read(buffer)) >= 0;) {
    String str = DFSUtil.bytes2String(buffer);
    // Do nothing
    byte[] buffer_new = DFSUtil.string2Bytes(str);
}

Before:
99998075¨20120813¨85023644¨-1¨1¨-300¨1¨1829¨1¨-1¨-1¨11919¨1¨283¨7¨-1¨-1¨1¨18¨-1¨-1¨1¨-1¨1¨-1¨1¨-1¨1¨-1¨1¨1¨-1¨1¨-2¨1¨-1¨-1¨2¨-1¨-1¨176¨-1¨N/A

After:
99998075�20120813�85023644�-1�1�-300�1�1829�1�-1�-1�11919�1�283�7�-1�-1�1�18�-1�-1�1�-1�1�-1�1�-1�1�-1�1�1�-1�1�-2�1�-1�-1�2�-1�-1�176�-1�N/A
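
The same symptom can be reproduced without HDFS. The sketch below is plain Java and rests on two assumptions that the message does not confirm: that the file is stored in a single-byte encoding such as ISO-8859-1, where "¨" is the byte 0xA8, and that DFSUtil.bytes2String and DFSUtil.string2Bytes decode and encode as UTF-8. The class RoundTripDemo and the helper roundTrip are hypothetical names used only for illustration. A lone 0xA8 byte is not valid UTF-8, so decoding replaces it with U+FFFD, and encoding back cannot recover the original byte.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Hypothetical helper: decode bytes with a charset, then encode the
    // resulting string back to bytes with the same charset.
    static byte[] roundTrip(byte[] data, Charset cs) {
        return new String(data, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        // "1¨2" as ISO-8859-1 bytes (assumed file encoding): "¨" is 0xA8.
        byte[] original = {'1', (byte) 0xA8, '2'};

        // 0xA8 alone is malformed UTF-8, so it becomes U+FFFD on decode
        // and is written back as the three bytes EF BF BD.
        byte[] viaUtf8 = roundTrip(original, StandardCharsets.UTF_8);

        // ISO-8859-1 maps every byte to exactly one char, so it is lossless.
        byte[] viaLatin1 = roundTrip(original, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip preserved bytes:      "
                + Arrays.equals(original, viaUtf8));
        System.out.println("ISO-8859-1 round trip preserved bytes: "
                + Arrays.equals(original, viaLatin1));
    }
}
```

If the data only needs to be copied, keeping it as bytes avoids the issue entirely; if it must pass through a String, decoding and encoding with the file's actual charset should preserve it.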

Thanks
Best Regards...

-- 

BURAK ISIKLI | http://burakisikli.wordpress.com
