Hi Todd,

To elaborate on the encoding question: the input files we use with Hadoop may have different encodings, for example encoding="UTF-8" (or UTF-16, GBK, etc.). So I want to know which encodings are supported by Hadoop.

User scenario: I want to read an input text file (say file01.txt) that contains Chinese characters, write it to an output text file (say file02.txt), and verify that the Chinese characters appear correctly in the output file (and not as junk characters). (I would appreciate it if you could tell me how to verify this.)

Regards,
Naveen Kumar
HUAWEI TECHNOLOGIES CO., LTD.
Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, P.R. China
www.huawei.com

-----Original Message-----
From: Todd Lipcon [mailto:todd@cloudera.com]
Sent: Friday, January 22, 2010 10:16 AM
To: general@hadoop.apache.org; naveenkumarp@huawei.com
Subject: Re: encoding types supported by Hadoop

Hi Naveen,

On Thu, Jan 21, 2010 at 7:54 PM, Naveen Kumar Prasad <naveenkumarp@huawei.com> wrote:

> Hi All,
>
> I am new to hadoop/Mapreduce usage.
>
> Can anyone tell me how to write a simple MapReduce implementation to
> just read some files from the directory and write to
> directory.

It sounds like what you want is the distcp job. Just run "hadoop distcp" and it will print some usage information for you.
>
> Also I wanted to know which all encoding types are supported by Hadoop
> and how to configure and use various encoding types.

I'm not sure what you mean here by encoding. Could you elaborate on this question, please?

Thanks
-Todd
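
For the verification step in the user scenario (checking that Chinese characters survive the copy from file01.txt to file02.txt), one possible approach is to decode both files with their known encodings and compare the resulting text. The sketch below is plain Python run outside Hadoop, not a MapReduce job; the helper names `transcode`, `same_text`, and `demo` are hypothetical, and it assumes the input is GBK and the output should be UTF-8 (Hadoop's `Text` type stores strings as UTF-8, so non-UTF-8 input generally has to be decoded explicitly).

```python
import os
import tempfile


def transcode(src_path, dst_path, src_encoding, dst_encoding="utf-8"):
    """Read src_path as src_encoding and write the same text to dst_path."""
    # Strict decoding (the default) raises UnicodeDecodeError if the
    # declared source encoding does not match the bytes in the file.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)


def same_text(path_a, encoding_a, path_b, encoding_b):
    """Return True if both files decode to identical text."""
    with open(path_a, "r", encoding=encoding_a, newline="") as a:
        text_a = a.read()
    with open(path_b, "r", encoding=encoding_b, newline="") as b:
        text_b = b.read()
    # U+FFFD in the output would indicate that an earlier, lossy
    # conversion already replaced characters it could not decode.
    return text_a == text_b and "\ufffd" not in text_b


def demo():
    """Write a GBK sample, copy it as UTF-8, and verify the round trip."""
    sample = "你好，世界\nHadoop 编码测试\n"
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "file01.txt")  # name taken from the scenario
    dst = os.path.join(workdir, "file02.txt")  # name taken from the scenario
    with open(src, "w", encoding="gbk", newline="") as f:
        f.write(sample)
    transcode(src, dst, "gbk")
    return same_text(src, "gbk", dst, "utf-8")
```

Calling `print(demo())` should print `True` when the characters come through intact. For real job output, the same comparison could be run on a copy of the output file fetched from HDFS, under the assumption that the job wrote UTF-8.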