From: Tao <zta@outlook.com>
To: user@hadoop.apache.org
Subject: distcp error.
Date: Tue, 28 Aug 2012 23:36:08 +0800

Hi, all

I am using distcp to copy data from Hadoop 1.0.3 to Hadoop 2.0.1. When the file path (or file name) contains Chinese characters, an exception is thrown, as shown below. I need some help with this. Thanks.
[hdfs@host ~]$ hadoop distcp -i -prbugp -m 14 -overwrite -log /tmp/distcp.log hftp://10.xx.xx.aa:50070/tmp/中文路径测试 hdfs://10.xx.xx.bb:54310/tmp/distcp_test14
12/08/28 23:32:31 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=true, maxMaps=14, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hftp://10.xx.xx.aa:50070/tmp/中文路径测试], targetPath=hdfs://10.xx.xx.bb:54310/tmp/distcp_test14}
12/08/28 23:32:33 INFO tools.DistCp: DistCp job log path: /tmp/distcp.log
12/08/28 23:32:34 WARN conf.Configuration: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
12/08/28 23:32:34 WARN conf.Configuration: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
12/08/28 23:32:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/08/28 23:32:36 INFO mapreduce.JobSubmitter: number of splits:1
12/08/28 23:32:36 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
12/08/28 23:32:36 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
12/08/28 23:32:36 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
12/08/28 23:32:36 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
12/08/28 23:32:36 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
12/08/28 23:32:37 INFO mapred.ResourceMgrDelegate: Submitted application application_1345831938927_0039 to ResourceManager at baby20/10.1.1.40:8040
12/08/28 23:32:37 INFO mapreduce.Job: The url to track the job: http://baby20:8088/proxy/application_1345831938927_0039/
12/08/28 23:32:37 INFO tools.DistCp: DistCp job-id: job_1345831938927_0039
12/08/28 23:32:37 INFO mapreduce.Job: Running job: job_1345831938927_0039
12/08/28 23:32:50 INFO mapreduce.Job: Job job_1345831938927_0039 running in uber mode : false
12/08/28 23:32:50 INFO mapreduce.Job:  map 0% reduce 0%
12/08/28 23:33:00 INFO mapreduce.Job:  map 100% reduce 0%
12/08/28 23:33:00 INFO mapreduce.Job: Task Id : attempt_1345831938927_0039_m_000000_0, Status : FAILED
Error: java.io.IOException: File copy failed: hftp://10.1.1.26:50070/tmp/中文路径测试/part-r-00017 --> hdfs://10.1.1.40:54310/tmp/distcp_test14/part-r-00017
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hftp://10.1.1.26:50070/tmp/中文路径测试/part-r-00017 to hdfs://10.1.1.40:54310/tmp/distcp_test14/part-r-00017
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
        ... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:201)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:167)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:112)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:90)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:71)
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
        ... 11 more
Caused by: java.io.IOException: HTTP_OK expected, received 500
        at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderInputStream.checkResponseCode(HftpFileSystem.java:381)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:121)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:158)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.FilterInputStream.read(FilterInputStream.java:90)
        at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:70)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:198)
        ... 16 more
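For context on where the 500 may come from: the copy mappers read the source over HTTP, so the non-ASCII path has to survive as a percent-encoded URL on the wire. A minimal sketch of what a correctly encoded request path would look like (plain Python, not Hadoop code; whether the 1.0.3 HftpFileSystem actually produces this encoding is exactly what seems to be in question here):

```python
from urllib.parse import quote

# Hypothetical path with the same Chinese characters as in the failing copy.
path = "/tmp/中文路径测试/part-r-00017"

# Percent-encode the UTF-8 bytes of each segment, keeping "/" as a separator,
# which is what an HTTP URL requires for non-ASCII path characters.
encoded = quote(path, safe="/")
print(encoded)
# /tmp/%E4%B8%AD%E6%96%87%E8%B7%AF%E5%BE%84%E6%B5%8B%E8%AF%95/part-r-00017
```

If the hftp client sends the raw (or differently encoded, e.g. GB2312) bytes instead, the 1.0.3 NameNode/DataNode servlet could fail to resolve the file and answer 500, which would match the trace above.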
