hadoop-mapreduce-user mailing list archives

From George Datskos <george.dats...@jp.fujitsu.com>
Subject Re: S3N copy creating recursive folders
Date Thu, 07 Mar 2013 07:51:45 GMT
Subroto and Shumin

Try adding a trailing slash to the s3n source:

- hadoop fs -cp s3n://acessKey:acessSecret@bucket.name/srcData /test/srcData
+ hadoop fs -cp s3n://acessKey:acessSecret@bucket.name/srcData/ /test/srcData

Without the trailing slash, each listing of "srcData" returns "srcData" 
again, which leads to the infinite recursion you experienced.
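
To see the difference before copying, you can list both forms (same 
placeholder access key and bucket as above). If the first listing shows 
srcData as a child of itself while the second shows only the real 
contents, the trailing slash is the fix:

hadoop fs -ls s3n://acessKey:acessSecret@bucket.name/srcData
hadoop fs -ls s3n://acessKey:acessSecret@bucket.name/srcData/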


George


> I used to have a similar problem. Looks like there is a recursive folder 
> creation bug. How about you try removing srcData from the <dst>? For 
> example, use the following command:
>
> hadoop fs -cp s3n://acessKey:acessSecret@bucket.name/srcData /test/
>
> Or with distcp:
>
> hadoop distcp s3n://acessKey:acessSecret@bucket.name/srcData /test/
>
> HTH.
> Shumin
>
> On Wed, Mar 6, 2013 at 5:44 AM, Subroto <ssanyal@datameer.com> wrote:
>
>     Hi Mike,
>
>     I have tried distcp as well and it ended up with an exception:
>     13/03/06 05:41:13 INFO tools.DistCp:
>     srcPaths=[s3n://acessKey:acessSecret@dm.test.bucket/srcData]
>     13/03/06 05:41:13 INFO tools.DistCp: destPath=/test/srcData
>     13/03/06 05:41:18 INFO tools.DistCp: /test/srcData does not exist.
>     org.apache.hadoop.tools.DistCp$DuplicationException: Invalid
>     input, there are duplicated files in the sources:
>     s3n://acessKey:acessSecret@dm.test.bucket/srcData/compressed,
>     s3n://acessKey:acessSecret@dm.test.bucket/srcData/compressed
>     at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1368)
>     at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1176)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
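
One way to check for the duplicated listing that exception complains 
about is to list the source prefix and look for any path that appears 
twice. A sketch, reusing the bucket name from your trace; the awk/uniq 
pipeline is only illustrative:

hadoop fs -ls s3n://acessKey:acessSecret@dm.test.bucket/srcData/ | awk '{print $NF}' | sort | uniq -d

Any path printed twice is a duplicate source entry.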
>
>     One more interesting thing to notice is that the same operation
>     works nicely with Hadoop 2.0.
>
>     Cheers,
>     Subroto Sanyal
>
>     On Mar 6, 2013, at 11:12 AM, Michel Segel wrote:
>
>>     Have you tried using distcp?
>>
>>     Sent from a remote device. Please excuse any typos...
>>
>>     Mike Segel
>>
>>     On Mar 5, 2013, at 8:37 AM, Subroto <ssanyal@datameer.com> wrote:
>>
>>>     Hi,
>>>
>>>     It's not because there are too many recursive folders in the S3
>>>     bucket; in fact there is no recursive folder in the source.
>>>     If I list the S3 bucket with native S3 tools, I can find a file
>>>     srcData with size 0 inside the folder srcData.
>>>     The copy command keeps creating the
>>>     folder /test/srcData/srcData/srcData (it keeps appending srcData).
>>>
>>>     Cheers,
>>>     Subroto Sanyal
>>>
>>>     On Mar 5, 2013, at 3:32 PM, 卖报的小行家 wrote:
>>>
>>>>     Hi Subroto,
>>>>
>>>>     I didn't use the s3n filesystem. But from the output "cp:
>>>>     java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>>     characters, 1000 levels.", I think the problem is with the path.
>>>>     Is the path longer than 8000 characters, or is it nested more
>>>>     than 1000 levels deep?
>>>>     You only have 998 folders. Maybe the last one is longer than
>>>>     8000 characters. Why not count the last one's length?
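
A quick way to act on that suggestion and measure the longest path that 
was actually created is a recursive listing; a sketch using the -lsr 
form that Hadoop 1.0.3 still ships, with an awk/sort pipeline that is 
only illustrative:

hadoop fs -lsr /test/srcData | awk '{print length($NF), $NF}' | sort -n | tail -1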
>>>>
>>>>     BRs//Julian
>>>>
>>>>
>>>>     ------------------ Original ------------------
>>>>     From: "Subroto" <ssanyal@datameer.com>
>>>>     Date: Tue, Mar 5, 2013 10:22 PM
>>>>     To: "user" <user@hadoop.apache.org>
>>>>     Subject: S3N copy creating recursive folders
>>>>
>>>>     Hi,
>>>>
>>>>     I am using Hadoop 1.0.3 and trying to execute:
>>>>     hadoop fs -cp s3n://acessKey:acessSecret@bucket.name/srcData /test/srcData
>>>>
>>>>     This ends up with:
>>>>     cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>>     characters, 1000 levels.
>>>>
>>>>     When I list the folder /test/srcData recursively, it lists
>>>>     998 folders like:
>>>>     drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
>>>>     /test/srcData/srcData
>>>>     drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
>>>>     /test/srcData/srcData/srcData
>>>>     drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
>>>>     /test/srcData/srcData/srcData/srcData
>>>>     drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
>>>>     /test/srcData/srcData/srcData/srcData/srcData
>>>>     drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
>>>>     /test/srcData/srcData/srcData/srcData/srcData/srcData
>>>>
>>>>     Is there a problem with the s3n filesystem?
>>>>
>>>>     Cheers,
>>>>     Subroto Sanyal
>>>
>
>

