hadoop-common-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12502) SetReplication OutOfMemoryError
Date Mon, 16 Oct 2017 23:27:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206769#comment-16206769 ]

Wei-Chiu Chuang commented on HADOOP-12502:
------------------------------------------

Hi [~vinayrpet], thanks for the patch. I finally had the chance to review it.
Overall it looks good to me, and it also appears to prevent OOM for most commands,
which is good.

One question though: is it necessary to introduce a new FileSystem API,
listStatusIterator(final Path p, final PathFilter filter)?
From my perspective it seems a useful addition, but it doesn't need to be included in this
patch. Adding a new FileSystem API is always concerning.
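
To illustrate the alternative: a caller could wrap the iterator returned by the existing
listStatusIterator(Path p) and apply the filter itself, so entries are still consumed one at a
time without a new FileSystem method. Below is a minimal sketch of that wrapping, not code from
the patch. It uses java.util.Iterator and java.util.function.Predicate as stand-ins for Hadoop's
RemoteIterator and PathFilter so that it runs without Hadoop on the classpath; the class name
FilteredListing and the method filtered(...) are hypothetical. A real version would also need to
propagate the IOException that RemoteIterator's methods can throw.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class FilteredListing {

    /**
     * Wraps a source iterator so that only elements accepted by the filter
     * are returned. Elements are pulled lazily, one at a time, so the full
     * listing is never materialized in memory by this wrapper.
     */
    public static <T> Iterator<T> filtered(Iterator<T> source, Predicate<? super T> filter) {
        return new Iterator<T>() {
            private T pending;       // look-ahead element, valid only when ready is true
            private boolean ready;   // true once an accepted element has been buffered

            @Override
            public boolean hasNext() {
                // Advance the source until an accepted element is found or it is exhausted.
                while (!ready && source.hasNext()) {
                    T candidate = source.next();
                    if (filter.test(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                T result = pending;
                pending = null;      // drop the reference so it can be collected
                ready = false;
                return result;
            }
        };
    }

    public static void main(String[] args) {
        // Stand-in for a directory listing; the paths are illustrative only.
        Iterator<String> listing =
            Arrays.asList("/var/log/a.log", "/var/log/_tmp", "/var/log/b.log").iterator();
        Iterator<String> visible = filtered(listing, p -> !p.endsWith("_tmp"));
        while (visible.hasNext()) {
            System.out.println(visible.next());
        }
    }
}
```

Whether such a wrapper belongs in the shell code or in FileSystem itself is the design question
above; the sketch only shows that the filtering can be layered on the existing iterator.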

> SetReplication OutOfMemoryError
> -------------------------------
>
>                 Key: HADOOP-12502
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12502
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: Philipp Schuegerl
>            Assignee: Vinayakumar B
>         Attachments: HADOOP-12502-01.patch, HADOOP-12502-02.patch, HADOOP-12502-03.patch,
>                      HADOOP-12502-04.patch, HADOOP-12502-05.patch, HADOOP-12502-06.patch
>
>
> Setting the replication of an HDFS folder recursively can run out of memory, e.g. with
> a large /var/log directory:
> hdfs dfs -setrep -R -w 1 /var/log
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at java.util.Arrays.copyOfRange(Arrays.java:2694)
> 	at java.lang.String.<init>(String.java:203)
> 	at java.lang.String.substring(String.java:1913)
> 	at java.net.URI$Parser.substring(URI.java:2850)
> 	at java.net.URI$Parser.parse(URI.java:3046)
> 	at java.net.URI.<init>(URI.java:753)
> 	at org.apache.hadoop.fs.Path.initialize(Path.java:203)
> 	at org.apache.hadoop.fs.Path.<init>(Path.java:116)
> 	at org.apache.hadoop.fs.Path.<init>(Path.java:94)
> 	at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:222)
> 	at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.makeQualified(HdfsFileStatus.java:246)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:689)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
> 	at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:268)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
> 	at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
> 	at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
> 	at org.apache.hadoop.fs.shell.SetReplication.processArguments(SetReplication.java:76)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

