Date: Wed, 3 Jan 2018 08:42:00 +0000 (UTC)
From: "Vinayakumar B (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-12502) SetReplication OutOfMemoryError

    [ https://issues.apache.org/jira/browse/HADOOP-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309305#comment-16309305 ]

Vinayakumar B commented on HADOOP-12502:
----------------------------------------

bq. One question though: is it necessary to introduce a new FileSystem API listStatusIterator(final Path p, final PathFilter filter)? From my perspective it seems a useful addition, but doesn't need to be included in this patch. Adding a new FileSystem API is always concerning.

Okay. I had added it so that a custom filter could be passed to the iterator as well. Anyway, that can be taken up in a separate Jira.

bq. Do we know where the most memory was going? Is it the references to all the listStatus() arrays held in the full recursion call tree in fs/shell/Command.java? (Each recursive call passes reference to that level's listStatus() array, meaning whole tree will be held in heap, right?)

Yes, that's right. The whole tree was held in client (command JVM) memory, causing the OOM.
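For illustration, a minimal sketch of that pattern (the class name and structure are made up here; this is not the actual fs/shell/Command.java code): each level of the recursion keeps its full listStatus() array referenced while the subtrees below it are walked, so the whole tree ends up live in the client heap.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RecursiveListingSketch {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    process(fs, new Path(args[0]));
  }

  static void process(FileSystem fs, Path dir) throws IOException {
    // The whole child array is materialized up front ...
    FileStatus[] children = fs.listStatus(dir);
    for (FileStatus child : children) {
      // ... apply the command (e.g. setrep) to 'child' here ...
      if (child.isDirectory()) {
        // ... and the array stays referenced while every subtree below it is
        // walked, so with N entries per level and depth D roughly N * D
        // FileStatus objects (plus their Paths) are live at the deepest point.
        process(fs, child.getPath());
      }
    }
  }
}
{code}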
bq. How does this patch fix the OOM issue? Is it because we're now holding RemoteIterators for the whole directory tree in memory, instead of holding the actual listStatus arrays?

The problem shows up when each directory level contains a huge number of children. Consider an example with just two levels: the parent */dir1* contains 10000 subdirectories, and each subdirectory in /dir1 contains 10000 entries.

/dir1/subdir1 --> 10000 entries
/dir1/subdir2 --> 10000 entries
/dir1/subdir3 --> 10000 entries
.
.
/dir1/subdir10000 --> 10000 entries

So the total is *10000 x 10000* entries. While processing each subdirectory, *at least 20000* entries (10000 for the parent + 10000 for the current subdirectory) have to be kept in memory, and this consumption grows as the number of levels increases. With the listStatusIterator() implementation (with a limit of 1000 entries per call) this can be reduced: *at most 2000 entries* need to be kept in memory for the above problem (1000 for the parent + 1000 for the current subdirectory). The remaining entries are loaded on the fly, as in the sketch below.

*Please note that for the LocalFileSystem implementation there will be no difference*, since its listStatusIterator() API uses listStatus() itself internally. But this benefits the HDFS implementation, which implements the listStatusIterator() API on the server side.
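A minimal sketch of the iterator-based traversal, assuming the standard FileSystem#listStatusIterator() API (again, the class name is made up for illustration; this is not the patch's Command/Ls code):

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class IteratorListingSketch {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    process(fs, new Path(args[0]));
  }

  static void process(FileSystem fs, Path dir) throws IOException {
    // On HDFS the iterator fetches children in pages (1000 entries per call),
    // so at any point only the current page of each directory on the recursion
    // path is held in the client heap, instead of the full listStatus() arrays.
    RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
    while (it.hasNext()) {
      FileStatus child = it.next();
      // ... apply the command (e.g. setrep) to 'child' here ...
      if (child.isDirectory()) {
        process(fs, child.getPath());
      }
    }
  }
}
{code}

For LocalFileSystem the default listStatusIterator() simply wraps listStatus(), so, as noted above, it makes no memory difference there.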
bq. Why are we forcing ChecksumFilesystem to not use listStatusIterator(), and sorting the results here? This could increase memory usage, no? I don't think sorted iterator is required by the FS contract.

Yes, you are right, the FS contract does not ask for sorted items. Removed the _Arrays.sort()_. But since _FileSystem#DEFAULT_FILTER_ includes crc files as well, _listStatusIterator()_ still has to be overridden here.

bq. The Ls.java changes seem tricky. I wonder if there is a simpler way of doing this (idea: Command exposes an overridable boolean isSorted() predicate that Ls.java can override if it needs sorting, and leave the traversal logic in Command instead of mucking with it in Ls?)

Yes, that's a good idea, thank you. Done.

bq. This comment is still true? I'm guessing the intent was "iterative" as in "not recursive", instead of "iterative" as in "using an iterator"

I think the intention was to use an iterator instead of the items array. This patch does exactly that by adding the overloaded {{processPaths(PathData parent, RemoteIterator itemsIterator)}} method, so I removed the comment.

bq. You mean "non-recursive", right? Or maybe "non-iterator".

I meant "non-iterator", i.e. the legacy method.

bq. What about the depth++ depth-- accounting in Command.recursePaths() that you skip here? Is the logic that Ls does not use getDepth()? Seems brittle.

Yes, I had missed this. Now that {{recursePath()}} is handled in {{Command}} itself, {{depth}} is tracked properly.

bq. Why does PathData.getDirectoryContents() sort its listing?

The sorting was added for HADOOP-8140.

bq. I guess this is much of the memory savings. I guess this chunking into 100 works without changing the depth-first search ordering.

I didn't get you here. Can you please explain?

bq. What about the sorting in existing Ls#processPaths()? That changes because we now only sort the batches of 100.

This doesn't change the sorting. If sorting were required, the iterator would have been avoided in the first place. Grouping into batches of 100 items is only needed to format the output by calling {{adjustColumnWidths()}} and keep it readable; otherwise {{adjustColumnWidths()}} would be useless.

bq. I like the idea of chunking the depth first search (DFS) into blocks of 100 and releasing references on the way up. Wouldn't we want to do this in Command instead of Ls? Two reasons: (1) other commands benefit (2) less brittle in terms of how recursion logic is wired up between Command and Ls.

Thanks for the suggestion. Moved it to Command itself. Will post a new patch soon.


> SetReplication OutOfMemoryError
> -------------------------------
>
>                 Key: HADOOP-12502
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12502
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: Philipp Schuegerl
>            Assignee: Vinayakumar B
>         Attachments: HADOOP-12502-01.patch, HADOOP-12502-02.patch, HADOOP-12502-03.patch, HADOOP-12502-04.patch, HADOOP-12502-05.patch, HADOOP-12502-06.patch, HADOOP-12502-07.patch
>
>
> Setting the replication of an HDFS folder recursively can run out of memory. E.g. with a large /var/log directory:
> hdfs dfs -setrep -R -w 1 /var/log
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at java.util.Arrays.copyOfRange(Arrays.java:2694)
> 	at java.lang.String.<init>(String.java:203)
> 	at java.lang.String.substring(String.java:1913)
> 	at java.net.URI$Parser.substring(URI.java:2850)
> 	at java.net.URI$Parser.parse(URI.java:3046)
> 	at java.net.URI.<init>(URI.java:753)
> 	at org.apache.hadoop.fs.Path.initialize(Path.java:203)
> 	at org.apache.hadoop.fs.Path.<init>(Path.java:116)
> 	at org.apache.hadoop.fs.Path.<init>(Path.java:94)
> 	at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:222)
> 	at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.makeQualified(HdfsFileStatus.java:246)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:689)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
> 	at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:268)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
> 	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
> 	at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
> 	at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
> 	at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
> 	at org.apache.hadoop.fs.shell.SetReplication.processArguments(SetReplication.java:76)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)