hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-157) dfs client -ls/-lsr outofmemory when one directory contained 2 million files.
Date Thu, 17 Jul 2014 19:29:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065417#comment-14065417 ]

Allen Wittenauer commented on HDFS-157:
---------------------------------------

Is this still an issue? One fix is to use HADOOP_CLIENT_OPTS to increase the heap, but it is obviously desirable to have ls do something smarter. I'm just not sure if it is possible.
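
For reference, the heap workaround is a one-liner on the client, e.g. HADOOP_CLIENT_OPTS="-Xmx2g" hadoop fs -ls -R /big/dir (the -Xmx value here is illustrative, not a recommendation). As for a smarter ls: clients from Hadoop 2.x onward have FileSystem#listStatusIterator, and against HDFS the iterator pages the listing from the NameNode in batches (dfs.ls.limit entries at a time, 1000 by default) instead of one giant array, so client memory stays bounded regardless of directory size. A minimal sketch, not the actual FsShell code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RemoteIterator;

    public class StreamingLs {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // On HDFS, entries arrive in NameNode-sized batches; only the
            // current batch is resident, so a 2M-file directory does not
            // need the whole listing in memory at once.
            RemoteIterator<FileStatus> it = fs.listStatusIterator(new Path(args[0]));
            while (it.hasNext()) {
                System.out.println(it.next().getPath());
            }
            fs.close();
        }
    }

For the recursive -lsr case, FileSystem#listFiles(path, true) streams every file under the tree the same way (unlike -lsr, it does not emit the directories themselves).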

> dfs client -ls/-lsr outofmemory when one directory contained 2 million files.
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-157
>                 URL: https://issues.apache.org/jira/browse/HDFS-157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Priority: Minor
>
> Heap size was set to 1 GB.
> It would be nice if the dfs client didn't require that much memory when listing a directory (a back-of-envelope estimate follows the traces below).
> Exception in thread "IPC Client connection to namenode/11.11.11.111:1111" java.lang.OutOfMemoryError:
GC overhead limit exceeded
>   at java.util.regex.Pattern.compile(Pattern.java:846)
>   at java.lang.String.replace(String.java:2208)
>   at org.apache.hadoop.fs.Path.normalizePath(Path.java:147)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:137)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:126)
>   at org.apache.hadoop.dfs.DFSFileInfo.readFields(DFSFileInfo.java:141)
>   at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:230)
>   at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:166)
>   at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:214)
>   at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:61)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:273)
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at java.util.Arrays.copyOfRange(Arrays.java:3209)
>   at java.lang.String.<init>(String.java:216)
>   at java.lang.StringBuffer.toString(StringBuffer.java:585)
>   at java.net.URI.toString(URI.java:1907)
>   at java.net.URI.<init>(URI.java:732)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:137)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:126)
>   at org.apache.hadoop.fs.Path.makeQualified(Path.java:296)
>   at org.apache.hadoop.dfs.DfsPath.<init>(DfsPath.java:35)
>   at org.apache.hadoop.dfs.DistributedFileSystem.listPaths(DistributedFileSystem.java:181)
>   at org.apache.hadoop.fs.FsShell.ls(FsShell.java:405)
>   at org.apache.hadoop.fs.FsShell.ls(FsShell.java:423)
>   at org.apache.hadoop.fs.FsShell.ls(FsShell.java:423)
>   at org.apache.hadoop.fs.FsShell.ls(FsShell.java:423)
>   at org.apache.hadoop.fs.FsShell.ls(FsShell.java:399)
>   at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1054)
>   at org.apache.hadoop.fs.FsShell.run(FsShell.java:1244)
>   at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187)
>   at org.apache.hadoop.fs.FsShell.main(FsShell.java:1333)
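
Both traces point at the same root cause: DistributedFileSystem.listPaths hands back the entire directory as a single array, and every entry becomes a Path with its own java.net.URI and backing Strings, so client memory grows linearly with the number of files. A back-of-envelope sketch of why 2 million entries exhaust a 1 GB heap (the per-entry byte figure is an assumption, not a measurement):

    public class ListingFootprint {
        static final long ENTRIES = 2_000_000L;
        static final long BYTES_PER_ENTRY = 500L; // assumed average for a Path, its URI, and Strings

        public static void main(String[] args) {
            long total = ENTRIES * BYTES_PER_ENTRY; // ~1e9 bytes
            // Roughly 0.93 GB, essentially the whole 1 GB heap; consistent
            // with the "GC overhead limit exceeded" failures above.
            System.out.printf("estimated footprint: %.2f GB%n",
                    total / (1024.0 * 1024 * 1024));
        }
    }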



--
This message was sent by Atlassian JIRA
(v6.2#6252)
