hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8873) throttle directoryScanner
Date Thu, 17 Sep 2015 21:02:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804503#comment-14804503
] 

Colin Patrick McCabe commented on HDFS-8873:
--------------------------------------------

The Jenkins errors look like this:
{code}
java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.protocol.DatanodeInfo.<init>(Lorg/apache/hadoop/hdfs/protocol/DatanodeID;Ljava/lang/String;JJJJJJJJILorg/apache/hadoop/hdfs/protocol/DatanodeInfo$AdminStates;)V
	at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:591)
{code}

We've seen this before and never managed to track it down.  It seems to be a bug in our Jenkins
integration, possibly caused by multiple Maven invocations running at once and sharing
the same .m2 directory.  I will re-trigger the build.

bq. The whitespace error is interesting. I changed line n in the patch. Jenkins complained
about the whitespace on line n+1. I fixed the whitespace on line n+1 in the next patch. Jenkins
is now complaining about the whitespace on line n+2.

I would say just leave it alone.  If you didn't introduce the whitespace issue, then don't
worry about it.  We really should turn off most of those checkstyle checks, since they provide
no value.

> throttle directoryScanner
> -------------------------
>
>                 Key: HDFS-8873
>                 URL: https://issues.apache.org/jira/browse/HDFS-8873
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.7.1
>            Reporter: Nathan Roberts
>            Assignee: Daniel Templeton
>         Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, HDFS-8873.003.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms of disk
seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that would reduce
its impact on disk performance (much like the approach in HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time (assuming
the common case of all inodes in cache but no directory blocks cached, 64K seeks are required
for a full directory listing, which translates to 655 seconds). 
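
As a rough illustration of the duty-cycle idea in the description above, here is a minimal
Java sketch. It is not taken from any of the attached patches: the class name
DutyCycleThrottle, its methods, the 1000 ms window, and the 10 ms-per-seek figure are all
assumptions made for illustration. The 10 ms value is only inferred from the numbers quoted
above (64K seeks taking about 655 seconds).

```java
/**
 * Hypothetical duty-cycle throttle, for illustration only.
 * This is NOT the DirectoryScanner code from any HDFS-8873 patch.
 */
final class DutyCycleThrottle {
    // Assumed accounting window: the budget is expressed per 1000 ms.
    private static final long WINDOW_MS = 1000;

    // How many milliseconds of each window the scanner is allowed to work.
    private final long runMsPerWindow;

    DutyCycleThrottle(long runMsPerWindow) {
        if (runMsPerWindow <= 0 || runMsPerWindow > WINDOW_MS) {
            throw new IllegalArgumentException(
                "runMsPerWindow must be in (0, " + WINDOW_MS + "]");
        }
        this.runMsPerWindow = runMsPerWindow;
    }

    /** Pause needed after working for workedMs so that work:pause matches the budget. */
    long pauseMillisAfter(long workedMs) {
        return workedMs * (WINDOW_MS - runMsPerWindow) / runMsPerWindow;
    }

    /** Disk-busy time of an unthrottled scan: one seek after another. */
    static long busyMillis(long seeks, long seekMs) {
        return seeks * seekMs;
    }

    /** Wall-clock time of the same scan once the pauses are added in. */
    long wallClockMillis(long busyMs) {
        return busyMs + pauseMillisAfter(busyMs);
    }
}
```

Under these assumptions, busyMillis(64 * 1024, 10) comes to 655,360 ms, which matches the
655 seconds in the description. With a budget of 250 ms per second, the same seeks would be
spread over roughly 2,621 seconds of wall-clock time, leaving the disk free for other I/O
about three quarters of the time.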



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
