hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8873) throttle directoryScanner
Date Fri, 11 Sep 2015 22:14:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741622#comment-14741622
] 

Colin Patrick McCabe commented on HDFS-8873:
--------------------------------------------

Good work, [~templedf].

{code}
  public static final String  DFS_DATANODE_DIRECTORYSCAN_THROTTLE_KEY = "dfs.datanode.directoryscan.throttle";
{code}
Should be "throttle.type"?

{code}
649 <property>
650	  <name>dfs.datanode.directoryscan.throttle.limit</name>
651	  <value>0</value>
652	  <description>The limit setting for the report compiler threads throttle. The
653	  meaning of this setting is determined by the
654	  dfs.datanode.directoryscan.throttle setting. In all cases, setting this limit
655	  to 0 disables compiler thread throttling. 0 is the default setting.
656	  </description>
657	</property>
{code}
I would prefer to see per-type configuration keys that are more descriptive.  For example,
{{dfs.datanode.directoryscan.timeslice.throttle.ms.per.sec}}.  If we invent more throttle
types later, we can always add more configuration key names.

testTimesliceThrottle: please copy the Configuration object and change the copy rather than
mutating {{TestDirectoryScanner#CONF}}.

{code}
604	      // Waiting should be about 4x running.
605	      ratio = (float)scanner.timeWaiting.get() / scanner.timeRunning.get();
606	
607	      LOG.info("RATIO: " + ratio);
608	      assertTrue("Throttle is too restrictive", ratio <= 4.5);
609	      assertTrue("Throttle is too permissive", ratio >= 3.0);
{code}

I'm a bit concerned that other tests running on the same machine, or GCs could cause delays
here that would make the test fail.  Perhaps we could do this in a loop and keep retrying
until we found that the ratio was correct?

{code}
84	  @VisibleForTesting
85	  final AtomicLong timeRunning = new AtomicLong(0L);
{code}
Should be "timeRunningMs" to reflect the fact that this interval is in milliseconds.  Similar
with "timeWaiting"

{code}
115	    public static ThrottleType parse(String type) {
116	      if (type.trim().equalsIgnoreCase(TIMESLICE.toString())) {
117	        return TIMESLICE;
118	      } else {
119	        return NONE;
120	      }
121	    }
{code}
We should log an ERROR message if we can't understand the throttle type.  Silently defaulting
to doing nothing is not behavior most users will appreciate.

DirectoryScanner.java: there are some unnecessary whitespace changes.

TimesliceThrottler: maybe TimeSliceThrottlerTask is a more appropriate name here?  I like
to think of executors as scheduling tasks.  Arguably the throttler state is all contained
outside the class so it's not really "the throttler."

{code}
1121	      } catch (Throwable t) {
1122	        LOG.error("Throttle thread died unexpectedly", t);
1123	
1124	        if (t instanceof Error) {
1125	          throw t;
1126	        }
1127	      }
{code}
What's the purpose of rethrowing exceptions here?

{code}
  private static class Throttle extends Semaphore {
{code}
While this works, it feels more natural to use a boolean + condition variable here.

{code}
try {
  lock.lock();
  while (blockThreads) {
    cond.wait();
  }
} finally {
  lock.unlock();
}
{code}

{code}
74	  private final ThrottleType throttleType;
75	  private final int throttleLimit;
76	  private ScheduledExecutorService throttleThread;
77	  private Semaphore runningThreads = new Semaphore(0);
78	  private volatile Throttle throttle;
{code}
It feels like this state should be encapsulated inside the Throttler itself.

{code}
860	            // Give our parent a chance to block us for throttling
861	            if (throttle != null) {
862	              throttle.start();
863	            }
{code}
Can we just say that Throttler is always non-null, but sometimes we have a {{NoOpThrottler}}?
 I don't like all this checking for null business.

You could get rid of the type enum and all those explicit type checks, and just have a factory
function inside an interface that creates the appropriate kind of Throttler object from the
Configuration (no-op, timeslice, etc).

> throttle directoryScanner
> -------------------------
>
>                 Key: HDFS-8873
>                 URL: https://issues.apache.org/jira/browse/HDFS-8873
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.7.1
>            Reporter: Nathan Roberts
>            Assignee: Daniel Templeton
>         Attachments: HDFS-8873.001.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms of disk
seeks (see HDFS-8791) for details. 
> It would be good if the directoryScanner() had a configurable duty cycle that would reduce
its impact on disk performance (much like the approach in HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time (assuming
the common case of all inodes in cache but no directory blocks cached, 64K seeks are required
for full directory listing which translates to 655 seconds) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message