hbase-dev mailing list archives

From <la...@apache.org>
Subject master unhealthy issue in JitterScheduledThreadPoolExecutorImpl, or is it just me?
Date Sat, 05 Dec 2015 07:14:58 GMT
I see that locally all tests that start a mini cluster fail.
In the log I see 1000s of messages like these:
2015-12-04 22:55:48,215 ERROR [newbunny,41236,1449298547569_ChoreService_107]
rver.NIOServerCnxnFactory$1(44): Thread Thread[newbunny,41236,1449298547569_ChoreService_107,5,main]
java.lang.IllegalArgumentException: bound must be greater than origin
        at java.util.concurrent.ThreadLocalRandom.nextLong(ThreadLocalRandom.java:430)
        at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.getDelay(JitterScheduledThreadPoolExecutorImpl.java:84)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1083)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

In JitteredRunnableScheduledFuture.getDelay I see this.
      long baseDelay = wrapped.getDelay(unit);
      long spreadTime = (long) (baseDelay * spread);
      long delay = baseDelay + ThreadLocalRandom.current().nextLong(-spreadTime, spreadTime);

So this can fail when spreadTime is 0 (or negative), since nextLong(origin, bound)
requires origin to be strictly less than bound. I suppose the fix is simply to not add the
spread if spreadTime is <= 0. And indeed this fixes the problem for me.
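
For illustration, the guard described above could look roughly like the sketch below. This is not the actual HBase patch: the class name JitterDelaySketch and the static method jitteredDelay are hypothetical stand-ins for the body of JitteredRunnableScheduledFuture.getDelay, with baseDelay and spread passed in as plain parameters.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical standalone helper, not the real HBase class. It mirrors the
// three lines quoted from getDelay, plus the proposed guard.
final class JitterDelaySketch {
  private JitterDelaySketch() {}

  // Returns baseDelay plus a random offset in [-spreadTime, spreadTime),
  // or baseDelay unchanged when no valid jitter range exists.
  static long jitteredDelay(long baseDelay, double spread) {
    long spreadTime = (long) (baseDelay * spread);
    if (spreadTime <= 0) {
      // nextLong(origin, bound) throws IllegalArgumentException unless
      // origin < bound, which fails for spreadTime == 0 and for negative
      // spreadTime (e.g. when the wrapped delay is already negative).
      return baseDelay;
    }
    return baseDelay + ThreadLocalRandom.current().nextLong(-spreadTime, spreadTime);
  }
}
```

With this guard a zero or negative base delay is returned as-is, so the worker thread no longer dies inside DelayedWorkQueue.take; positive delays are still jittered as before.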

Elliot, you just added that class, mind having a look? Or I'll just file a jira.

-- Lars
