hbase-issues mailing list archives

From "Nicolas Liochon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11590) use a specific ThreadPoolExecutor
Date Thu, 08 Oct 2015 08:18:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948255#comment-14948255 ]

Nicolas Liochon commented on HBASE-11590:
-----------------------------------------

Hey [~saint.ack@gmail.com]

Attached some tests comparing ThreadPoolExecutor (the one we use currently), ForkJoinPool
(available in jdk1.7+) and LifoThreadPoolExecutorSQP (the one mentioned in the stackoverflow
discussion).

- The critical use case is:
   1) do a table.batch(puts) that needs a lot of threads;
   2) then do a loop { table.get(get) }. This needs a single thread, but each call may use
any of the threads in the pool, resetting that thread's keepalive timeout => the threads may
never expire.
ThreadPoolExecutor is actually worse: it tries to create a thread even if there are already
enough threads available.
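
To illustrate the last point, here is a minimal sketch (not the attached test code). It
assumes a pool configured with corePoolSize == maximumPoolSize, an unbounded queue and
allowCoreThreadTimeOut(true); the class and method names are made up for the example. Tasks
are run strictly one at a time, so a single thread would be enough, yet the pool keeps
starting new threads until it reaches the core size.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of HBase or of the attached patch.
class EagerThreadCreationDemo {

  /** Runs `tasks` trivial tasks strictly one at a time and returns the resulting pool size. */
  static int poolSizeAfterSequentialTasks(int maxThreads, int tasks) {
    // Assumed configuration: core == max, unbounded queue, core threads allowed to expire.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads, maxThreads, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
    pool.allowCoreThreadTimeOut(true);
    try {
      for (int i = 0; i < tasks; i++) {
        // Wait for each task before submitting the next one: one thread would be enough.
        pool.submit(() -> { }).get();
      }
      // While the pool is below its core size, execute() starts a new worker for every
      // task, even though the previously created workers are idle.
      return pool.getPoolSize();
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("pool size: " + poolSizeAfterSequentialTasks(16, 16));
  }
}
```

With these settings the pool ends up with one thread per submitted task (up to maxThreads),
which is the behaviour described above.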

 See the code for the details, but here is the interesting case, with a thread pool of 1000
threads while we need only 1 thread.
{quote}
   * ForkJoinPool maxThread=1000, immediateGet=true, LOOP=2000000
   * ForkJoinPool total=68942ms
   * ForkJoinPool step1=68657ms
   * ForkJoinPool step2=284ms
   * ForkJoinPool threads: 6, 1006, 456, 6  <=== we have 456 threads instead of the ideal 7

   * ThreadPoolExecutor maxThread=1000, immediateGet=true, LOOP=2000000
   * ThreadPoolExecutor total=107449ms <=== very slow
   * ThreadPoolExecutor step1=107145ms
   * ThreadPoolExecutor step2=304ms
   * ThreadPoolExecutor threads: 6, 1006, 889, 6 <== keeps nearly all the threads
 
   * LifoThreadPoolExecutorSQP maxThread=1000, immediateGet=true, LOOP=2000000
   * LifoThreadPoolExecutorSQP total=4805ms <================ quite fast
   * LifoThreadPoolExecutorSQP step1=4803ms
   * LifoThreadPoolExecutorSQP step2=1ms
   * LifoThreadPoolExecutorSQP threads: 6, 248, 8, 6 <====================== removes the threads quickly
{quote}

You may want to rerun the tests to see whether you can reproduce these numbers. I included my
results in the code.
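
For a quick feel of the two-step scenario without the attachment, a much reduced sketch
follows. It is not the attached test: the class name, the pool settings (core == max,
LinkedBlockingQueue, allowCoreThreadTimeOut) and the 1 ms pause between calls are all
assumptions made for the example. Step 1 is a burst that needs every thread at once, step 2
runs one task at a time for three keepalive periods, and the pool size is sampled after each
step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not the test attached to the issue.
class KeepAliveResetDemo {

  /** Returns {pool size after the burst, pool size after the single-task loop}. */
  static int[] run(int maxThreads, long keepAliveMs) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads, maxThreads, keepAliveMs, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>());
    pool.allowCoreThreadTimeOut(true);
    try {
      // Step 1: a burst in which every task waits for all the others to start,
      // so the pool really needs maxThreads threads at the same time.
      CountDownLatch started = new CountDownLatch(maxThreads);
      List<Future<?>> burst = new ArrayList<>();
      for (int i = 0; i < maxThreads; i++) {
        burst.add(pool.submit(() -> {
          started.countDown();
          try {
            started.await();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        }));
      }
      for (Future<?> f : burst) {
        f.get();
      }
      int afterBurst = pool.getPoolSize();

      // Step 2: one task at a time for three keepalive periods. Idle workers take turns
      // picking up the tasks, so their idle timers keep being restarted.
      long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(3 * keepAliveMs);
      while (System.nanoTime() < end) {
        pool.submit(() -> { }).get();
        Thread.sleep(1); // assumed pause, lets the worker go back to waiting on the queue
      }
      int afterLoop = pool.getPoolSize();
      return new int[] { afterBurst, afterLoop };
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    int[] sizes = run(8, 500);
    System.out.println("after burst: " + sizes[0] + ", after loop: " + sizes[1]);
  }
}
```

The exact numbers depend on the JVM and on scheduling, so treat any output as indicative
only. The point is that the second number can stay well above the single thread that step 2
actually needs, although the loop lasts several keepalive periods.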

- The root issue is that we need a LIFO poll/lock, but it does not exist.
- LifoThreadPoolExecutorSQP solves this with a LIFO queue for the threads waiting for work.
  But it comes with an LGPL license, and the code is not trivial. A bug there could be
  difficult to find. It is however incredible to see how much faster/better it is compared
  to the other pools.
- ForkJoinPool is better than TPE. It's not as good as LifoThreadPoolExecutorSQP, but it's
  much closer to what we need. It's available in JDK 1.7, so it looks like a safe bet for
  HBase 1.+.
  ForkJoinPool: threads are created only if there are waiting tasks. They expire after
  2 seconds (it's hardcoded in the JDK code). They are not LIFO, and the task allocation is
  not as fast as the one in LifoThreadPoolExecutorSQP.

=> Proposal: let's migrate to ForkJoinPool. If someone has time to try LifoThreadPoolExecutorSQP,
it could be interesting in the future (if the license can be changed)...
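
As a rough idea of what using ForkJoinPool as a plain executor could look like, here is a
small sketch. It is not a patch: the class name is invented, and the choice of asyncMode=true
(FIFO scheduling for tasks that are submitted and never joined) is an assumption, not
something decided in this issue. ForkJoinPool implements ExecutorService, so submit() and
Future.get() work the same way as with TPE.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;

// Hypothetical demo class, not part of HBase.
class ForkJoinPoolAsExecutorDemo {

  /** Runs `calls` tasks one at a time; returns {sum of the results, pool size afterwards}. */
  static int[] run(int parallelism, int calls) {
    // asyncMode = true is an assumption made for this sketch.
    ForkJoinPool pool = new ForkJoinPool(
        parallelism, ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true);
    try {
      int sum = 0;
      for (int i = 1; i <= calls; i++) {
        final int value = i;
        Callable<Integer> task = () -> value;
        // Same submit/get pattern as with a ThreadPoolExecutor.
        sum += pool.submit(task).get();
      }
      // Workers are started on demand, so this stays at or below `parallelism`.
      return new int[] { sum, pool.getPoolSize() };
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    int[] result = run(64, 1000);
    System.out.println("sum=" + result[0] + ", pool size=" + result[1]);
  }
}
```

The pool size printed at the end varies from run to run; comparing it with the same loop on
a TPE is a cheap way to check the "threads are created only if there are waiting tasks"
behaviour on a given JDK.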

> use a specific ThreadPoolExecutor
> ---------------------------------
>
>                 Key: HBASE-11590
>                 URL: https://issues.apache.org/jira/browse/HBASE-11590
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, Performance
>    Affects Versions: 1.0.0, 2.0.0
>            Reporter: Nicolas Liochon
>            Assignee: Nicolas Liochon
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: tp.patch
>
>
> The JDK TPE creates all the threads in the pool. As a consequence, we create (by default)
> 256 threads even if we just need a few.
> The attached TPE creates threads only if we have something in the queue.
> On a PE test with replica on, it improved the 99th latency percentile by 5%.
> Warning: there are likely some race conditions, but I'm posting it here because there
> may be an implementation available somewhere that we can use, or a good reason not to do that.
> So feedback welcome as usual.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
