hbase-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush
Date Mon, 10 Dec 2012 23:47:22 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528449#comment-13528449 ]

Sergey Shelukhin commented on HBASE-6466:
-----------------------------------------

bq. Yes. Default is single flusher only so default behavior should be as it was.
That is the case:
{code}
    this.handlerCount = conf.getInt("hbase.hstore.flusher.count", 1);
{code}

bq. Looking at the patch, please look elsewhere in code base for how we set threads running.
See how the threads are named... there is some convention in that threads have the name of
their host as a prefix which helps when many regionservers in the one jvm. Could these new
flusher be set up using a thread pool instead? See Threads.java for some facility.
Fixed. A thread pool is meant for scheduling one-off tasks, whereas these threads are intended
to run continuously. I guess one could achieve the same goal with a thread pool and a counter
to limit the concurrency (except that these threads would be easier to starve), but it seems
like a roundabout way to do it.
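
To illustrate the distinction, here is a minimal sketch of continuously running handler
threads, as opposed to pool tasks. It is not the code from the patch: FlusherSketch, the
queue of Runnables, the thread name format, and the server name are all hypothetical
stand-ins. Only the idea of a configurable handler count and a host-name prefix on the
thread name comes from the discussion above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a flusher; none of these names are HBase classes.
class FlusherSketch {
    private final BlockingQueue<Runnable> flushQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final Thread[] handlers;

    FlusherSketch(String hostPrefix, int handlerCount) {
        handlers = new Thread[handlerCount];
        for (int i = 0; i < handlerCount; i++) {
            // Host name as prefix, so threads are distinguishable when several
            // servers share one JVM (the name format itself is an assumption).
            handlers[i] = new Thread(this::runLoop, hostPrefix + "-flusher-" + i);
            handlers[i].setDaemon(true);
        }
    }

    // Each handler loops until stopped, instead of being scheduled per task.
    private void runLoop() {
        while (!stopped.get()) {
            try {
                Runnable request = flushQueue.poll(100, TimeUnit.MILLISECONDS);
                if (request != null) {
                    request.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    void start() {
        for (Thread t : handlers) t.start();
    }

    void requestFlush(Runnable request) {
        flushQueue.add(request);
    }

    void stop() throws InterruptedException {
        stopped.set(true);
        for (Thread t : handlers) t.join();
    }

    // Queues `requests` no-op flushes and returns how many were processed.
    static int runDemo(int handlerCount, int requests) {
        FlusherSketch flusher = new FlusherSketch("rs1.example.org,60020", handlerCount);
        AtomicInteger flushed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(requests);
        flusher.start();
        for (int i = 0; i < requests; i++) {
            flusher.requestFlush(() -> { flushed.incrementAndGet(); done.countDown(); });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
            flusher.stop();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return flushed.get();
    }

    public static void main(String[] args) {
        System.out.println("flushed = " + runDemo(2, 10));
    }
}
```

With a handler count of 1 the sketch degenerates to a single flusher thread, which matches
the default described above.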

bq. Can this class be static?
bq. + private class FlushHandler extends HasThread {
It could, at the cost of passing in a few references (it currently refers to parent fields).
Is there a reason to do so?

bq. Pass in the Service Interface so you can query if host has been stopped?
It already queries the parent's "server" field:
{code}
    public void run() {
      while (!server.isStopped()) {
{code} 

bq. If we failed a flush in the past, we'd check the filesystem and if we couldn't write,
we'd abort the server. Does that happen still? Or does the flusher thread just exit?
The behavior should be the same: the uncaught exception handler is set on the new runnables
the same way as on the old one.
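
As a generic illustration of that mechanism (not the patch itself), the sketch below sets a
Thread.UncaughtExceptionHandler on a worker thread and lets the worker fail. HandlerSketch
and the "aborted" flag are hypothetical; what the real handler does on failure is not shown
in this thread, so the flag only stands in for whatever reaction the server has.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; the flag stands in for the server's reaction to a failed flush.
class HandlerSketch {
    static boolean abortedAfterFailure() {
        AtomicBoolean aborted = new AtomicBoolean(false);

        // The handler runs in the dying thread when an exception escapes run().
        Thread.UncaughtExceptionHandler handler = (thread, error) -> aborted.set(true);

        Thread flusher = new Thread(() -> {
            throw new IllegalStateException("simulated flush failure");
        }, "flusher-0");
        flusher.setUncaughtExceptionHandler(handler);
        flusher.start();
        try {
            flusher.join(); // the handler has finished by the time join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return aborted.get();
    }

    public static void main(String[] args) {
        System.out.println("aborted = " + abortedAfterFailure());
    }
}
```

Because the handler is attached per thread, every additional flusher thread needs the same
setUncaughtExceptionHandler call to keep the single-thread behavior.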

bq. What is going on w/ blockSignal?
Can you please elaborate on the question? :)

bq. Why in FSHLog change lock to be a ReentrantReadWriteLock?
flushcache is now called from multiple threads. The WAL has entries for cache flush (which,
according to the discussion in the linked FB JIRA, might be unnecessary), and the lock is held
between the start and complete entries. If this lock is kept exclusive, it will cause the
flush threads to serialize on it.
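
The serialization effect can be shown with a small sketch. It is not the FSHLog code:
FlushLockSketch and its methods are hypothetical, and it assumes that flush threads take the
shared (read) side of the ReentrantReadWriteLock. Two threads enter the locked section and
wait briefly for each other; they can only meet if the lock admits both at once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; assumes flushers would hold the read side of the lock.
class FlushLockSketch {
    // Returns true if two "flush" threads were inside the locked section together.
    static boolean flushesOverlap(Lock flushLock) {
        CountDownLatch bothInside = new CountDownLatch(2);
        boolean[] sawOther = new boolean[2];
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                flushLock.lock(); // section between "start" and "complete" entries
                try {
                    bothInside.countDown();
                    // Succeeds only if the other thread also got inside in time.
                    sawOther[id] = bothInside.await(500, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    flushLock.unlock();
                }
            }, "flusher-" + i);
        }
        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sawOther[0] && sawOther[1];
    }

    static boolean sharedLockOverlaps() {
        return flushesOverlap(new ReentrantReadWriteLock().readLock());
    }

    static boolean exclusiveLockOverlaps() {
        return flushesOverlap(new ReentrantLock());
    }

    public static void main(String[] args) {
        System.out.println("read lock overlap:      " + sharedLockOverlaps());
        System.out.println("exclusive lock overlap: " + exclusiveLockOverlaps());
    }
}
```

Anything that must exclude flushes entirely would take the write side of the same lock; which
operations those are in FSHLog is not covered by this sketch.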

bq. How are J-Ds concerns above being addressed in this patch? 
Do you mean the concerns about the lock (discussed above, addressed by the original patch), or
about the prior lost patch?

bq. Or Elliott's seeing 60 second pauses? 
I haven't seen such conditions on trunk during perf tests.
[~eclark] can you still repro this?
I will probably try to port to 0.94 next, so I can run some lengthy test with normal settings
to see how it goes on 0.94.

bq. Does this do what Jimmy suggests above, flush multiple regions concurrently when there
is memory pressure?
No; there is a response that it doesn't produce substantial wins...
Should it be a separate JIRA?

                
> Enable multi-thread for memstore flush
> --------------------------------------
>
>                 Key: HBASE-6466
>                 URL: https://issues.apache.org/jira/browse/HBASE-6466
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6466.patch, HBASE-6466v2.patch, HBASE-6466v3.1.patch, HBASE-6466v3.patch
>
>
> If the KV is large or Hlog is closed with high-pressure putting, we found memstore is
> often above the high water mark and block the putting.
> So should we enable multi-thread for Memstore Flush?
> Some performance test data for reference,
> 1.test environment : 
> random writing;upper memstore limit 5.6GB;lower memstore limit 4.8GB;400 regions per
> regionserver;row len=50 bytes, value len=1024 bytes;5 regionserver, 300 ipc handler per
> regionserver;5 client, 50 thread handler per client for writing
> 2.test results:
> one cacheFlush handler, tps: 7.8k/s per regionserver, Flush:10.1MB/s per regionserver,
> appears many aboveGlobalMemstoreLimit blocking
> two cacheFlush handlers, tps: 10.7k/s per regionserver, Flush:12.46MB/s per regionserver,
> 200 thread handler per client & two cacheFlush handlers, tps:16.1k/s per regionserver,
> Flush:18.6MB/s per regionserver

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
