hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline
Date Mon, 04 Jul 2016 05:37:11 GMT

     [ https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-16162:
    Attachment: HBASE-16162_V4.patch

Made a small change so that the allowCompaction boolean is checked only once in the flow,
i.e. just before in-memory compaction. This is after the active segment has been pushed to
the pipeline. I have kept the meaning of the variable as is.
It was a bit strange that in the normal code flow this variable decided both the push of the
active segment to the pipeline and the compaction, whereas within the flushInMemory() method
(and so in the test path) it was considered only after the pipeline push, and so only for
compaction. With this patch the two paths are in sync. Tests pass.
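
The flow described in this comment can be illustrated with a toy model. This is only a sketch
under stated assumptions, not the contents of HBASE-16162_V4.patch: the class
AllowCompactionOnceSketch, its fields, and the run() helper are hypothetical, and the counter
stands in for compactor.startCompaction(). It shows the active segment being pushed regardless
of allowCompaction, with the flag read in exactly one place, just before compaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model of the described flow; hypothetical names, not HBase code. */
class AllowCompactionOnceSketch {
    final AtomicBoolean allowCompaction = new AtomicBoolean(true);
    final List<String> pipeline = new ArrayList<>(); // segments pushed so far
    String active = "seg-0";                         // current active segment
    int nextId = 1;
    int compactions = 0;                             // stands in for compactor.startCompaction()

    /** Always pushes the active segment; reads allowCompaction only before compacting. */
    void flushInMemory() {
        // Phase I: the pipeline update does not consult allowCompaction.
        pipeline.add(active);
        active = "seg-" + nextId++;
        // Phase II: the single place where allowCompaction is read.
        if (allowCompaction.get()) {
            compactions++;
        }
    }

    /** Runs one flush with the given flag value; returns {pipeline size, compactions}. */
    static int[] run(boolean allow) {
        AllowCompactionOnceSketch m = new AllowCompactionOnceSketch();
        m.allowCompaction.set(allow);
        m.flushInMemory();
        return new int[] { m.pipeline.size(), m.compactions };
    }

    public static void main(String[] args) {
        int[] off = run(false);
        int[] on = run(true);
        System.out.println("allowCompaction=false: pushed=" + off[0] + " compactions=" + off[1]);
        System.out.println("allowCompaction=true:  pushed=" + on[0] + " compactions=" + on[1]);
    }
}
```

In both runs one segment reaches the pipeline; only the compaction step depends on the flag.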

> Compacting Memstore : unnecessary push of active segments to pipeline
> ---------------------------------------------------------------------
>                 Key: HBASE-16162
>                 URL: https://issues.apache.org/jira/browse/HBASE-16162
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Critical
>         Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, HBASE-16162_V3.patch, HBASE-16162_V4.patch
> We have a flow like this:
> {code}
> protected void checkActiveSize() {
>     if (shouldFlushInMemory()) {
>       // hand the in-memory flush over to a pool thread
>       InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>       getPool().execute(runnable);
>     }
>   }
> private boolean shouldFlushInMemory() {
>     if(getActive().getSize() > inmemoryFlushSize) {
>       // size above flush threshold
>       return (allowCompaction.get() && !inMemoryFlushInProgress.get());
>     }
>     return false;
>   }
> void flushInMemory() throws IOException {
>     // Phase I: Update the pipeline
>     getRegionServices().blockUpdates();
>     try {
>       MutableSegment active = getActive();
>       pushActiveToPipeline(active);
>     } finally {
>       getRegionServices().unblockUpdates();
>     }
>     // Phase II: Compact the pipeline
>     try {
>       if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
>         // setting the inMemoryFlushInProgress flag again for the case this method is
>         // invoked directly (only in tests); in the common path, setting from true to true
>         // is idempotent
>         // Speculative compaction execution, may be interrupted if flush is forced while
>         // compaction is in progress
>         compactor.startCompaction();
>       }
> {code}
> So every write of a cell triggers the check in checkActiveSize(). When we are at the border
> of an in-memory flush, many threads writing to this memstore can pass the check in
> checkActiveSize(). The AtomicBoolean is still false at that point; it is turned on only some
> time later, once the new thread has started running and pushes the active segment to the
> pipeline.
> In the new thread's in-memory flush code we don't have any size check. It just takes the
> active segment and pushes it to the pipeline. True, we don't allow any new writes to the
> memstore at this time. But before that write lock on the region is taken, other handler
> threads might also have added entries to this thread pool. When the first one finishes, it
> releases the lock on the region, and handler threads waiting to write to the memstore might
> get the lock and add some data. Now the second in-memory flush thread may get a chance, take
> the lock, and simply take the current active segment and flush it in memory! This will push
> very small segments to the pipeline.
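
The race described above can be replayed deterministically with a small single-threaded model.
This is a sketch under stated assumptions, not HBase code: the class InMemoryFlushRaceSketch,
the threshold FLUSH_SIZE, and the queue standing in for the thread pool are all hypothetical.
The casAtScheduling switch shows one possible guard, claiming inMemoryFlushInProgress with
compareAndSet at scheduling time so only one flush is queued; it is not presented as what the
attached patches do.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy single-threaded replay of the scheduling race; hypothetical names, not HBase code. */
class InMemoryFlushRaceSketch {
    static final long FLUSH_SIZE = 100;               // hypothetical in-memory flush threshold

    long activeSize = 0;                              // size of the active segment
    final List<Long> pipeline = new ArrayList<>();    // sizes of segments pushed to the pipeline
    final Deque<Runnable> pool = new ArrayDeque<>();  // stands in for the flush thread pool
    final AtomicBoolean inMemoryFlushInProgress = new AtomicBoolean(false);
    final boolean casAtScheduling;                    // true = claim the flag when scheduling

    InMemoryFlushRaceSketch(boolean casAtScheduling) {
        this.casAtScheduling = casAtScheduling;
    }

    /** A write grows the active segment, then runs the size check. */
    void write(long bytes) {
        activeSize += bytes;
        if (shouldFlushInMemory()) {
            pool.add(this::flushInMemory);            // queued, not run yet
        }
    }

    boolean shouldFlushInMemory() {
        if (activeSize <= FLUSH_SIZE) {
            return false;
        }
        return casAtScheduling
            ? inMemoryFlushInProgress.compareAndSet(false, true) // only one caller wins
            : !inMemoryFlushInProgress.get();                    // flag is still false: all pass
    }

    /** Pushes whatever is active, with no size check, as in the flow quoted above. */
    void flushInMemory() {
        inMemoryFlushInProgress.set(true);
        pipeline.add(activeSize);
        activeSize = 0;
        inMemoryFlushInProgress.set(false);
    }

    static List<Long> simulate(boolean casAtScheduling) {
        InMemoryFlushRaceSketch m = new InMemoryFlushRaceSketch(casAtScheduling);
        m.write(101);            // handler A crosses the threshold
        m.write(1);              // handler B checks before any flush has run
        m.pool.poll().run();     // first flush pushes the large segment
        m.write(5);              // a write lands between the two flushes
        while (!m.pool.isEmpty()) {
            m.pool.poll().run(); // a second queued flush, if any, pushes a tiny segment
        }
        return m.pipeline;
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // prints [102, 5]
        System.out.println(simulate(true));  // prints [102]
    }
}
```

Without the guard the second queued flush pushes a 5-byte segment; with the flag claimed at
scheduling time the second request is never queued, so only the 102-byte segment is pushed.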

This message was sent by Atlassian JIRA
