hbase-issues mailing list archives

From "Elliott Clark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15014) Fix filterCellByStore in WALsplitter is awful for performance
Date Fri, 18 Dec 2015 23:33:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064965#comment-15064965 ]

Elliott Clark commented on HBASE-15014:
---------------------------------------

bq.How does it address issue?
ArrayList.removeAll costs about O(n) for every item that needs to be removed: O(n) to find the item,
plus up to O(n) to copy the remaining items into the array in the correct positions.
So for files where just about every edit has already been flushed, we have a time complexity
of about: O(n) [finding edits to remove] + O(n^2) [removing edits]
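
To make the quadratic cost concrete, here is a minimal sketch in Java. It is not the WALSplitter code: `RemoveAllCost`, `CountingCell`, and `comparisonsForRemoveAll` are hypothetical names invented for illustration. It counts `equals()` calls when `removeAll` is handed another `ArrayList`, which is the same `removeAll` -> `contains` -> `indexOf` -> `equals` chain visible in the stack trace quoted in the issue description.

```java
import java.util.ArrayList;
import java.util.List;

public class RemoveAllCost {
    /** Hypothetical stand-in for a cell; counts how often equals() runs. */
    static final class CountingCell {
        static long equalsCalls = 0;
        final int id;

        CountingCell(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof CountingCell && ((CountingCell) o).id == id;
        }

        @Override
        public int hashCode() { return Integer.hashCode(id); }
    }

    /** Removes every cell via removeAll and returns how many equals() calls that took. */
    static long comparisonsForRemoveAll(int n) {
        List<CountingCell> cells = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            cells.add(new CountingCell(i));
        }
        // Assumed worst case: every edit has already been flushed, so all are removed.
        List<CountingCell> toRemove = new ArrayList<>(cells);
        CountingCell.equalsCalls = 0;
        // Each element of cells triggers a linear scan: toRemove.contains(element).
        cells.removeAll(toRemove);
        return CountingCell.equalsCalls;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 2_000, 4_000}) {
            System.out.println(n + " cells -> " + comparisonsForRemoveAll(n) + " equals() calls");
        }
    }
}
```

Doubling the number of cells should roughly quadruple the number of `equals()` calls, which is the O(n^2) growth described above.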

The solution is to change the algorithm to make a single pass that finds the elements
to keep, so we're at O(n).
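
A minimal sketch of the single-pass idea in Java follows. It is not the attached patch, which this message does not show. `SinglePassFilter`, `keepOnly`, and the sequence-id predicate are hypothetical; the rule that actually decides which cells to keep is not stated in this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class SinglePassFilter {
    /**
     * Returns a new list holding only the elements that should be kept.
     * Each element is tested once and appended in amortized O(1) time,
     * so the whole pass is O(n) as long as the predicate itself is O(1).
     */
    static <T> List<T> keepOnly(List<T> items, Predicate<T> shouldKeep) {
        List<T> kept = new ArrayList<>(items.size());
        for (T item : items) {
            if (shouldKeep.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical data: edit sequence ids, with everything up to 3 already flushed.
        List<Long> sequenceIds = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        long lastFlushed = 3L;
        System.out.println(keepOnly(sequenceIds, id -> id > lastFlushed));
    }
}
```

Building the list of elements to keep avoids both the repeated linear search and any shifting of array contents, at the cost of allocating one extra list.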

bq.This has to be public?
Nope, I missed that they were in the same package. Let me fix that and make one other tweak.


> Fix filterCellByStore in WALsplitter is awful for performance
> -------------------------------------------------------------
>
>                 Key: HBASE-15014
>                 URL: https://issues.apache.org/jira/browse/HBASE-15014
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Critical
>         Attachments: HBASE-15014.patch
>
>
> Testing the latest 1.2 I see this when there is a regionserver that crashes.
> {code}
> Thread 921 (RS_LOG_REPLAY_OPS-hbase2698:16020-0-Writer-1):
>   State: RUNNABLE
>   Blocked count: 6354
>   Waited count: 6249
>   Stack:
>     org.apache.hadoop.hbase.KeyValue.equals(KeyValue.java:1128)
>     java.util.ArrayList.indexOf(ArrayList.java:317)
>     java.util.ArrayList.contains(ArrayList.java:300)
>     java.util.ArrayList.batchRemove(ArrayList.java:720)
>     java.util.ArrayList.removeAll(ArrayList.java:690)
>     org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.filterCellByStore(WALSplitter.java:1529)
>     org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1557)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1113)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1105)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1075)
> Thread 920 (RS_LOG_REPLAY_OPS-hbase2698:16020-0-Writer-0):
>   State: TIMED_WAITING
>   Blocked count: 17560
>   Waited count: 19695
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1093)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1075)
> Thread 919 (RS_LOG_REPLAY_OPS-hbase2698:16020-0):
>   State: TIMED_WAITING
>   Blocked count: 115
>   Waited count: 976
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.wal.WALSplitter$EntryBuffers.appendEntry(WALSplitter.java:944)
>     org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:365)
>     org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:236)
>     org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
>     org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
>     org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     java.lang.Thread.run(Thread.java:745)
> {code}
> This has been going on for >10 mins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
