Date: Fri, 18 Dec 2015 23:33:46 +0000 (UTC)
From: "Elliott Clark (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-15014) Fix filterCellByStore in WALsplitter is awful for performance

    [ https://issues.apache.org/jira/browse/HBASE-15014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064965#comment-15064965 ]

Elliott Clark commented on HBASE-15014:
---------------------------------------

bq.How does it address issue?

arrayList.removeAll is O(n^2) for every item that needs to be removed. O(n) to find the item and between O(log n) and O(n) to copy the remaining items into the array in the correct positions.
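
To make that cost concrete, here is a minimal standalone sketch. It is not the HBASE-15014 patch; the class name, method names, and the `alreadyFlushed` predicate are hypothetical stand-ins for the real WALSplitter logic. It contrasts collecting the elements to drop and calling `removeAll` with a single pass that collects the elements to keep. As the stack trace in the issue shows, `ArrayList.removeAll` calls `contains` on the argument for each element, and `contains` on an `ArrayList` is itself a linear scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; not code from HBase.
final class SinglePassFilterSketch {

    // Quadratic in the worst case: removeAll asks toDrop.contains(e) for every
    // element of items, and ArrayList.contains scans toDrop linearly.
    static <T> void filterWithRemoveAll(List<T> items, Predicate<? super T> alreadyFlushed) {
        List<T> toDrop = new ArrayList<>();
        for (T item : items) {
            if (alreadyFlushed.test(item)) {
                toDrop.add(item);
            }
        }
        items.removeAll(toDrop);
    }

    // Linear: one pass over items, appending only the elements to keep.
    static <T> List<T> filterSinglePass(List<T> items, Predicate<? super T> alreadyFlushed) {
        List<T> kept = new ArrayList<>(items.size());
        for (T item : items) {
            if (!alreadyFlushed.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> edits = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            edits.add(i);
        }
        // Assume, for illustration, that every edit below 8 was already flushed.
        Predicate<Integer> alreadyFlushed = seq -> seq < 8;

        List<Integer> kept = filterSinglePass(edits, alreadyFlushed);
        filterWithRemoveAll(edits, alreadyFlushed);

        System.out.println("single pass: " + kept);
        System.out.println("removeAll:   " + edits);
    }
}
```

Both methods leave the same elements behind; the difference is only in how many comparisons are made when almost everything is dropped.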

So for files where just about every edit has already been flushed we have a time complexity of about:

O(n) [Finding edits to remove] + O(n^2) [removing edits]

The solution is to basically change the algorithm to make a single pass and find elements to keep. So we're at O(n).

bq.This has to be public?

Nope, I missed that they were in the same package. Let me get that and one other tweak.

> Fix filterCellByStore in WALsplitter is awful for performance
> -------------------------------------------------------------
>
>                 Key: HBASE-15014
>                 URL: https://issues.apache.org/jira/browse/HBASE-15014
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Critical
>         Attachments: HBASE-15014.patch
>
>
> Testing the latest 1.2 I see this when there is a regionserver that crashes.
> {code}
> Thread 921 (RS_LOG_REPLAY_OPS-hbase2698:16020-0-Writer-1):
>   State: RUNNABLE
>   Blocked count: 6354
>   Waited count: 6249
>   Stack:
>     org.apache.hadoop.hbase.KeyValue.equals(KeyValue.java:1128)
>     java.util.ArrayList.indexOf(ArrayList.java:317)
>     java.util.ArrayList.contains(ArrayList.java:300)
>     java.util.ArrayList.batchRemove(ArrayList.java:720)
>     java.util.ArrayList.removeAll(ArrayList.java:690)
>     org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.filterCellByStore(WALSplitter.java:1529)
>     org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1557)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1113)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1105)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1075)
> Thread 920 (RS_LOG_REPLAY_OPS-hbase2698:16020-0-Writer-0):
>   State: TIMED_WAITING
>   Blocked count: 17560
>   Waited count: 19695
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1093)
>     org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1075)
> Thread 919 (RS_LOG_REPLAY_OPS-hbase2698:16020-0):
>   State: TIMED_WAITING
>   Blocked count: 115
>   Waited count: 976
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.wal.WALSplitter$EntryBuffers.appendEntry(WALSplitter.java:944)
>     org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:365)
>     org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:236)
>     org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
>     org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
>     org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     java.lang.Thread.run(Thread.java:745)
> {code}
> This has been going on for >10 mins.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)