Date: Tue, 25 Apr 2017 01:12:04 +0000 (UTC)
From: "Vincent Poon (JIRA)"
To: dev@phoenix.apache.org
Subject: [jira] [Commented] (PHOENIX-3806) IndexUpdateManager spending a lot of time sorting mutations on Index rebuild

    [ https://issues.apache.org/jira/browse/PHOENIX-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982230#comment-15982230 ]

Vincent Poon commented on PHOENIX-3806:
---------------------------------------

What I find is that if I create 100 versions of a row, NonTxIndexBuilder#createTimestampBatchesFromMutation creates 100 batches based on timestamp.
Then for each batch, we call addMutationsForBatch, which in step 3 loops and adds index updates for all timestamps up to the current batch's timestamp. All the index updates are DELETE updates.
What this means is that overall, the number of updates you have is the sum of the series 1 + 2 + ... + 100. So you have something like 5050 index updates, and because of the issues described above, you're sorting 5050 times.
And of course, as you create more versions, the numbers quickly become infeasible.

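The arithmetic above can be checked with a small sketch. This is a simplified model and not Phoenix code: the class RebuildCostSketch, the SortOnIterate collection, and both method names are hypothetical. It also assumes that every added index update first walks the pending updates and that each walk triggers a full sort, which is inferred from the stack trace in the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RebuildCostSketch {

    /** Hypothetical stand-in for a collection that re-sorts on every iterator() call. */
    static final class SortOnIterate implements Iterable<Long> {
        private final List<Long> items = new ArrayList<>();
        long sortCalls = 0;

        void add(long ts) {
            items.add(ts);
        }

        @Override
        public Iterator<Long> iterator() {
            Long[] snapshot = items.toArray(new Long[0]);
            Arrays.sort(snapshot); // full sort on every iteration
            sortCalls++;
            return Arrays.asList(snapshot).iterator();
        }
    }

    /** Closed form: batch k contributes k updates, so the total is 1 + 2 + ... + n. */
    static long totalIndexUpdates(int versions) {
        return (long) versions * (versions + 1) / 2;
    }

    /** Simulates the nested loop and returns how many sorts were triggered. */
    static long countSorts(int versions) {
        SortOnIterate pending = new SortOnIterate();
        for (int batch = 1; batch <= versions; batch++) {
            // Each batch adds updates for all timestamps up to its own.
            for (int ts = 1; ts <= batch; ts++) {
                // Assumption: adding an update first walks the pending updates.
                for (Long ignored : pending) {
                    // a real implementation would compare against existing updates here
                }
                pending.add(ts);
            }
        }
        return pending.sortCalls;
    }

    public static void main(String[] args) {
        System.out.println(totalIndexUpdates(100)); // prints 5050
        System.out.println(countSorts(100));        // prints 5050
    }
}
```

Under these assumptions the number of sorts equals the number of updates, and each sort runs over a list that keeps growing, so the total work grows much faster than the number of versions.
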
> IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-3806
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3806
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> Here's the stack trace. The Array contains 50001 Delete Mutations in this case.
> It seems the code is sorting this over and over again.
> {code}
> Thread 170 (B.DefaultRpcServer.handler=67,queue=7,port=60020):
>   State: RUNNABLE
>   Blocked count: 220598
>   Waited count: 377933
>   Stack:
>     java.util.TimSort.binarySort(TimSort.java:296)
>     java.util.TimSort.sort(TimSort.java:239)
>     java.util.Arrays.sort(Arrays.java:1438)
>     org.apache.phoenix.hbase.index.covered.update.SortedCollection.iterator(SortedCollection.java:78)
>     org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.fixUpCurrentUpdates(IndexUpdateManager.java:128)
>     org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.addIndexUpdate(IndexUpdateManager.java:115)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addCurrentStateMutationsForBatch(NonTxIndexBuilder.java:333)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addUpdateForGivenTimestamp(NonTxIndexBuilder.java:258)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addMutationsForBatch(NonTxIndexBuilder.java:231)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.batchMutationAndAddUpdates(NonTxIndexBuilder.java:109)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.getIndexUpdate(NonTxIndexBuilder.java:71)
>     org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:137)
>     org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:133)
>     java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
>     com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
>     org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:58)
>     org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99)
>     org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(IndexBuildManager.java:144)
>     org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(Indexer.java:324)
> Thread 169 (B.DefaultRpcServer.handler=66,queue=6,port=60020):
> {code}
> [~jamestaylor]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)