phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vincent Poon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3806) IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
Date Tue, 25 Apr 2017 01:12:04 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982230#comment-15982230
] 

Vincent Poon commented on PHOENIX-3806:
---------------------------------------

What I find is that if I create 100 versions of a row, NonTxIndexBuilder#createTimestampBatchesFromMutation
creates 100 batches based on timestamp.

Then for each batch, we call addMutationsForBatch, which in step 3 does a loop and adds index
updates for all timestamps up to the current batch's timestamp.  All the index updates are
DELETE updates.

What this means is that overall, the # of updates you have is the summation of the series
1...100.  So you have something like 5050 index updates, and because of the issues described
above, you're sorting 5050 times.

And of course as you create more versions, the numbers quickly become unfeasible.

> IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-3806
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3806
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> Here's the stack trace. The Array contains 50001 Delete Mutations in this case.
> It seems the code is sorting this over and over again.
> {code}
> Thread 170 (B.DefaultRpcServer.handler=67,queue=7,port=60020):
>   State: RUNNABLE
>   Blocked count: 220598
>   Waited count: 377933
>   Stack:
>     java.util.TimSort.binarySort(TimSort.java:296)
>     java.util.TimSort.sort(TimSort.java:239)
>     java.util.Arrays.sort(Arrays.java:1438)
>     org.apache.phoenix.hbase.index.covered.update.SortedCollection.iterator(SortedCollection.java:78)
>     org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.fixUpCurrentUpdates(IndexUpdateManager.java:128)
>     org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.addIndexUpdate(IndexUpdateManager.java:115)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addCurrentStateMutationsForBatch(NonTxIndexBuilder.java:333)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addUpdateForGivenTimestamp(NonTxIndexBuilder.java:258)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addMutationsForBatch(NonTxIndexBuilder.java:231)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.batchMutationAndAddUpdates(NonTxIndexBuilder.java:109)
>     org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.getIndexUpdate(NonTxIndexBuilder.java:71)
>     org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:137)
>     org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:133)
>     java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
>     com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
>     org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:58)
>     org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99)
>     org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(IndexBuildManager.java:144)
>     org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(Indexer.java:324)
> Thread 169 (B.DefaultRpcServer.handler=66,queue=6,port=60020):
> {code}
> [~jamestaylor]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message