Date: Wed, 8 Jun 2016 21:25:21 +0000 (UTC)
From: "James Taylor (JIRA)"
To: dev@phoenix.incubator.apache.org
Subject: [jira] [Commented] (PHOENIX-2892) Scan for pre-warming the block cache for 2ndary index should be removed

    [ https://issues.apache.org/jira/browse/PHOENIX-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321472#comment-15321472 ]

James Taylor commented on PHOENIX-2892:
---------------------------------------

By batching, I meant something like this:
{code}
conn.setAutoCommit(false);
Statement stmt = conn.createStatement();
for (int i = 0; i < 1000; i++) {
    stmt.execute("UPSERT INTO T VALUES (" + i + ")");
}
conn.commit(); // Will send over all 1000 rows in one batched mutation
{code}
In Phoenix, the batching you're doing will still execute a commit after every row is upserted (you could argue that it shouldn't, and it wouldn't be hard to change, but that's the way it works today).
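For contrast, here is a minimal sketch of the per-row-commit pattern being described (illustrative only; it assumes the same Phoenix JDBC connection conn as above). With auto-commit left on, every UPSERT is committed, and therefore sent to the server, as soon as it executes, so the same 1000 rows become 1000 separate mutation batches:
{code}
conn.setAutoCommit(true); // each statement is committed (and flushed) immediately
Statement stmt = conn.createStatement();
for (int i = 0; i < 1000; i++) {
    stmt.execute("UPSERT INTO T VALUES (" + i + ")"); // one round trip per row
}
{code}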

> Scan for pre-warming the block cache for 2ndary index should be removed
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-2892
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2892
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 4.8.0
>
>         Attachments: phoenix-2892_v1.patch
>
>
> We have run into an issue on a mid-sized cluster with secondary indexes. The problem is that all of the handlers doing writes were blocked waiting on a single scan from the secondary index to complete for more than 5 minutes, causing all incoming RPCs to time out and causing write unavailability and further problems (disabling the index, etc.). We took jstack outputs continuously from the servers to understand what was going on.
> In the jstack outputs from that particular server, we can see three types of stacks (this is raw jstack, so the thread names are unfortunately not there).
> - First, there are a lot of threads waiting for the MVCC transactions started previously:
> {code}
> Thread 15292: (state = BLOCKED)
> - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
> - org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry) @bci=86, line=253 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.completeMemstoreInsertWithSeqNum(org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl$WriteEntry, org.apache.hadoop.hbase.regionserver.SequenceId) @bci=29, line=135 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=1906, line=3187 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=79, line=2819 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], long, long) @bci=12, line=2761 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, java.util.List, org.apache.hadoop.hbase.CellScanner) @bci=150, line=692 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionAction, org.apache.hadoop.hbase.CellScanner, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, java.util.List, long) @bci=547, line=654 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) @bci=407, line=2032 (Compiled frame)
> - org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=167, line=32213 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcServer.call(com.google.protobuf.BlockingService, com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.Message, org.apache.hadoop.hbase.CellScanner, long, org.apache.hadoop.hbase.monitoring.MonitoredRPCHandler) @bci=59, line=2114 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.CallRunner.run() @bci=345, line=101 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(java.util.concurrent.BlockingQueue) @bci=54, line=130 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcExecutor$1.run() @bci=20, line=107 (Interpreted frame)
> - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {code}
> The way MVCC works is that it assumes transactions are short-lived, and it guarantees that transactions are committed in strict serial order. Transactions in this case are the write requests coming in and being executed by the handlers. Each handler starts a transaction, gets an MVCC write index (which is the MVCC transaction number), and does the WAL append + memstore append. It then marks the MVCC transaction complete, and before returning to the user we have to guarantee that the transaction is visible, so we wait for the MVCC read point to advance beyond our own write transaction number. This is what the stack trace above is doing (waitForPreviousTransactionsComplete). A lot of threads with this stack means that one or more handlers have started an MVCC transaction but did not finish the work and thus did not complete their transactions. The MVCC read point can only advance serially, otherwise ongoing transactions would become visible, so we just wait for the previously started transactions to complete.
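> As a rough illustration of why that serial advance stalls everyone (a deliberately simplified model, not HBase's actual MultiVersionConsistencyControl), the read point can only move up to just below the oldest still-pending write number, so every handler behind one slow transaction stays parked in the wait below:
> {code}
> class ToyMvcc {
>     private long nextWriteNumber = 0; // next transaction number to hand out
>     private long readPoint = -1;      // everything <= readPoint is visible to readers
>     private final java.util.TreeSet<Long> pending = new java.util.TreeSet<Long>();
>
>     synchronized long begin() {             // handler starts its write transaction
>         long w = nextWriteNumber++;
>         pending.add(w);
>         return w;
>     }
>
>     synchronized void complete(long w) {    // WAL + memstore work finished for w
>         pending.remove(w);
>         // The read point may only advance to just below the oldest pending write.
>         long limit = pending.isEmpty() ? nextWriteNumber - 1 : pending.first() - 1;
>         if (limit > readPoint) {
>             readPoint = limit;
>             notifyAll();
>         }
>     }
>
>     synchronized void waitForVisibility(long w) throws InterruptedException {
>         while (readPoint < w) {
>             wait(); // a single slow earlier transaction keeps all later handlers here
>         }
>     }
> }
> {code}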
> - Second, another set of threads is waiting with this stack trace:
> {code}
> Thread 15274: (state = BLOCKED)
> - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
> - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=226 (Interpreted frame)
> - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(int, long) @bci=122, line=1033 (Interpreted frame)
> - java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(int, long) @bci=25, line=1326 (Interpreted frame)
> - java.util.concurrent.CountDownLatch.await(long, java.util.concurrent.TimeUnit) @bci=10, line=282 (Interpreted frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(byte[], boolean) @bci=94, line=5044 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=386, line=2962 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=79, line=2819 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], long, long) @bci=12, line=2761 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, java.util.List, org.apache.hadoop.hbase.CellScanner) @bci=150, line=692 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionAction, org.apache.hadoop.hbase.CellScanner, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, java.util.List, long) @bci=547, line=654 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) @bci=407, line=2032 (Compiled frame)
> - org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=167, line=32213 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcServer.call(com.google.protobuf.BlockingService, com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.Message, org.apache.hadoop.hbase.CellScanner, long, org.apache.hadoop.hbase.monitoring.MonitoredRPCHandler) @bci=59, line=2114 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.CallRunner.run() @bci=345, line=101 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(java.util.concurrent.BlockingQueue) @bci=54, line=130 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcExecutor$1.run() @bci=20, line=107 (Interpreted frame)
> - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {code}
> These are the threads that want to write something but have to wait for the row lock(s) held by other threads. It is very likely that the threads in the previous set are holding the actual row locks that this set of threads wants to acquire. This is normal behavior and nothing concerning by itself. However, if a handler cannot acquire the row lock within 60 seconds, it throws an exception and the RPC is retried (see the sketch below).
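> For reference, a minimal self-contained sketch of that timeout behavior (a schematic only, not HBase's actual getRowLockInternal; the 60-second bound is the one described above). The waiting handler parks on the latch that the current lock holder counts down when it releases the lock, and gives up if the deadline passes:
> {code}
> import java.io.IOException;
> import java.util.concurrent.CountDownLatch;
> import java.util.concurrent.TimeUnit;
>
> class RowLockWaitSketch {
>     // Wait for the row lock currently held by another handler; the holder
>     // counts the latch down on release.
>     static void waitForRowLock(CountDownLatch heldLock) throws IOException, InterruptedException {
>         if (!heldLock.await(60, TimeUnit.SECONDS)) {
>             // Give up after 60 seconds; the client will retry the RPC.
>             throw new IOException("Timed out waiting to acquire row lock");
>         }
>     }
> }
> {code}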
> - Finally, the only thread in the third set is doing a scan for the index update:
> {code}
> Thread 15276: (state = IN_JAVA)
> - org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.peek() @bci=0, line=847 (Compiled frame; information may be imprecise)
> - org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(org.apache.hadoop.hbase.regionserver.KeyValueScanner, org.apache.hadoop.hbase.regionserver.KeyValueScanner) @bci=2, line=181 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(java.lang.Object, java.lang.Object) @bci=9, line=171 (Compiled frame)
> - java.util.PriorityQueue.siftUpUsingComparator(int, java.lang.Object) @bci=25, line=649 (Compiled frame)
> - java.util.PriorityQueue.siftUp(int, java.lang.Object) @bci=10, line=627 (Compiled frame)
> - java.util.PriorityQueue.offer(java.lang.Object) @bci=67, line=329 (Compiled frame)
> - java.util.PriorityQueue.add(java.lang.Object) @bci=2, line=306 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(boolean, org.apache.hadoop.hbase.Cell, boolean, boolean) @bci=36, line=289 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(org.apache.hadoop.hbase.Cell) @bci=5, line=256 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(org.apache.hadoop.hbase.Cell) @bci=53, line=817 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(org.apache.hadoop.hbase.Cell) @bci=2, line=803 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.StoreScanner.next(java.util.List, org.apache.hadoop.hbase.regionserver.ScannerContext) @bci=876, line=636 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(java.util.List, org.apache.hadoop.hbase.regionserver.ScannerContext) @bci=29, line=147 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(java.util.List, org.apache.hadoop.hbase.regionserver.KeyValueHeap, org.apache.hadoop.hbase.regionserver.ScannerContext, byte[], int, short) @bci=22, line=5483 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(java.util.List, org.apache.hadoop.hbase.regionserver.ScannerContext) @bci=385, line=5634 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(java.util.List, org.apache.hadoop.hbase.regionserver.ScannerContext) @bci=29, line=5421 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(java.util.List) @bci=6, line=5407 (Compiled frame)
> - org.apache.phoenix.index.PhoenixIndexBuilder.batchStarted(org.apache.hadoop.hbase.regionserver.MiniBatchOperationInProgress) @bci=279, line=88 (Compiled frame)
> - org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(org.apache.hadoop.hbase.regionserver.MiniBatchOperationInProgress, java.util.Collection) @bci=5, line=121 (Compiled frame)
> - org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(org.apache.hadoop.hbase.coprocessor.ObserverContext, org.apache.hadoop.hbase.regionserver.MiniBatchOperationInProgress) @bci=300, line=278 (Compiled frame)
> - org.apache.phoenix.hbase.index.Indexer.preBatchMutate(org.apache.hadoop.hbase.coprocessor.ObserverContext, org.apache.hadoop.hbase.regionserver.MiniBatchOperationInProgress) @bci=17, line=207 (Interpreted frame)
> - org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$35.call(org.apache.hadoop.hbase.coprocessor.RegionObserver, org.apache.hadoop.hbase.coprocessor.ObserverContext) @bci=6, line=991 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(org.apache.hadoop.hbase.Coprocessor, org.apache.hadoop.hbase.coprocessor.ObserverContext) @bci=6, line=1673 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(boolean, org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$CoprocessorOperation) @bci=88, line=1748 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$CoprocessorOperation) @bci=3, line=1705 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preBatchMutate(org.apache.hadoop.hbase.regionserver.MiniBatchOperationInProgress) @bci=26, line=987 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=996, line=3044 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) @bci=79, line=2819 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], long, long) @bci=12, line=2761 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, java.util.List, org.apache.hadoop.hbase.CellScanner) @bci=150, line=692 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, org.apache.hadoop.hbase.quotas.OperationQuota, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionAction, org.apache.hadoop.hbase.CellScanner, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, java.util.List, long) @bci=547, line=654 (Compiled frame)
> - org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) @bci=407, line=2032 (Compiled frame)
> - org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=167, line=32213 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcServer.call(com.google.protobuf.BlockingService, com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.Message, org.apache.hadoop.hbase.CellScanner, long, org.apache.hadoop.hbase.monitoring.MonitoredRPCHandler) @bci=59, line=2114 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.CallRunner.run() @bci=345, line=101 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(java.util.concurrent.BlockingQueue) @bci=54, line=130 (Compiled frame)
> - org.apache.hadoop.hbase.ipc.RpcExecutor$1.run() @bci=20, line=107 (Interpreted frame)
> - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {code}
> This thread is doing a "scan" in order to compute the index updates (note HRegion.doMiniBatchMutation -> IndexBuildManager.getIndexUpdate -> Region scan).
> Normally this scan should finish in a short interval, but for whatever reason it takes more than 5 minutes. We know that it takes more than 5 minutes because, in the jstacks from epoch 1462985796 to epoch 1462985958, we see the exact same thread (Thread 15276) continuing the scan.
> So, we do a skip scan that takes more than 5 minutes while holding some row locks and holding the MVCC write transaction open. All of the other handlers that start write transactions after this one also hold some other row locks and wait for the handler doing the scan to finish. The remaining handlers are simply waiting on some of those same row locks. So, in effect, everybody is waiting on one handler thread to finish the scan.
> So why are we doing a scan in the index builder code, and why is it taking more than 5 minutes?
> Apparently, we are doing a skip scan in PhoenixIndexBuilder:
> {code}
>     public void batchStarted(MiniBatchOperationInProgress miniBatchOp, IndexMetaData context) throws IOException {
>         // The entire purpose of this method impl is to get the existing rows for the
>         // table rows being indexed into the block cache, as the index maintenance code
>         // does a point scan per row.
>         List indexMaintainers = ((PhoenixIndexMetaData) context).getIndexMaintainers();
>         List keys = Lists.newArrayListWithExpectedSize(miniBatchOp.size());
>         Map maintainers = new HashMap();
>         ImmutableBytesWritable indexTableName = new ImmutableBytesWritable();
>         for (int i = 0; i < miniBatchOp.size(); i++) {
>             Mutation m = miniBatchOp.getOperation(i);
>             keys.add(PVarbinary.INSTANCE.getKeyRange(m.getRow()));
>
>             for (IndexMaintainer indexMaintainer : indexMaintainers) {
>                 if (indexMaintainer.isImmutableRows()) continue;
>                 indexTableName.set(indexMaintainer.getIndexTableName());
>                 if (maintainers.get(indexTableName) != null) continue;
>                 maintainers.put(indexTableName, indexMaintainer);
>             }
>         }
>         if (maintainers.isEmpty()) return;
>         Scan scan = IndexManagementUtil.newLocalStateScan(new ArrayList(maintainers.values()));
>         ScanRanges scanRanges = ScanRanges.createPointLookup(keys);
>         scanRanges.initializeScan(scan);
>         scan.setFilter(new SkipScanFilter(scanRanges.getSkipScanFilter(), true));
>         Region region = env.getRegion();
>         RegionScanner scanner = region.getScanner(scan);
>         // Run through the scanner using internal nextRaw method
>         region.startRegionOperation();
>         try {
>             synchronized (scanner) {
>                 boolean hasMore;
>                 do {
>                     List results = Lists.newArrayList();
>                     // Results are potentially returned even when the return value of s.next is
>                     // false since this is an indication of whether or not there are more values
>                     // after the ones returned
>                     hasMore = scanner.nextRaw(results);
>                 } while (hasMore);
>             }
>         }
> {code}
> I think that in some cases this skip scan turns into an expensive, long scan (instead of a point-lookup scan). However, there is no easy way to debug the actual rows and the schema from that.
> The comments say the scan is done to bring the results into the block cache for the gets that follow. However, there should be no need for that at all, since the reads themselves will bring those results into the block cache. The only possible justification would be to get prefetching from the skip scan, but it is still not correct to do the work twice.
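> To make that argument concrete, here is a minimal sketch (illustrative only; this is not Phoenix's actual index-maintenance code) of the per-row point read that the pre-warming is meant to help. The read itself already pulls the row's blocks into the block cache as a side effect, which is why the extra up-front scan amounts to doing the same work twice:
> {code}
> import java.io.IOException;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.Mutation;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.regionserver.Region;
>
> class PointReadSketch {
>     // Read the current state of the row being mutated. Loading the row's HFile
>     // blocks into the block cache happens as a side effect of this read, so a
>     // separate warm-up scan is not required for that purpose.
>     static Result readCurrentRowState(Region region, Mutation m) throws IOException {
>         return region.get(new Get(m.getRow()));
>     }
> }
> {code}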
> The other thing is that we execute the getIndexUpdate() calls in parallel later, but we are still processing the skip scan serially.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)