Date: Wed, 4 Jan 2017 10:22:58 +0000 (UTC)
From: "Eshcar Hillel (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17379) Lack of synchronization in CompactionPipeline#getScanners()

[ https://issues.apache.org/jira/browse/HBASE-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797876#comment-15797876 ]

Eshcar Hillel commented on HBASE-17379:
---------------------------------------

bq. Before HBASE-17081 - you mean the problem was there?

Yes.
LinkedList iterators are fail-fast: if the list is structurally modified at any time after the iterator is created (except through the iterator's own remove or add methods), the iterator throws a ConcurrentModificationException. In our case, any for-each loop like the one below creates a list iterator.
{code}
for (Segment s : pipeline)
{code}
If the pipeline is changed by an in-memory compaction or flush during the lifespan of this iterator, we get this exception. Apparently this happens only in very rare cases, which is why we did not see the exception before; still, we need to acknowledge the bug and fix it.

bq. I also read thro your suggestion of using dirty bit to know if the cached version was really updated. Will that be really needed?

You're right. We came up with a simpler solution, which is essentially to implement copy-on-write in the compaction pipeline class. I will describe it again here:
Add a LinkedList attribute called readOnlyCopy to the class. All operations that modify the pipeline list already hold the lock. So in each of these methods, while holding the lock, we add the line
{code}
readOnlyCopy = new LinkedList<>(pipeline);
{code}
The methods that do not change the pipeline add the line
{code}
LinkedList<Segment> localCopy = readOnlyCopy;
{code}
and then read/compute/iterate over their local copy, which is never modified after it is created, so we avoid the exception. This would be a clean, simple solution to the problem.
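To make the proposal concrete, here is a minimal, self-contained sketch of the copy-on-write idea. It is not the actual CompactionPipeline code: the class name CopyOnWritePipelineSketch, the use of String in place of Segment, the methods pushHead and removeTail, and locking on the pipeline object are all assumptions made for illustration. The volatile modifier on readOnlyCopy is also an assumption of this sketch; the comment above does not say how the field is published to reader threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical stand-in for CompactionPipeline; segments are plain Strings here. */
class CopyOnWritePipelineSketch {
  // Mutable list; in this sketch it is only touched while holding its own monitor.
  private final LinkedList<String> pipeline = new LinkedList<>();
  // Snapshot handed to readers. Declared volatile (an assumption of this sketch)
  // so that a reader thread sees a fully constructed copy.
  private volatile LinkedList<String> readOnlyCopy = new LinkedList<>();

  /** Writer: modifies the pipeline, then republishes a fresh snapshot under the lock. */
  public void pushHead(String segment) {
    synchronized (pipeline) {
      pipeline.addFirst(segment);
      readOnlyCopy = new LinkedList<>(pipeline);
    }
  }

  /** Writer: removes the oldest segment (or returns null) and republishes the snapshot. */
  public String removeTail() {
    synchronized (pipeline) {
      String removed = pipeline.pollLast();
      readOnlyCopy = new LinkedList<>(pipeline);
      return removed;
    }
  }

  /** Reader: takes the current snapshot once and iterates it without locking. */
  public List<String> getScanners() {
    LinkedList<String> localCopy = readOnlyCopy;
    List<String> scanners = new ArrayList<>(localCopy.size());
    for (String s : localCopy) {
      scanners.add("scanner:" + s);
    }
    return scanners;
  }

  /** Reproduces the fail-fast behaviour: the list is modified while being iterated. */
  public static boolean iteratingLiveListFails() {
    LinkedList<String> live = new LinkedList<>(Arrays.asList("a", "b", "c"));
    try {
      for (String s : live) {
        live.addFirst("x"); // structural modification behind the iterator's back
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  /** Iterates a snapshot while the pipeline changes; returns the number of elements seen. */
  public static int iterateSnapshotWhileModifying() {
    CopyOnWritePipelineSketch p = new CopyOnWritePipelineSketch();
    p.pushHead("s1");
    p.pushHead("s2");
    LinkedList<String> localCopy = p.readOnlyCopy;
    int seen = 0;
    for (String s : localCopy) {
      p.pushHead("new-" + s); // replaces readOnlyCopy, never touches localCopy
      seen++;
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(iteratingLiveListFails());        // prints true
    System.out.println(iterateSnapshotWhileModifying()); // prints 2
  }
}
```

The cost of this design is one list copy per pipeline modification. That looks acceptable if, as the comment suggests, the pipeline changes only on in-memory compaction or flush while reads are frequent.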
> Lack of synchronization in CompactionPipeline#getScanners()
> -----------------------------------------------------------
>
> Key: HBASE-17379
> URL: https://issues.apache.org/jira/browse/HBASE-17379
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 17379.v1.txt, 17379.v14.txt, 17379.v2.txt, 17379.v3.txt, 17379.v4.txt, 17379.v5.txt, 17379.v6.txt, 17379.v8.txt
>
>
> From https://builds.apache.org/job/PreCommit-HBASE-Build/5053/testReport/org.apache.hadoop.hbase.regionserver/TestHRegionWithInMemoryFlush/testWritesWhileGetting/ :
> {code}
> java.io.IOException: java.util.ConcurrentModificationException
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleException(HRegion.java:5886)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5856)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7015)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6994)
> 	at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:4141)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.ConcurrentModificationException: null
> 	at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
> 	at java.util.LinkedList$ListItr.next(LinkedList.java:888)
> 	at org.apache.hadoop.hbase.regionserver.CompactionPipeline.getScanners(CompactionPipeline.java:220)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.getScanners(CompactingMemStore.java:298)
> 	at org.apache.hadoop.hbase.regionserver.HStore.getScanners(HStore.java:1154)
> 	at org.apache.hadoop.hbase.regionserver.Store.getScanners(Store.java:97)
> 	at org.apache.hadoop.hbase.regionserver.StoreScanner.getScannersNoCompaction(StoreScanner.java:353)
> 	at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:210)
> 	at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:1892)
> 	at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1880)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5842)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> {code}
> The cause is in CompactionPipeline#getScanners() where there is no synchronization around iterating pipeline.
> The code causing ConcurrentModificationException:
> {code}
>       for (Segment segment : this.pipeline) {
> {code}
> was introduced by HBASE-17081

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)