hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Eshcar Hillel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17379) Lack of synchronization in CompactionPipeline#getScanners()
Date Sun, 01 Jan 2017 11:06:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15790952#comment-15790952
] 

Eshcar Hillel commented on HBASE-17379:
---------------------------------------

Here is an even simpler synchronization scheme. [~stack] perhaps this is what you meant in
first place. Bottom line no additional synchronization is needed beyond what we have in the
original design.
- use a list (not thread safe), a version number and a clone of the list (also not thread
safe, however immutable)
-  methods which change the pipeline take an exclusive lock, apply modification, increase
the version number if the suffix of the pipeline have changed, set a new clone of the list,
and finally release the lock
- methods which read the pipeline take a reference of the clone at the beginning of the method
and then work only on the local copy of that clone.

With this we avoid any concurrent-modification-exception without introducing additional synchronization
on read methods, including get and scan operations.

I would like to emphasize that the only reason we need this clone is to avoid this exception,
and *there is no real need in getting an atomic view of the pipeline* for get and scan operations.

Just as there is no atomic view of the active and snapshot segments in the default memstore,
but yet the implementation is correct since it relies on finding the data on disk.
I can elaborate and give examples if this is not clear.

This changes when we add composite snapshot in HBASE-17081.
For example, when computing the size of the pipeline which is now a pair and not a number,
we need first create a local immutable clone of the pipeline and then compute the two parts
of the size (data size and overhead size) in the *same* clone.

So, while the last suggested solution should work correctly, any change to the class should
be made with care.

> Lack of synchronization in CompactionPipeline#getScanners()
> -----------------------------------------------------------
>
>                 Key: HBASE-17379
>                 URL: https://issues.apache.org/jira/browse/HBASE-17379
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 17379.v1.txt, 17379.v2.txt, 17379.v3.txt, 17379.v4.txt, 17379.v5.txt,
17379.v6.txt, 17379.v8.txt
>
>
> From https://builds.apache.org/job/PreCommit-HBASE-Build/5053/testReport/org.apache.hadoop.hbase.regionserver/TestHRegionWithInMemoryFlush/testWritesWhileGetting/
:
> {code}
> java.io.IOException: java.util.ConcurrentModificationException
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleException(HRegion.java:5886)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5856)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7015)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6994)
> 	at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:4141)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.ConcurrentModificationException: null
> 	at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
> 	at java.util.LinkedList$ListItr.next(LinkedList.java:888)
> 	at org.apache.hadoop.hbase.regionserver.CompactionPipeline.getScanners(CompactionPipeline.java:220)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.getScanners(CompactingMemStore.java:298)
> 	at org.apache.hadoop.hbase.regionserver.HStore.getScanners(HStore.java:1154)
> 	at org.apache.hadoop.hbase.regionserver.Store.getScanners(Store.java:97)
> 	at org.apache.hadoop.hbase.regionserver.StoreScanner.getScannersNoCompaction(StoreScanner.java:353)
> 	at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:210)
> 	at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:1892)
> 	at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1880)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5842)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> {code}
> The cause is in CompactionPipeline#getScanners() where there is no synchronization around
iterating pipeline.
> The code causing ConcurrentModificationException:
> {code}
>     for (Segment segment : this.pipeline) {
> {code}
> was introduced by HBASE-17081



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message