hive-issues mailing list archives

From "Ashutosh Bapat (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
Date Mon, 24 Jun 2019 11:24:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Bapat updated HIVE-21880:
----------------------------------
    Attachment: HIVE-21880.01.patch
        Status: Patch Available  (was: Open)

The code in getNextNotification() only checks whether the next event has the expected event
id. This check can fail when multiple events share the same event id or when event ids are
missing. When this test fails, it fails because there are multiple events with the same
event id.
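
To illustrate the kind of check described above, here is a minimal sketch. It is not the
actual HiveMetaStoreClient code; the class name EventIdCheckSketch and the method
firstUnexpectedIndex are hypothetical. It only shows why both a duplicate event id and a
missing event id trip a "next id must be previous + 1" check.

```java
import java.util.Arrays;
import java.util.List;

public class EventIdCheckSketch {

    /**
     * Returns the index of the first event whose id is not exactly
     * (previous id + 1), or -1 if the whole batch is contiguous.
     * Hypothetical helper, not part of Hive.
     */
    static int firstUnexpectedIndex(long lastSeenId, List<Long> eventIds) {
        long expected = lastSeenId + 1;
        for (int i = 0; i < eventIds.size(); i++) {
            if (eventIds.get(i) != expected) {
                return i; // duplicate or gap: the id is not the expected one
            }
            expected++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Contiguous ids after event 10: the check passes.
        System.out.println(firstUnexpectedIndex(10L, Arrays.asList(11L, 12L, 13L)));
        // Event id 12 appears twice: the second 12 is rejected.
        System.out.println(firstUnexpectedIndex(10L, Arrays.asList(11L, 12L, 12L, 13L)));
        // Event id 12 is missing: 13 is rejected.
        System.out.println(firstUnexpectedIndex(10L, Arrays.asList(11L, 13L)));
    }
}
```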

We use Derby as the backing database for the metastore. Derby doesn't lock the row selected
with a FOR UPDATE clause. Both addNotificationLog() and addNotificationEvent() rely on that
row lock to generate monotonically increasing, sequential event ids. Since the row is not
locked, concurrent callers can fetch the same event id and then each increment it to the same
value, so event ids progress unreliably. So for Derby we lock the NOTIFICATION_SEQUENCE table
instead of using FOR UPDATE.
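
The race can be sketched in plain Java, without a database. This is an in-memory analogy
only, not the patch itself: the class NotificationSequenceSketch, its field nextEventId and
the ReentrantLock standing in for the NOTIFICATION_SEQUENCE table lock are all assumptions
made for illustration. The first helper replays the unlocked interleaving by hand; the
second runs real threads that serialize the read-increment-write step.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class NotificationSequenceSketch {
    // Stands in for the "next event id" value stored in the sequence table.
    private long nextEventId = 1;
    // Stands in for a table-level lock (an assumption for this sketch).
    private final ReentrantLock tableLock = new ReentrantLock();

    long read() { return nextEventId; }
    void write(long value) { nextEventId = value; }

    /** Read-increment-write performed while holding the lock. */
    long allocateWithLock() {
        tableLock.lock();
        try {
            long id = read();
            write(id + 1);
            return id;
        } finally {
            tableLock.unlock();
        }
    }

    /** Replays the unlocked interleaving: both callers read before either writes. */
    static boolean unlockedInterleavingDuplicates() {
        NotificationSequenceSketch seq = new NotificationSequenceSketch();
        long idA = seq.read();   // caller A reads 1
        long idB = seq.read();   // caller B also reads 1, the row is not locked
        seq.write(idA + 1);      // A stores 2
        seq.write(idB + 1);      // B stores 2 again
        return idA == idB;       // both events got the same id
    }

    /** Runs concurrent allocations under the lock and counts distinct ids. */
    static int distinctIdsWithLock(int threads, int perThread) {
        NotificationSequenceSketch seq = new NotificationSequenceSketch();
        Set<Long> ids = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    ids.add(seq.allocateWithLock());
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println("unlocked duplicates: " + unlockedInterleavingDuplicates());
        System.out.println("distinct ids with lock: " + distinctIdsWithLock(4, 1000));
    }
}
```

With the lock held around the whole read-increment-write step, every allocation sees the
value written by the previous one, so the number of distinct ids equals the number of
allocations.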

Note: TxnHandler takes a different approach to simulate the effect of FOR UPDATE on Derby;
it uses a JVM-wide mutex. TxnHandler is not always available, especially when no ACID tables
are involved, so we would need to move that mutex out of TxnHandler to a place common to
DbNotificationListener and TxnHandler, e.g. SQLGenerator, and would also have to take care of
the mutex's reentrant behaviour. Furthermore, such a mutex wouldn't work when metastores are
running in separate JVMs.
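
For reference, the alternative described in the note could look roughly like the sketch
below. It does not reproduce TxnHandler's actual mutex; the class SharedDerbyMutexSketch
and its static lock are hypothetical. It shows the reentrancy concern: if one caller that
already holds the lock calls into another component that takes it again, the lock must
allow the nested acquisition and must be released once per acquisition.

```java
import java.util.concurrent.locks.ReentrantLock;

public class SharedDerbyMutexSketch {
    // One lock per JVM (static). It cannot coordinate metastores that run
    // in separate JVMs, which is the limitation noted above.
    private static final ReentrantLock DERBY_LOCK = new ReentrantLock();

    /** Acquires the lock twice on the same thread and reports the hold count. */
    static int nestedHoldCount() {
        DERBY_LOCK.lock();              // outer caller, e.g. one component
        try {
            DERBY_LOCK.lock();          // nested caller on the same thread
            try {
                return DERBY_LOCK.getHoldCount();
            } finally {
                DERBY_LOCK.unlock();    // release the nested acquisition
            }
        } finally {
            DERBY_LOCK.unlock();        // release the outer acquisition
        }
    }

    /** True if the lock is fully released after a nested acquire/release. */
    static boolean isFreeAfterNestedUse() {
        nestedHoldCount();
        return !DERBY_LOCK.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("hold count while nested: " + nestedHoldCount());
        System.out.println("free afterwards: " + isFreeAfterNestedUse());
    }
}
```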

Since the test named in the Subject is flaky, I have added another test which reliably
reproduces this behaviour.

> Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21880
>                 URL: https://issues.apache.org/jira/browse/HIVE-21880
>             Project: Hive
>          Issue Type: Bug
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21880.01.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Need to enable TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites,
> which is disabled because it is flaky and fails randomly with the error below.
> {code}
> Error Message
> Notification events are missing in the meta store.
> Stacktrace
> java.lang.IllegalStateException: Notification events are missing in the meta store.
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
> 	at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source)
> 	at org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107)
> 	at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159)
> 	at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189)
> 	at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231)
> 	at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
> 	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709)
> 	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782)
> 	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
> 	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
> 	at org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227)
> 	at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282)
> 	at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265)
> 	at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289)
> 	at org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
> 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
> 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}
> https://builds.apache.org/job/PreCommit-HIVE-Build/17591/testReport/org.apache.hadoop.hive.ql.parse/TestReplicationScenariosAcidTablesBootstrap/testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
