hive-issues mailing list archives

From "anishek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently
Date Wed, 23 Aug 2017 17:46:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138741#comment-16138741 ]

anishek commented on HIVE-16886:
--------------------------------

Auto increments can have holes if

* a transaction was aborted, or
* the sequence generation is happening in code, by explicitly calling next_val on the sequence behind the auto increment and then doing the insert from the application, in which case the problem mentioned by [~spena] with a GC pause in the app can cause holes or unordered inserts.

Holes can therefore happen, and gap-free numbering is not guaranteed. However, we should not care about case 1 (aborted transactions) as long as we get increasing, ordered, unique numbers.
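The second case above can be sketched as a small simulation (hypothetical code, not from Hive: an AtomicLong stands in for the DB sequence, and latches force the GC-pause interleaving deterministically):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simulation: thread A fetches next_val first but stalls
// (standing in for a GC pause) before inserting, so thread B's later
// value reaches the "table" first -- an unordered insert, and a hole
// if A's transaction never commits at all.
public class SequenceHoleDemo {
    static List<Long> run() throws InterruptedException {
        AtomicLong sequence = new AtomicLong(0);          // models next_val on the DB sequence
        List<Long> insertOrder = new ArrayList<>();       // models row arrival order in the table
        CountDownLatch aFetched = new CountDownLatch(1);
        CountDownLatch bInserted = new CountDownLatch(1);

        Thread a = new Thread(() -> {
            long id = sequence.incrementAndGet();         // A fetches 1
            aFetched.countDown();
            try { bInserted.await(); } catch (InterruptedException ignored) {}  // "GC pause"
            synchronized (insertOrder) { insertOrder.add(id); }
        });
        Thread b = new Thread(() -> {
            try { aFetched.await(); } catch (InterruptedException ignored) {}
            long id = sequence.incrementAndGet();         // B fetches 2
            synchronized (insertOrder) { insertOrder.add(id); }
            bInserted.countDown();
        });
        a.start(); b.start();
        a.join(); b.join();
        return insertOrder;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());                        // [2, 1]: later value inserted first
    }
}
```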


As for the second case: if we use auto_increment at the DB level, and not DataNucleus datastore-identity with auto-increment, we should be able to get the IDs in order.

The test provided by [~spena] works on MySQL with both the positive and the negative mapping.

* negative mapping -- the existing mapping with MySQL.
* positive mapping -- auto increment on NL_ID as part of the CREATE TABLE for the notification log, on a fresh DB.
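For reference, the "positive mapping" above could look roughly like the DDL below. This is only a sketch: the column list is assumed from the HMS MySQL schema and exact names/types vary by schema version.

{noformat}
-- Sketch only: let the DB assign NL_ID instead of DataNucleus.
CREATE TABLE NOTIFICATION_LOG (
  NL_ID BIGINT NOT NULL AUTO_INCREMENT,
  EVENT_ID BIGINT NOT NULL,
  EVENT_TIME INT NOT NULL,
  EVENT_TYPE VARCHAR(32) NOT NULL,
  DB_NAME VARCHAR(128),
  TBL_NAME VARCHAR(128),
  MESSAGE LONGTEXT,
  PRIMARY KEY (NL_ID)
) ENGINE=InnoDB;
{noformat}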

For the patch, rather than using the existing columns, I think I will create another column that is auto increment at the DB level. I will also try the fix on a PostgreSQL DB to see if the behavior differs. Even if the race condition happens at the DB level, such that two transactions are committed from the application (HMS) at the same time, the DB will order them depending on which acquires the auto_increment lock first. Since replication is not realtime, this lag of a few nano/micro seconds should not be a problem.
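The difference between the current read-then-write pattern and a DB-assigned auto_increment can be sketched as follows (hypothetical simulation, not Hive code: an AtomicLong models the DB counter and its internal lock; a latch forces both "servers" to read the same value):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simulation: "read the current max, add one in the application,
// write it back" (the pattern described in the issue) versus an ID assigned
// under the DB's own auto_increment lock, modeled by incrementAndGet().
public class EventIdDemo {
    static long[] readThenWrite() throws Exception {
        AtomicLong storedMax = new AtomicLong(0);         // models the stored event ID
        CountDownLatch bothRead = new CountDownLatch(2);
        Callable<Long> server = () -> {
            long seen = storedMax.get();                  // both HMS instances read 0
            bothRead.countDown();
            bothRead.await();                             // ...before either writes
            long next = seen + 1;                         // both compute 1
            storedMax.set(next);
            return next;
        };
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<Long> f1 = pool.submit(server);
        Future<Long> f2 = pool.submit(server);
        long[] ids = { f1.get(), f2.get() };
        pool.shutdown();
        return ids;                                       // duplicated IDs: {1, 1}
    }

    static long[] dbAssigned() throws Exception {
        AtomicLong counter = new AtomicLong(0);           // models auto_increment + its lock
        Callable<Long> inc = counter::incrementAndGet;
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<Long> f1 = pool.submit(inc);
        Future<Long> f2 = pool.submit(inc);
        long[] ids = { f1.get(), f2.get() };
        pool.shutdown();
        return ids;                                       // unique IDs: 1 and 2, in some order
    }

    public static void main(String[] args) throws Exception {
        long[] dup = readThenWrite();
        System.out.println(dup[0] + " " + dup[1]);        // 1 1 -- duplicated event ID
        long[] ok = dbAssigned();
        System.out.println(ok[0] + " " + ok[1]);          // unique, ordered by lock acquisition
    }
}
```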

Retrying the whole metastore operation with optimistic locking in application code is just asking for a lot of retries on the HMS side, with the possibility of redoing complete metastore operations if something fails while one commit is larger than the others. Additionally, this would require us to do perfect distributed transactions across the RDBMS + HDFS.
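The rejected alternative (application-level optimistic locking with retries) can be sketched as a compare-and-set loop; every lost race redoes the whole operation. Hypothetical code, not a proposed patch: here the "operation" is just adding one, but in HMS it would be a full metastore write.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: optimistic locking modeled as a CAS retry loop.
// The retry counter shows the wasted work under contention, which is the
// cost the comment argues against.
public class OptimisticRetryDemo {
    static long run() throws InterruptedException {
        final AtomicLong eventId = new AtomicLong(0);     // models the stored event ID
        final AtomicInteger retries = new AtomicInteger(0);
        Thread[] threads = new Thread[4];                 // 4 "HMS instances"
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    while (true) {
                        long current = eventId.get();     // read the stored value
                        long next = current + 1;          // redo the operation
                        if (eventId.compareAndSet(current, next)) {
                            break;                        // commit succeeded
                        }
                        retries.incrementAndGet();        // lost the race: retry it all
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        // Every ID was assigned exactly once; retries show the redone work.
        System.out.println(eventId.get() + " ids assigned, " + retries.get() + " retries");
        return eventId.get();
    }

    public static void main(String[] args) throws InterruptedException {
        run();
    }
}
```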

> HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16886
>                 URL: https://issues.apache.org/jira/browse/HIVE-16886
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Metastore
>            Reporter: Sergio Peña
>            Assignee: anishek
>         Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch
>
>
> When running multiple Hive Metastore servers with DB notifications enabled, I could see that notifications can be persisted with a duplicated event ID.
> This does not happen when running multiple threads in a single HMS node, due to the locking acquired on the DbNotificationsLog class, but multiple HMS instances can cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID fetched from the datastore is used for the new notification, incremented in the server itself, then persisted or updated back to the datastore. If 2 servers read the same ID, then these 2 servers write a new notification with the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException, InterruptedException {
>   final int NUM_THREADS = 2;
>   final CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   final CountDownLatch countOut = new CountDownLatch(1);
>   final HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, MockPartitionExpressionProxy.class.getName());
> 
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask<Void>[] tasks = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; i++) {
>     final int n = i;
>     tasks[i] = new FutureTask<Void>(new Callable<Void>() {
>       @Override
>       public Void call() throws Exception {
>         // Each thread simulates a separate HMS instance with its own ObjectStore.
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent =
>             new NotificationEvent(0, 0, EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
> 
>   // objectStore is the ObjectStore field initialized in TestObjectStore's setup.
>   NotificationEventResponse eventResponse = objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification also has event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails because the returned event ID is 1 instead of the expected 2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
