hive-issues mailing list archives

From "Alexander Kolbasov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently
Date Tue, 22 Aug 2017 23:41:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137616#comment-16137616 ]

Alexander Kolbasov edited comment on HIVE-16886 at 8/22/17 11:40 PM:
---------------------------------------------------------------------

[~anishek] I was investigating this some time ago, and it seems that only MySQL with InnoDB has
a mechanism to guarantee a precise auto-increment without holes. Neither Oracle nor PostgreSQL
seems to provide such a mechanism (please correct me if I am wrong - I am in no way a DB
expert). I didn't find any generic way to achieve this without manually simulating "optimistic
transactions", which means defining the ID as a primary key and using something along these
lines:

{code}
BEGIN TRANSACTION
perform_some_work()
id = retrieve max(id)
persist(id + 1)   -- fails on a duplicate key if another writer committed id + 1 first
COMMIT
{code}

Because of the uniqueness constraint, this either succeeds and increments the value or fails
due to a duplicate value. In both cases we have neither duplicates nor holes. We can never
have a case where, for two IDs K and N with K < N, K is committed after N.
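
A minimal JDBC sketch of this approach, assuming a hypothetical NOTIFICATION_LOG table whose
EVENT_ID column is declared as the primary key (the table, column, and method names here are
illustrative, not the actual HMS schema):

{code}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

public class OptimisticEventIdSketch {

  /**
   * Inserts one notification row, retrying whenever another writer committed
   * the same EVENT_ID first. Returns the ID that was persisted.
   */
  static long insertEvent(Connection conn, String message) throws SQLException {
    conn.setAutoCommit(false);
    while (true) {
      try (PreparedStatement max = conn.prepareStatement(
               "SELECT COALESCE(MAX(EVENT_ID), 0) FROM NOTIFICATION_LOG");
           ResultSet rs = max.executeQuery()) {
        rs.next();
        long nextId = rs.getLong(1) + 1;               // increment in the client
        try (PreparedStatement insert = conn.prepareStatement(
                 "INSERT INTO NOTIFICATION_LOG (EVENT_ID, MESSAGE) VALUES (?, ?)")) {
          insert.setLong(1, nextId);
          insert.setString(2, message);
          insert.executeUpdate();
        }
        conn.commit();                                  // success: no duplicate, no hole
        return nextId;
      } catch (SQLIntegrityConstraintViolationException dup) {
        // Another HMS instance claimed this ID first (some drivers signal this
        // via a generic SQLException with SQLState "23xxx" instead).
        conn.rollback();
        // Loop and retry with a freshly read MAX(EVENT_ID).
      }
    }
  }
}
{code}

Under contention one writer commits while the other hits the duplicate-key error, rolls back,
re-reads MAX(EVENT_ID) and retries, so the committed IDs stay dense and ordered by commit time.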

The current situation is bad not only because we can get duplicates, but also because, when we
read some value N, a smaller value K < N may still appear later and go undetected.



> HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16886
>                 URL: https://issues.apache.org/jira/browse/HIVE-16886
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Metastore
>            Reporter: Sergio Peña
>            Assignee: anishek
>         Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch
>
>
> When running multiple Hive Metastore servers with DB notifications enabled, I could see
> that notifications can be persisted with a duplicated event ID.
> This does not happen when running multiple threads in a single HMS node, due to the locking
> acquired on the DbNotificationsLog class, but multiple HMS instances can conflict.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID fetched from
> the datastore is used for the new notification, incremented in the server itself, and then
> persisted or updated back to the datastore. If 2 servers read the same ID, both write a new
> notification with the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
>   public void testConcurrentAddNotifications() throws ExecutionException, InterruptedException {
>     final int NUM_THREADS = 2;
>     CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>     CountDownLatch countOut = new CountDownLatch(1);
>     HiveConf conf = new HiveConf();
>     conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, MockPartitionExpressionProxy.class.getName());
>     ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>     FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS];
>     for (int i=0; i<NUM_THREADS; i++) {
>       final int n = i;
>       tasks[i] = new FutureTask<Void>(new Callable<Void>() {
>         @Override
>         public Void call() throws Exception {
>           ObjectStore store = new ObjectStore();
>           store.setConf(conf);
>           NotificationEvent dbEvent =
>               new NotificationEvent(0, 0, EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>           System.out.println("ADDING NOTIFICATION");
>           countIn.countDown();
>           countOut.await();
>           store.addNotificationEvent(dbEvent);
>           System.out.println("FINISH NOTIFICATION");
>           return null;
>         }
>       });
>       executorService.execute(tasks[i]);
>     }
>     countIn.await();
>     countOut.countDown();
>     for (int i = 0; i < NUM_THREADS; ++i) {
>       tasks[i].get();
>     }
>     NotificationEventResponse eventResponse = objectStore.getNextNotification(new NotificationEventRequest());
>     Assert.assertEquals(2, eventResponse.getEventsSize());
>     Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>     // This fails because the next notification has an event ID = 1
>     Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
>   }
> {noformat}
> The last assertion fails because the second notification is persisted with event ID 1 instead of the expected 2.
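
For illustration, here is a simplified, hypothetical stand-in for the read-increment-write
sequence described in the issue above (this is not the actual ObjectStore code); with two
concurrent writers it typically issues the same event ID twice:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EventIdRaceSketch {

  // Stand-ins for the sequence row and the notification log in the backing database.
  static long storedNextId = 1;
  static final List<Long> issuedIds = new ArrayList<>();

  static synchronized long readStoredId()             { return storedNextId; }
  static synchronized void writeStoredId(long id)     { storedNextId = id; }
  static synchronized void recordEvent(long eventId)  { issuedIds.add(eventId); }
  static synchronized List<Long> issuedIdsSnapshot()  { return new ArrayList<>(issuedIds); }

  // The racy sequence: read the ID, increment it locally, write it back, use it.
  static void addNotificationEvent() throws InterruptedException {
    long id = readStoredId();     // two servers can both read 1 here
    Thread.sleep(10);             // widen the race window for the demo
    writeStoredId(id + 1);        // both write back 2
    recordEvent(id);              // both persist an event with ID 1
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    for (int i = 0; i < 2; i++) {
      pool.submit(() -> { addNotificationEvent(); return null; });
    }
    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);
    System.out.println("issued event IDs: " + issuedIdsSnapshot());  // typically [1, 1]
  }
}
{code}

Each individual operation is safe on its own; the problem is that the read and the write-back
are not atomic, which is exactly what the test case above reproduces with two ObjectStore
instances calling addNotificationEvent() concurrently.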



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
