hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mohit Sabharwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14187) JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called
Date Thu, 07 Jul 2016 20:27:10 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366713#comment-15366713
] 

Mohit Sabharwal commented on HIVE-14187:
----------------------------------------

Thanks [~vgumashta]. I took a quick look at HIVE-7353 and had a question. Regarding the comment
"the threadpools keep a certain number of threads live and kill excess threads after a configurable
keepAliveTime expires ... ideally this is where we'd plug in code to close the JDOPersistanceManager"
-- this applies only to those threads where JDOPersistanceManager#close was not already called,
correct ? 

In the remote metastore case, deleteContext() will get executed when the underling transport
gets closed for that thread (in WorkerProcess Runnable run() method). 
https://github.com/apache/thrift/blob/master/lib/java/src/org/apache/thrift/server/TThreadPoolServer.java#L300
In this patch, we call cleanupRawStore inside deleteContext() (which calls JDOPersistanceManager#close)
(If the client explicitly already called IMetaStoreClient#close on that connection, then the
deleteContext() triggered cleanup will be a no-op.) 

I haven't looked at ThreadPoolExecutor code, but docs say "even core threads are initially
created and started only when new tasks arrive", 
i.e. threads won't get pre-created (only to be killed later after keepalive timeout), but
will get created when request arrives (which means
there will be a transport associated with it, which will trigger deleteContext() upon connection
break). IOW, thread goes back to the pool
only after the deleteContext is called. And subsequently may get killed after keepalive timeout.

So, unless someone manually kills the thread, it seems to me that deleteContext() covers the
cleanup. (deleteContext() call is in the finally
block so that covers errors/exceptions). What do you think ? 


> JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-14187
>                 URL: https://issues.apache.org/jira/browse/HIVE-14187
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>         Attachments: HIVE-14187.patch
>
>
> JDOPersistenceManager objects are cached in JDOPersistenceManagerFactory by DataNuclues.
> A new JDOPersistenceManager object gets created for every HMS thread since ObjectStore
is a thread local.
> In non-embedded metastore mode, JDOPersistenceManager associated with a thread only gets
cleaned up if IMetaStoreClient#close is called by the client (which calls ObjectStore#shutdown
which calls JDOPersistenceManager#close which in turn removes the object from cache in JDOPersistenceManagerFactory#releasePersistenceManager
> https://github.com/datanucleus/datanucleus-api-jdo/blob/master/src/main/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L1271),
i.e. the object will remain cached if client does not call close.
> For example: If one interrupts out of hive CLI shell (instead of using 'exit;' command),
SessionState#close does not get called, and hence IMetaStoreClient#close does not get called.
> Instead of relying the client to call close, it's cleaner to automatically perform RawStore
related cleanup at the server end via deleteContext() which gets called when the server detects
a lost/closed connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message