hadoop-hive-dev mailing list archives

From "Prasad Chakka (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1219) More robust handling of metastore connection failures
Date Thu, 18 Mar 2010 18:26:28 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847051#action_12847051 ]

Prasad Chakka commented on HIVE-1219:
-------------------------------------

@paul, it makes sense to add the separator to the constants. Otherwise people are bound to make mistakes.

ObjectStore.java
# Good idea to add a lock. But only the first thread that encounters a problem with its PersistenceManager
object should try to recreate the PersistenceManagerFactory, not the subsequent threads. Maybe
you can create a new PMF only when the PM's reference to the PMF matches the current PMF, and
not otherwise. Though I doubt how much of a real problem this will be for the metastore server.
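
To illustrate the check suggested above, here is a minimal sketch, not the patch's code. FactoryHolder and recreateIfStale are hypothetical names, and the type parameter F stands in for PersistenceManagerFactory so the example runs without JDO on the classpath. In real code, the factory a failing thread "saw" could be obtained from its manager via PersistenceManager.getPersistenceManagerFactory().

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical holder; F stands in for PersistenceManagerFactory. */
final class FactoryHolder<F> {
    private final Lock lock = new ReentrantLock();
    private final Supplier<F> creator; // assumed way of building a fresh factory
    private F current;

    FactoryHolder(Supplier<F> creator) {
        this.creator = creator;
        this.current = creator.get();
    }

    /** Returns the factory currently in use. */
    F get() {
        lock.lock();
        try {
            return current;
        } finally {
            lock.unlock();
        }
    }

    /**
     * Called by a thread whose manager hit a connection problem.
     * 'seen' is the factory that thread's manager was created from.
     * A new factory is built only if 'seen' is still the current one;
     * otherwise another thread already replaced it and we reuse theirs.
     */
    F recreateIfStale(F seen) {
        lock.lock();
        try {
            if (current == seen) {
                current = creator.get();
            }
            return current;
        } finally {
            lock.unlock();
        }
    }
}
```

With this shape, a second thread that failed against the same old factory finds that its reference no longer matches and simply picks up the replacement instead of building another one.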

HiveMetastore.java
# updateConnectionURL() should throw the exception back instead of just logging it and returning
false. Also, there is no need to log when rethrowing the exception. The caller would log it anyway
if needed (in initConnectionUrlHook).
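
A rough sketch of the shape I have in mind, not the actual HiveMetastore.java code. ConnectionUrlSketch, UrlSource, and initConnectionUrl are hypothetical names, and the lookup is assumed to fail with an unchecked exception to keep the example short.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class ConnectionUrlSketch {
    /** Hypothetical stand-in for whatever supplies the connection URL. */
    interface UrlSource {
        String lookup();
    }

    private static final Logger LOG = Logger.getLogger("ConnectionUrlSketch");
    private String url;

    ConnectionUrlSketch(String initialUrl) {
        this.url = initialUrl;
    }

    String url() {
        return url;
    }

    /**
     * Returns true if the URL changed. A failed lookup propagates to the
     * caller: no catch, no log, no "return false" hiding the error.
     */
    boolean updateConnectionUrl(UrlSource source) {
        String next = source.lookup();
        if (next == null || next.equals(url)) {
            return false;
        }
        url = next;
        return true;
    }

    /** The caller is the single place that decides whether to log. */
    boolean initConnectionUrl(UrlSource source) {
        try {
            updateConnectionUrl(source);
            return true;
        } catch (RuntimeException e) {
            LOG.log(Level.WARNING, "Could not refresh the connection URL", e);
            return false;
        }
    }
}
```

This way a return value of false only ever means "URL unchanged", and each failure is logged at most once.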

Do you reload hive conf on every retry?


Otherwise the patch looks good.

> More robust handling of metastore connection failures
> -----------------------------------------------------
>
>                 Key: HIVE-1219
>                 URL: https://issues.apache.org/jira/browse/HIVE-1219
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Paul Yang
>            Assignee: Paul Yang
>             Fix For: 0.6.0
>
>         Attachments: HIVE-1219.1.patch, HIVE-1219.2.patch, HIVE-1219.3.patch, HIVE-1219.4.patch, HIVE-1219.5.patch
>
>
> Currently, if the metastore's connection to the datastore is broken, the query fails and an exception such as the following is thrown:
> {code}
> 2010-01-28 11:50:20,885 ERROR exec.MoveTask (SessionState.java:printError(248)) - Failed with exception Unable to fetch table tmp_table
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table tmp_table
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:362)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:333)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:112)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:99)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:64)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:582)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:462)
> at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:324)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:200)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:256)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: javax.jdo.JDODataStoreException: Communications link failure
> Last packet sent to the server was 1 ms ago.
> NestedThrowables:
> com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
> Last packet sent to the server was 1 ms ago.
> at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:289)
> {code}
> In order to reduce the impact of transient network issues and momentarily unavailable datastores, two possible improvements are:
> 1. Retrying the metastore command in case of connection failure before propagating up the exception.
> 2. Retrieving the datastore hostname / connection URL through the use of an extension. This extension would be useful in the case where a remote service maintained the location of the currently available datastore. In case of hostname changes or failovers to a backup datastore, the extension would allow hive clients to run without manual intervention.
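
The two improvements listed in the description can be combined roughly as follows. This is only a sketch under stated assumptions, not the attached patch: RetrySketch, withRetries, and the beforeRetry callback are hypothetical names, the callback is where a URL lookup extension might be consulted, and every RuntimeException is treated as retryable for brevity (JDO exceptions such as JDODataStoreException are unchecked). A real implementation would presumably retry only on connection-type failures.

```java
import java.util.function.Supplier;

final class RetrySketch {
    /**
     * Runs 'command' up to maxAttempts times. Before each retry,
     * 'beforeRetry' runs (for example, to re-resolve the datastore URL),
     * then the thread waits delayMs. The last failure is propagated
     * only after all attempts are exhausted.
     */
    static <T> T withRetries(Supplier<T> command, int maxAttempts,
                             long delayMs, Runnable beforeRetry) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return command.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    beforeRetry.run();
                    sleepQuietly(delayMs);
                }
            }
        }
        throw last;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            // Preserve the interrupt flag and stop retrying.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }
}
```

A caller would wrap a single metastore command, for example a table lookup, in withRetries and pass the URL refresh as beforeRetry, so a failover to a backup datastore is picked up on the next attempt.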

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

