hive-issues mailing list archives

From "Chris Drome (JIRA)" <>
Subject [jira] [Commented] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
Date Tue, 31 Oct 2017 18:46:00 GMT


Chris Drome commented on HIVE-17853:

[~vihangk1], as per the description, consider the case of the Oozie super-user {{oozie}}
impersonating a different user {{mithun}}. The {{oozie}} user creates a client and opens the
connection to the metastore inside a doAs block, which means that all operations during this
session are performed as {{mithun}}.

A retry/reconnect can occur if the read timeout for an operation is exceeded or the lifetime
of the connection is exceeded. At this point, {{close}} is called explicitly, followed by
a call to {{open}} to establish a new connection. However, the reconnect call is not being
performed in a doAs context, so it will create a new connection to the metastore as {{oozie}}.
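
The sequence above can be sketched without any Hadoop dependency. This is an illustration
only, not the attached patch: {{FakeUgi}} and {{FakeClient}} are hypothetical stand-ins for
{{UserGroupInformation}} and the metastore client, the login user is hard-coded to {{oozie}},
and {{reconnectAs}} shows just one plausible shape of a fix (capture the identity when the
connection is first opened and re-enter it on reconnect).

```java
import java.util.function.Supplier;

public class ImpersonationSketch {

    // Hypothetical stand-in for UserGroupInformation: tracks the effective user per thread.
    // Assumption: the login user is "oozie" unless a doAs block is active.
    static final class FakeUgi {
        private static final ThreadLocal<String> CURRENT = ThreadLocal.withInitial(() -> "oozie");
        final String user;

        FakeUgi(String user) { this.user = user; }

        static FakeUgi getCurrentUser() { return new FakeUgi(CURRENT.get()); }

        // Runs the action with this user as the effective user, then restores the previous one.
        <T> T doAs(Supplier<T> action) {
            String previous = CURRENT.get();
            CURRENT.set(user);
            try {
                return action.get();
            } finally {
                CURRENT.set(previous);
            }
        }
    }

    // Hypothetical stand-in for a metastore client: records which user opened the connection.
    static final class FakeClient {
        String connectedAs;

        void open() { connectedAs = FakeUgi.getCurrentUser().user; }

        void close() { connectedAs = null; }
    }

    // The problematic path: close() followed by open(), with no doAs around it.
    static String naiveReconnect(FakeClient client) {
        client.close();
        client.open();
        return client.connectedAs;
    }

    // One possible remedy: reconnect inside doAs of the identity captured at first open.
    static String reconnectAs(FakeClient client, FakeUgi captured) {
        return captured.doAs(() -> {
            client.close();
            client.open();
            return client.connectedAs;
        });
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        // oozie opens the connection while impersonating mithun and captures that identity.
        FakeUgi captured = new FakeUgi("mithun").doAs(() -> {
            client.open();
            return FakeUgi.getCurrentUser();
        });
        System.out.println(client.connectedAs);            // mithun
        System.out.println(naiveReconnect(client));        // oozie
        System.out.println(reconnectAs(client, captured)); // mithun
    }
}
```

The second line of output is the bug: after the bare reconnect, the connection belongs to
the login user rather than the impersonated one.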

There is no specific stack trace to attach here, as it depends on the operations executed after
the reconnect; the problem typically manifests as a failure caused by insufficient privileges.
In the worst case, if {{oozie}} has more privileges than {{mithun}}, it will successfully perform
operations that {{mithun}} is not allowed to perform.

According to the API, fetching the UserGroupInformation object can throw an IOException. I'm
not familiar with the cases under which this would occur. However, I didn't want to fail immediately,
because if the connection was initially established within a doAs, the calling code should
have been able to establish a proper identity. So I let as much work get accomplished until
the reconnect fails, which shouldn't be a problem, because most metastore sessions are not

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
> ---------------------------------------------------------------------------------------
>                 Key: HIVE-17853
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 3.0.0, 2.4.0, 2.2.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Chris Drome
>            Priority: Critical
>         Attachments: HIVE-17853.01-branch-2.2.patch, HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the Hive metastore,
> after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating a Hadoop
> user {{mithun}}, to run a workflow), in case of timeout, we find that the reconnect causes
> the {{UGI.doAs()}} context to be lost. Any further metastore operations will be attempted
> as the login-user ({{oozie}}), as opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.

This message was sent by Atlassian JIRA
