hive-issues mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20644) Avoid exposing sensitive infomation through a Hive Runtime exception
Date Wed, 03 Oct 2018 20:05:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637463#comment-16637463 ]

Thejas M Nair commented on HIVE-20644:
--------------------------------------

For completeness, maybe make a similar update in ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
as well?

Also, the DEBUG level is often used for debugging a variety of problems, and people can set the
entire HS2 logging to that level. Production systems are sometimes put in DEBUG level logging
for some time while troubleshooting an issue. AFAIK, debug level messages in HS2 would be sent
to beeline if HS2 is running in debug mode and the user has set hive.server2.logging.operation.level=verbose.
(Since this error happens in the tasks, it's possible that it doesn't get sent currently.
But again, that can change in the future.)

I think it's unusual to have production systems running with TRACE level logs, so maybe log
the rowString part only at TRACE level? I would expect only specific classes to be
enabled for TRACE level logging, so that would be a safer option IMO.
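
To illustrate the suggestion, here is a minimal sketch, not the actual patch. It assumes a
hypothetical class RowErrorSketch and a hypothetical RowProcessor interface, and it uses
java.util.logging (where FINEST stands in for TRACE) only to stay dependency-free; Hive itself
logs through SLF4J. The exact exception message text is also an assumption. The row string is
built and logged only when the finest level is enabled, and the exception message never contains it.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class RowErrorSketch {
    private static final Logger LOG = Logger.getLogger(RowErrorSketch.class.getName());

    /** Hypothetical stand-in for the per-row work that can fail. */
    interface RowProcessor {
        void process(Object row) throws Exception;
    }

    /** Exception message text (assumed wording); deliberately free of row contents. */
    static String safeMessage() {
        return "Hive Runtime Error while processing row";
    }

    static void processRow(Object row, RowProcessor processor) {
        try {
            processor.process(row);
        } catch (Exception e) {
            // Build the row string only if the finest (TRACE-like) level is enabled,
            // so row data stays out of INFO and DEBUG output.
            if (LOG.isLoggable(Level.FINEST)) {
                LOG.log(Level.FINEST, "Failed row: " + row);
            }
            // The message carries no row data; the original failure is kept as the cause.
            throw new RuntimeException(safeMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            processRow("ssn=123-45-6789", r -> { throw new IllegalStateException("boom"); });
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, an operator would have to raise the level for this one class explicitly before
any row contents reach the logs.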

> Avoid exposing sensitive infomation through a Hive Runtime exception
> --------------------------------------------------------------------
>
>                 Key: HIVE-20644
>                 URL: https://issues.apache.org/jira/browse/HIVE-20644
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.1.0
>
>         Attachments: HIVE-20644.01, HIVE-20644.02, HIVE-20644.03
>
>
> The HiveException raised from the following methods exposes the data row that caused
> the runtime exception:
>  # ReduceRecordSource::GroupIterator::next() - around line 372
>  # MapOperator::process() - around line 567
>  # ExecReducer::reduce() - around line 243
> In all these cases, a string representation of the row is constructed on the fly and
> included in the error message.
> VectorMapOperator::process() (around line 973) raises the same exception, but it does not
> expose the row, since the row contents are not included in the error message.
> While trying to reproduce the above error, I also found that the arguments to a UDF are
> exposed in log messages from FunctionRegistry::invoke(), around line 1114. This too can leak
> sensitive information through an error message.
> In this way, sensitive information that may not otherwise be available to the user is leaked
> through the exception message. It is therefore a kind of security breach or violation of
> access control.
> The contents of the row or the arguments to a function may be useful for debugging, so
> it is worth adding them to the logs. The proposal here is to log a separate message at
> log level DEBUG or INFO containing the string representation of the row. Users can configure
> their logging so that DEBUG/INFO messages do not go to the client but are still
> available in the Hive server logs for debugging. The actual exception message will not contain
> any sensitive data such as row data or argument data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
