spark-issues mailing list archives

From "Mark Grover (JIRA)" <>
Subject [jira] [Commented] (SPARK-18535) Redact sensitive information from Spark logs and UI
Date Tue, 22 Nov 2016 00:30:58 GMT


Mark Grover commented on SPARK-18535:

I just issued a PR for this that adds a new customizable property for determining which configuration
properties are sensitive. Attached is an image from the UI with this change.
Here's the text in the YARN logs, with this change:
{{HADOOP_CREDSTORE_PASSWORD -> *********(redacted)}}

Here's the text in the event logs, with this change:

> Redact sensitive information from Spark logs and UI
> ---------------------------------------------------
>                 Key: SPARK-18535
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI, YARN
>    Affects Versions: 2.1.0
>            Reporter: Mark Grover
>         Attachments: redacted.png
> A Spark user may have to provide sensitive information for a Spark configuration property,
or source an environment variable in the executor or driver environment that contains
sensitive information. A good example is reading/writing data from/to S3
using Spark. The S3 secret and S3 access key can be placed in a [hadoop credential provider|].
However, one still needs to provide the password for the credential provider to Spark, which
is typically supplied as an environment variable to the driver and executor environments.
This environment variable shows up in logs, and may also show up in the UI.
> 1. For logs, it shows up in a few places:
>   1A. Event logs under {{SparkListenerEnvironmentUpdate}} event.
>   1B. YARN logs, when printing the executor launch context.
> 2. For UI, it would show up in the _Environment_ tab, but it is redacted if it contains
the word "password" or "secret". These magic words are [hardcoded|]
and hence not customizable.
> This JIRA is to track the work to make sure sensitive information is redacted from all
logs and UIs in Spark, while still being passed on to every place that
needs it.
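
To make the idea concrete, here is a minimal sketch in Python of key-based redaction with a customizable pattern. It is not the code from the PR: the names `DEFAULT_SENSITIVE_PATTERN`, `REDACTED_TEXT`, and `redact` are hypothetical, the default pattern simply mirrors the "password" and "secret" keywords mentioned above, and matching on the key (rather than the value) is an assumption. The point it illustrates is that only the rendered copy is masked, so the real values can still be passed on to the places that need them.

```python
import re

# Hypothetical default: mirrors the hardcoded "password"/"secret" keywords,
# matched case-insensitively against the key.
DEFAULT_SENSITIVE_PATTERN = r"(?i)secret|password"

# Placeholder text, copied from the YARN log sample in the comment above.
REDACTED_TEXT = "*********(redacted)"


def redact(pairs, pattern=DEFAULT_SENSITIVE_PATTERN):
    """Return a copy of `pairs` with values masked where the key matches `pattern`.

    The input mapping is left untouched, so callers can keep passing the
    real values along while only logging or rendering the redacted copy.
    """
    regex = re.compile(pattern)
    return {
        key: (REDACTED_TEXT if regex.search(key) else value)
        for key, value in pairs.items()
    }


# Example values are made up for illustration.
env = {"HADOOP_CREDSTORE_PASSWORD": "hunter2", "SPARK_HOME": "/opt/spark"}
for key, value in redact(env).items():
    print(f"{key} -> {value}")
```

Passing a different `pattern` (for example `r"(?i)token|credential"`) changes which keys are treated as sensitive, which is the kind of customization the hardcoded keyword list does not allow.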
