hadoop-hdfs-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-11357) Secure Delete
Date Sun, 22 Jan 2017 02:47:26 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HDFS-11357:
    Attachment: 0001-HDFS-secure-delete.patch

For now, attaching a proof of concept developed against a slightly patched version of branch-2.7.

The patch introduces a site-file configuration option that enables a global delete-with-overwrite
of block files. Today, when the NameNode asks DataNodes to delete blocks, they simply unlink
the block files. With the new configuration toggle enabled, the DataNodes overwrite each block
file with zeros or pseudorandom data before unlinking it.
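
To illustrate the overwrite-then-unlink idea, here is a minimal standalone sketch in Java. It is
not the code in the attached patch: the class name SecureDeleteSketch, the method names, the
64 KiB buffer size, and the use of SecureRandom as the pseudorandom source are all assumptions
made for illustration, and it operates on an ordinary local file rather than through any
DataNode API.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.SecureRandom;

/** Hypothetical illustration only; not taken from 0001-HDFS-secure-delete.patch. */
public class SecureDeleteSketch {
    // Assumed buffer size; a real implementation would likely make this tunable.
    private static final int BUFFER_SIZE = 64 * 1024;

    /** Overwrites the file in place, then unlinks it. */
    public static void overwriteAndUnlink(Path blockFile, boolean useRandom) throws IOException {
        overwrite(blockFile, useRandom);
        Files.delete(blockFile);
    }

    /** Overwrites every byte of the file with zeros or pseudorandom data, keeping its length. */
    public static void overwrite(Path blockFile, boolean useRandom) throws IOException {
        byte[] fill = new byte[BUFFER_SIZE]; // zero-filled by default
        SecureRandom random = useRandom ? new SecureRandom() : null;
        // WRITE without TRUNCATE_EXISTING: existing contents are overwritten in place.
        try (FileChannel channel = FileChannel.open(blockFile, StandardOpenOption.WRITE)) {
            long remaining = channel.size();
            channel.position(0);
            while (remaining > 0) {
                if (random != null) {
                    random.nextBytes(fill); // fresh pseudorandom bytes for each chunk
                }
                int chunk = (int) Math.min(fill.length, remaining);
                ByteBuffer buffer = ByteBuffer.wrap(fill, 0, chunk);
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                remaining -= chunk;
            }
            // Ask the OS to flush data and metadata to the device before the unlink.
            channel.force(true);
        }
    }
}
```

An in-place overwrite like this only helps when the underlying storage actually rewrites the
same physical sectors. On SSDs with wear leveling, or on copy-on-write or journaling file
systems, older copies of the data may survive the overwrite.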

> Secure Delete
> -------------
>                 Key: HDFS-11357
>                 URL: https://issues.apache.org/jira/browse/HDFS-11357
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>            Priority: Minor
>         Attachments: 0001-HDFS-secure-delete.patch
> Occasionally, for compliance or other legal/process reasons, it is necessary to attest
> that data has been deleted in such a way that it cannot be retrieved even through low-level
> forensics (for some reasonable definition of this that typically excludes the resources a
> state actor can bring to data recovery). HDFS at-rest encryption offers one way to achieve
> this, if the data keying strategy is highly granular: one simply "forgets" the key corresponding
> to a given set of files and the data becomes irretrievable. However, if HDFS at-rest encryption
> is not enabled or a fine-grained keying strategy is not possible, another simple strategy
> can be employed.
> The objective is to ensure that once a block is deleted, no trace of the data within the block
> remains on disk in unallocated regions, for all blocks, providing assurance that deleted data
> cannot be recovered at any time through reasonable effort, even with low-level access.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org