Date: Mon, 10 Aug 2015 20:46:45 +0000 (UTC)
From: "Andrew Wang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-8747) Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption Zones


    [ https://issues.apache.org/jira/browse/HDFS-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680736#comment-14680736 ]

Andrew Wang commented on HDFS-8747:
-----------------------------------

I was thinking the root zone is created by an admin when / is empty, just like any other encryption zone. This means the admin needs to do this up front when the cluster is freshly formatted.

> Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption Zones
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-8747
>                 URL: https://issues.apache.org/jira/browse/HDFS-8747
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption
>    Affects Versions: 2.6.0
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: HDFS-8747-07092015.pdf, HDFS-8747-07152015.pdf, HDFS-8747-07292015.pdf
>
>
> HDFS Transparent Data Encryption At-Rest was introduced in Hadoop 2.6 to allow creating an encryption zone on top of a single HDFS directory. Files under the root directory of the encryption zone are encrypted/decrypted transparently upon HDFS client write or read operations.
> Generally, it does not support rename (without data copying) across encryption zones or between an encryption zone and a non-encryption zone because of the different security settings of encryption zones. However, there are certain use cases where efficient rename support is desired. This JIRA proposes better support for two such use cases, “Scratch Space” (a.k.a. staging area) and “Soft Delete” (a.k.a. trash), with HDFS encryption zones.
> “Scratch Space” is widely used in Hadoop jobs, which require efficient rename support. Temporary files from MR jobs are usually stored in a staging area outside the encryption zone, such as the “/tmp” directory, and then renamed to the specified target directories once the data is ready to be further processed.
> Below is a summary of supported/unsupported cases in the latest Hadoop:
> * Rename within the encryption zone is supported.
> * Rename of the entire encryption zone by moving the root directory of the zone is allowed.
> * Rename of a sub-directory/file from an encryption zone to a non-encryption zone is not allowed.
> * Rename of a sub-directory/file from encryption zone A to encryption zone B is not allowed.
> * Rename from non-encryption zone to encryption zone is not allowed.
> “Soft Delete” (a.k.a. trash) is a client-side feature that helps prevent accidental deletion of files and directories. If trash is enabled and a file or directory is deleted using the Hadoop shell, the file is moved to the .Trash directory in the user's home directory instead of being deleted. Deleted files are initially moved (renamed) to the Current sub-directory of the .Trash directory with the original path preserved. Files and directories in the trash can be restored simply by moving them to a location outside the .Trash directory.
> Due to the limited rename support, deleting a sub-directory/file within an encryption zone with the trash feature enabled is not allowed. The client has to use the -skipTrash option to work around this. HADOOP-10902 and HDFS-6767 improved the error message but did not provide a complete solution to the problem.
> We propose to solve the problem by generalizing the mapping between an encryption zone and its underlying HDFS directories from 1:1 today to 1:N. The encryption zone should allow non-overlapping directories, such as scratch space or soft delete "trash" locations, to be added/removed dynamically after creation. This way, rename for "scratch space" and "soft delete" can be better supported without breaking the assumption that rename is only supported "within the zone".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
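
For readers who want to check the rename rules listed in the issue description against the proposed 1:N mapping, here is a minimal sketch in Python. It is only a toy model, not HDFS code: the names Zone, zone_of, is_zone_root and rename_allowed, and all paths, are hypothetical. It also assumes that moving a zone's root directory is only allowed when the destination lies outside every zone, which the description does not state.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List, Optional


@dataclass
class Zone:
    """Hypothetical model of an encryption zone backed by one or more directories."""
    name: str
    roots: List[str] = field(default_factory=list)


def zone_of(path: str, zones: List[Zone]) -> Optional[Zone]:
    """Return the zone whose root equals or contains `path`, or None."""
    p = PurePosixPath(path)
    for zone in zones:
        for root in zone.roots:
            r = PurePosixPath(root)
            if p == r or r in p.parents:
                return zone
    return None


def is_zone_root(path: str, zones: List[Zone]) -> bool:
    """True if `path` is exactly the root directory of some zone."""
    p = PurePosixPath(path)
    return any(p == PurePosixPath(r) for z in zones for r in z.roots)


def rename_allowed(src: str, dst: str, zones: List[Zone]) -> bool:
    """Apply the rename rules summarized in the issue description."""
    if is_zone_root(src, zones):
        # Moving a zone root relocates the whole zone. Assumption of this
        # sketch: the destination must not lie inside any zone.
        return zone_of(dst, zones) is None
    # Otherwise source and destination must be in the same zone,
    # or both outside every zone.
    return zone_of(src, zones) is zone_of(dst, zones)


# Today's 1:1 mapping: each zone has exactly one directory.
zones_today = [Zone("A", ["/data/a"]), Zone("B", ["/data/b"])]

# Proposed 1:N mapping: zone A also covers a scratch and a trash directory.
zones_1n = [
    Zone("A", ["/data/a", "/tmp/scratch-a", "/user/alice/.Trash"]),
    Zone("B", ["/data/b"]),
]
```

With zones_today, rename_allowed("/tmp/scratch-a/part-0", "/data/a/part-0", zones_today) is False, matching the "non-encryption zone to encryption zone" case. With zones_1n the same call is True, because both paths now resolve to zone A, so the "within the zone" rule itself never changes.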