hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10899) Add functionality to re-encrypt EDEKs.
Date Fri, 04 Nov 2016 00:54:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634823#comment-15634823

Andrew Wang commented on HDFS-10899:

Thanks for working on this Xiao! I didn't give the whole patch a deep look, but some review
comments to start it off. I didn't get to the KMS changes, it'd help to split those into a
separate patch to make review easier.

Comments as follows:

* getShortUsage, are -path / -cancel / -verify exclusive operations? We should indicate this
in the usage text if so. {{man git}} for instance looks like:

usage: git [--version] [--help] [-C <path>] [-c name=value]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p | --paginate | --no-pager] [--no-replace-objects] [--bare]
           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
           <command> [<args>]

So {{<command> <path>}} would work here, with the long usage explaining the commands.

* We have new functionality to specify time with units in configuration, want to use it here?
* New keys should be documented in hdfs-default.xml also


* Some unused imports in I'm sure checkstyle will complain about
* Typo: "filesReencryptedInCurentZone" -> "Current"
* Can the new methods in EZManager be encapsulated as a class? Could use some javadoc about
expected behavior also.
* Empty {{finally}} case in the run method, can be deleted?
* Thread should check if the NN is active or standby, typically we do this in FSNamesystem#startActiveServices
and FSNamesystem#stopActiveServices
* reencryptDir looks like it takes the writeLock the entire time while it's processing the
files in a directory, including when talking to the KMS. We should not hold the lock while
talking to the KMS.
* reencryptDir is also a big method, refactor the innards of the for loop?
* We also should do a logSync before we log the "Completed" message in {{reencryptEncryptionZoneInt}},
so we block first. Maybe elsewhere too.
* Calling {{getListing}} will generate audit logs, we should be calling {{getListingInt}}
instead. Also, we don't need a full {{HdfsFileStatus}} to reencrypt each file, so let's consider
an INode-based traversal.
* Structuring this as an iterator (and maybe also a visitor) would make the code more clear.
* Recommend we rename {{updateReencryptStatus}} so it's clear that this is used during edit
log / fsimage loading, or at least add some javadoc.

* Recommend we name ReencryptInfo something like "PendingReencryptionZones" or something to
be more self documenting. The javadoc also could mention that this is basically a queue.
* A small warning that {{hasZone}} will be {{O(n)}} if the JDK implementation is truly a queue,
maybe we should use a LinkedHashMap instead? I don't think we use the deque nature.
* How is the reencryptInfo in EZManager synchronized? I ask because it's a ConcurrentLinkedDeque,
is there actually concurrency happening?

> Add functionality to re-encrypt EDEKs.
> --------------------------------------
>                 Key: HDFS-10899
>                 URL: https://issues.apache.org/jira/browse/HDFS-10899
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: encryption, kms
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-10899.wip.patch, Re-encrypt edek design doc.pdf
> Currently when an encryption zone (EZ) key is rotated, it only takes effect on new EDEKs.
We should provide a way to re-encrypt EDEKs after the EZ key rotation, for improved security.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

View raw message