hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-14705) Add batched reencryptEncryptedKey interface to KMS
Date Tue, 15 Aug 2017 10:55:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127075#comment-16127075
] 

Wei-Chiu Chuang edited comment on HADOOP-14705 at 8/15/17 10:54 AM:
--------------------------------------------------------------------

Hi [~xiaochen] thanks a lot for taking up this work. I reviewed the patch and have a few comments
listed below:
{code:title=KMSClientProvider#reencryptEncryptedKeys}
final List jsonPayload = new ArrayList();
{code}
should be final List<Map> jsonPayload = new ArrayList<Map>();

{code}
if (keyName == null) {
  checkNotNull(ekv.getEncryptionKeyName(), "ekv.getEncryptionKeyName");
{code}
The check is redundant
{code}
jsonPayload.add(KMSUtil.toJSON(ekv, ekv.getEncryptionKeyName()));
{code}
if all ekv are the same, wouldn’t it be more efficient to optimize it somehow?
{code}
final List<Map> response =
    call(conn, jsonPayload, HttpURLConnection.HTTP_OK,
List.class);
{code}
should the last parameter be Map.class?

Question: is there a practical size limit for a KMS request?


{code:title=TestKMS}
fail("Should not be able to reencryptEncryptedKeys");
{code}
—> grammatical error: Should not have been

{code:title=KMS}
kmsAudit.ok(user, KMSOp.REENCRYPT_EEK_BATCH, name, "");
{code}
I wonder if it makes sense to log the size of the batch in extraMsg.

{code:title=KMS}
if (LOG.isDebugEnabled()) {
  LOG.debug("reencryptEncryptedKeys {} keys for key {} took
{}",
      jsonPayload.size(), name, sw.stop());
}
{code}
It looks like a bad practice (for fear of resource leakage) to me that the StopWatch is only
stopped if debug log is enabled. 
Also, does it return time in milliseconds? Can you add the time unit into log message as well?

This is irrelevant to this patch, but there are a number of places in KMS where Object references
are used unnecessarily:
{code}
Object retJSON;
…
retJSON = new ArrayList();
for (EncryptedKeyVersion edek : retEdeks) {
  ((ArrayList)
retJSON).add(KMSUtil.toJSON(edek));
}

{code}



was (Author: jojochuang):
Hi [~xiaochen] thanks a lot for taking up this work. I reviewed the patch and have a few comments
listed below:
{code:title=KMSClientProvider#reencryptEncryptedKeys}
final List jsonPayload = new ArrayList();
{code}
should be final List<Map> jsonPayload = new ArrayList<Map>();

{code}
if (keyName == null) {
  checkNotNull(ekv.getEncryptionKeyName(), "ekv.getEncryptionKeyName");
{code}
The check is redundant
{code}
jsonPayload.add(KMSUtil.toJSON(ekv, ekv.getEncryptionKeyName()));
{code}
if all env are the same, wouldn’t it be more efficient to optimize it somehow?
{code}
final List<Map> response =
    call(conn, jsonPayload, HttpURLConnection.HTTP_OK,
List.class);
{code}
the type List should be Map.class

Question: is there a practical size limit for a KMS request?


{code:title=TestKMS}
fail("Should not be able to reencryptEncryptedKeys");
{code}
—> grammatical error: Should not have been

{code:title=KMS}
kmsAudit.ok(user, KMSOp.REENCRYPT_EEK_BATCH, name, "");
{code}
I wonder if it makes sense to log the size of the batch in extraMsg.

{code:title=KMS}
if (LOG.isDebugEnabled()) {
  LOG.debug("reencryptEncryptedKeys {} keys for key {} took
{}",
      jsonPayload.size(), name, sw.stop());
}
{code}
It looks like a bad practice (for fear of resource leakage) to me that the StopWatch is only
stopped if debug log is enabled. 
Also, does it return time in milliseconds? Can you add the time unit into log message as well?

This is irrelevant to this patch, but there are a number of places in KMS where Object references
are used unnecessarily:
{code}
Object retJSON;
…
retJSON = new ArrayList();
for (EncryptedKeyVersion edek : retEdeks) {
  ((ArrayList)
retJSON).add(KMSUtil.toJSON(edek));
}

{code}


> Add batched reencryptEncryptedKey interface to KMS
> --------------------------------------------------
>
>                 Key: HADOOP-14705
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14705
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: kms
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HADOOP-14705.01.patch, HADOOP-14705.02.patch, HADOOP-14705.03.patch
>
>
> HADOOP-13827 already enabled the KMS to re-encrypt a {{EncryptedKeyVersion}}.
> As the performance results of HDFS-10899 turns out, communication overhead with the KMS
occupies the majority of the time. So this jira proposes to add a batched interface to re-encrypt
multiple EDEKs in 1 call.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message