hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12574) Add CryptoInputStream to WebHdfsFileSystem read call.
Date Thu, 14 Dec 2017 18:25:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16291301#comment-16291301
] 

Daryn Sharp commented on HDFS-12574:
------------------------------------

*Main issues*
* Signature of {{Hdfs#open}} can’t be changed.  It’s an incompatible to a stable api.
* {{CryptoProtocolVersion.valueOf}} should return the UNKNOWN value rather than null.
* Remove {{CryptoProtocolVersion.fromVersionId}} and call {{valueOf}} after making prior change.
 Namely, we can’t throw an exception that will cause {{JsonUtilClient#toFileStatus}} to
fail just because the client doesn’t understand a newer protocol version.  Otherwise you
can’t even list a directory.

*WebHdfsFileSystem*
* {{AbstractFsPathRunner}} isn’t the right place for storing feInfo.  It should be localized
to the read runner which is the only op that needs it.
* Too much redundancy.  {{createWrappedInputStream}} is virtually identical to the same method
in {{DFSInputStream}}.  {{decryptEncryptedDataEncryptionKey}} is virtually identical to {{DFSClient}}.
 Looks like a new method that takes a key provider, and feinfo, and an input stream would
suffice.
* Need to avoid the double/redundant file status call.   As evident by the changes to the
audit log and token tests.  No reason to call file status for feInfo and again for size.


*Minor*
* {{EZ_HEADER}} is in all caps which isn’t the standard for http headers.  It’s first
letter cap and usually “X-<NAMESPACE>”) so I’d call it something like “X-Hadoop-Accept-EZ”.
* {{MiniDFSCluster}} change to reset webhdfs server defaults is exposing internal implementation
details.  The cached server defaults should probably be in the {{ServletContext}} so it’s
not persisted.
* {{FileEncryptionInfo}} does it really need the equals/hashcode?  Not a big deal, just a
lot of gnarly code.

*Tests*
* {{TestEncryptionZonesWithKMS}} doesn’t seem like the best place to put all these webhdfs
tests.  It’s currently testing tokens and other stuff, not encrypt/decrypt.  Might make
sense to see if it can go into the “contact” tests if there is one for EZ.
* Need to test that seek/pread actually works.  Contract tests used by hdfs should cover that.
* Not sure why it’s decoding json itself instead of using the new json util methods you
added.
* {{verifyStreamsSame}}  should just use {{DFSTestUtil#readFileBuffer}} and {{assertArrayEquals}}
for comparison. Will make the next easier:
* {{verifyStreamsDifferent}} shouldn’t rely on individual bytes always being different in
the encrypted/unencrypted arrays.  Bulk array comparison avoids the possible yet unlikely
issue (think “flaky” test)




> Add CryptoInputStream to WebHdfsFileSystem read call.
> -----------------------------------------------------
>
>                 Key: HDFS-12574
>                 URL: https://issues.apache.org/jira/browse/HDFS-12574
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: encryption, kms, webhdfs
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-12574.001.patch, HDFS-12574.002.patch, HDFS-12574.003.patch,
HDFS-12574.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message