hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10603) Crypto input and output streams implementing Hadoop stream interfaces
Date Wed, 21 May 2014 01:29:39 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004190#comment-14004190
] 

Andrew Wang commented on HADOOP-10603:
--------------------------------------

Hi Yi, thanks for working on this code. I have some review comments based on a slightly old
version of this patch. Throughout, if I've correctly identified a correctness issue, let's
have a corresponding unit test if possible. I'd personally also rather not split the implementation
and tests into separate JIRAs.

Nitty stuff:
* Need class javadoc and interface annotations on all new classes
* Need "<p/>" to actually line break in javadoc
* Some tab characters present

CryptoUtils
* s/mod/mode
* What does "calIV" mean? Javadoc here would be nice.
* calIV would be simpler if we used ByteBuffer.wrap and {{getLong}}. I think right now, we
also need to cast each value to a {{long}} before shifting, else it only works up to an int.
Would be good to unit test this function.

Encryptor/JCEAESCTREncryptor:
* Could you define the term "block" in the {{#encrypt}} javadoc?
* I don't understand the reinit conditions, do you mind explaining this a bit? The javadoc
for {{Cipher#update}} indicates that it always fully reads the input buffer, so is the issue
that the cipher sometimes doesn't flush all the input to the output buffer?
* If this API only accepts direct ByteBuffers, we should Precondition check that in the implementation
* Javadoc for {{encrypt}} should link to {{javax.crypto.ShortBufferException}}, not {{#ShortBufferException}}.
I also don't see this being thrown because we wrap everything in an IOException.

CryptoOutputStream:
* How was the default buffer size of 8KB chosen? This should probably be a new configuration
parameter, or respect io.file.buffer.size.
* Potential for int overflow in {{#write}} where we check {{off+len < 0}}. I also find
this if statement hard to parse, would prefer if it were expanded.
* Is the {{16}} in updateEncryptor something that should be hard-coded? Maybe pull it out
into a constant and javadoc why it's 16. I'm curious if this is dependent on the Encryptor
implementation.
* We need to be careful with direct BBs, since they don't trigger GC. We should be freeing
them manually when the stream is closed, or pooling them somehow for reuse.
* In {{#process}}, we flip the inBuf, then if there's no data we just return. Shouldn't we
restore inBuf to its previous padded state first? Also, IIUC {{inBuffer.remaining()}} cannot
be less than padding since the inBuffer position does not move backwards, so I'd prefer to
see a Precondition check and {{inBuf.remaining() == padding)}}. Test case would be nice if
I'm right about this.
* Rename {{#process}} to {{#encrypt}}?
* Do we need the special-case logic with tmpBuf? It looks like outBuffer is always direct.
* Do we need to update padding when we do a flush?
* Also in {{#flush}}, s/encryption/encrypt
* oneByte can be final
* If you have extra time, an ASCII art diagram showing how {{padding}} and the stream offset
works would also be nice. Javadoc for the special padding handling would be nice.
* Can make class-private methods private
* Should {{close()}} also close the underlying stream?

CryptoInputStream:
* Many of the OutputStream comments apply here too: hardcoded 8KB buffer size, int overflow
for {{off+len}}, hardcoded 16, handling of padding in {{process}}, freeing direct buffers,
private methods, closing underlying stream, etc.
* Do we have tests for wrapping both ByteBufferReadable and not streams?
* Rename {{process()}} to {{decrypt()}}?
* In {{process}}, the catch/throw there doesn't seem useful, since everything that throws
already throws an IOException.
* Positioned read and readFully, doing a seek in the finally will not work on a non-Seekable
stream. There's also no need to catch {{ClassCastException}} since it's already handled in
{{seek}}.
* {{readFully(long, byte[])}} should just delegate directly to the other readFully method,
it doesn't need to do anything else.
* updateDecryptor doesn't seem to need the {{long offset}} parameter since it's always passed
{{streamOffset}}.
* We need to return -1 on EOF for zero-byte reads, see HDFS-5762.
* Comment in {{skip}} about why we subtract then add {{outBuffer.remaining()}} would be good.
* Some empty {{throw new UnsupportedOperationException()}} could use text

Decryptor:
* s/if initialize fails/if initialization fails/

> Crypto input and output streams implementing Hadoop stream interfaces
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-10603
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10603
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: security
>    Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>            Reporter: Alejandro Abdelnur
>            Assignee: Yi Liu
>             Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
>         Attachments: HADOOP-10603.1.patch, HADOOP-10603.2.patch, HADOOP-10603.3.patch,
HADOOP-10603.4.patch, HADOOP-10603.5.patch, HADOOP-10603.6.patch, HADOOP-10603.7.patch, HADOOP-10603.8.patch,
HADOOP-10603.patch
>
>
> A common set of Crypto Input/Output streams. They would be used by CryptoFileSystem,
HDFS encryption, MapReduce intermediate data and spills. Note we cannot use the JDK Cipher
Input/Output streams directly because we need to support the additional interfaces that the
Hadoop FileSystem streams implement (Seekable, PositionedReadable, ByteBufferReadable, HasFileDescriptor,
CanSetDropBehind, CanSetReadahead, HasEnhancedByteBufferAccess, Syncable, CanSetDropBehind).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message