hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15557) CryptoInputStream can't handle concurrent access; inconsistent with HDFS
Date Tue, 26 Jun 2018 14:07:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523774#comment-16523774 ]

Steve Loughran commented on HADOOP-15557:

Yes, it's the forgivingness of HDFS which has led people to use it this way, even though the
java.io docs very much say "do not use concurrently".

> CryptoInputStream can't handle concurrent access; inconsistent with HDFS
> ------------------------------------------------------------------------
>                 Key: HADOOP-15557
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15557
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 3.2.0
>            Reporter: Todd Lipcon
>            Priority: Major
> In general, the non-positional read APIs for streams in Hadoop Common are meant to be
> used by only a single thread at a time. It would not make much sense to have concurrent
> multi-threaded access to seek+read because they modify the stream's file position.
> Multi-threaded access on input streams can be done using positional read APIs.
> Multi-threaded access on output streams probably never makes sense.
>
> In the case of DFSInputStream, the positional read APIs are marked synchronized, so that
> even when misused, no strange exceptions are thrown. The results are just somewhat
> undefined in that it's hard for a thread to know which position was read from. However,
> when running on an encrypted file system, the results are much worse: since
> CryptoInputStream's read methods are not marked synchronized, the caller can get strange
> ByteBuffer exceptions or even a JVM crash due to concurrent use and free of underlying
> OpenSSL Cipher buffers.
>
> The crypto stream wrappers should be made more resilient to such misuse, for example
> (a) making the read methods safer by making them synchronized (so they have the same
> behavior as DFSInputStream)
> or
> (b) trying to detect concurrent access to these methods and throwing
> ConcurrentModificationException so that the user is alerted to their probable misuse.
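
For illustration only, the two options in the description could be sketched against plain
java.io roughly as follows. This is not the patch for HADOOP-15557 and does not touch
CryptoInputStream itself: the class names SynchronizedReadStream and
ConcurrentAccessDetectingStream, and the enter()/exit() helpers, are hypothetical and exist
only in this sketch. A real fix would also have to cover the other stateful entry points
(seek, skip, ByteBuffer reads), which are left out here.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ConcurrentModificationException;
import java.util.concurrent.atomic.AtomicReference;

/** Option (a), hypothetical wrapper: serialize stateful reads on the wrapper's monitor. */
class SynchronizedReadStream extends FilterInputStream {
    SynchronizedReadStream(InputStream in) {
        super(in);
    }

    @Override
    public synchronized int read() throws IOException {
        return in.read();
    }

    @Override
    public synchronized int read(byte[] b, int off, int len) throws IOException {
        return in.read(b, off, len);
    }
}

/** Option (b), hypothetical wrapper: fail fast if a read starts while another is in progress. */
class ConcurrentAccessDetectingStream extends FilterInputStream {
    // Thread currently inside a read call, or null when the stream is idle.
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    ConcurrentAccessDetectingStream(InputStream in) {
        super(in);
    }

    /** Claims the stream for the calling thread, or throws if it is already claimed. */
    void enter() {
        if (!owner.compareAndSet(null, Thread.currentThread())) {
            throw new ConcurrentModificationException(
                "input stream is being used by more than one thread");
        }
    }

    /** Releases the claim taken by enter(). */
    void exit() {
        owner.set(null);
    }

    @Override
    public int read() throws IOException {
        enter();
        try {
            return in.read();
        } finally {
            exit();
        }
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        enter();
        try {
            return in.read(b, off, len);
        } finally {
            exit();
        }
    }
}
```

With option (a) a misbehaving caller still gets reads from an unpredictable position, but no
corrupted buffers. With option (b) the second thread gets an immediate
ConcurrentModificationException instead, which makes the misuse visible. The detection is
best-effort: it only fires when two calls actually overlap in time.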

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
