hadoop-mapreduce-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream
Date Tue, 23 Aug 2016 00:55:21 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431890#comment-15431890 ]

Hadoop QA commented on MAPREDUCE-6628:
--------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 46s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 29s {color} | {color:red} root: The patch generated 32 new + 232 unchanged - 4 fixed = 264 total (was 236) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s {color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core generated 0 new + 2496 unchanged - 12 fixed = 2496 total (was 2508) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 10s {color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 4s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 117m 9s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 166m 32s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12804278/MAPREDUCE-6628.006.patch |
| JIRA Issue | MAPREDUCE-6628 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 86f931470f41 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3ca4d6d |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6687/artifact/patchprocess/diff-checkstyle-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6687/testReport/ |
| modules | C: hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: . |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6687/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Potential memory leak in CryptoOutputStream
> -------------------------------------------
>
>                 Key: MAPREDUCE-6628
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.6.4
>            Reporter: Mariappan Asokan
>            Assignee: Mariappan Asokan
>         Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch, MAPREDUCE-6628.003.patch, MAPREDUCE-6628.004.patch, MAPREDUCE-6628.005.patch, MAPREDUCE-6628.006.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}. It allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that are freed when the {{close()}} method is called. Most of the time, {{close()}} is called. However, when writing to the intermediate Map output file or the spill files in {{MapTask}}, {{close()}} is never called, since doing so would close the underlying stream, which is not desirable. There is a single underlying physical stream that contains multiple logical streams, one per partition of Map output.
> By default, the amount of memory allocated per byte buffer is 128 KB, so the total memory allocated per stream is 256 KB. This may not sound like much. However, if the number of partitions (or number of reducers) is large (in the hundreds) and/or spill files are created in {{MapTask}}, this can grow to a few hundred MB.
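The arithmetic behind that estimate can be sketched as follows. This is only an illustration: the partition count (500) and spill count (3) are hypothetical values, and the sketch assumes that the buffers of every logical stream stay allocated at the same time because {{close()}} is never called on them.

```java
// Rough estimate of direct memory held by unclosed logical streams.
// Assumption: one logical stream per partition per spill, none of them closed.
class DirectBufferEstimate {
    static final long BUFFER_BYTES = 128L * 1024; // default size of one buffer, per the description
    static final int BUFFERS_PER_STREAM = 2;      // inBuffer and outBuffer

    static long unreleasedBytes(int partitions, int spills) {
        return (long) partitions * spills * BUFFERS_PER_STREAM * BUFFER_BYTES;
    }

    public static void main(String[] args) {
        // Hypothetical job: 500 partitions, 3 spill files.
        long bytes = unreleasedBytes(500, 3);
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints "375 MB"
    }
}
```

With a single pass over 500 partitions the same formula gives 125 MB, so the "few hundred MB" range is reached quickly once spills are involved.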
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious.  It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations.  In general it is best to allocate direct buffers only when they yield a measureable gain in program performance.
> {quote}
> It is not clear to me whether there is any benefit to allocating direct byte buffers in {{CryptoOutputStream.java}}. In fact, there is a slight CPU overhead in moving data from {{outBuffer}} to a temporary byte array, as in the following code from {{CryptoOutputStream.java}}:
> {code}
>     /*
>      * If underlying stream supports {@link ByteBuffer} write in future, needs
>      * refine here. 
>      */
>     final byte[] tmp = getTmpBuf();
>     outBuffer.get(tmp, 0, len);
>     out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in OS parlance), it is not clear whether it will yield any measurable performance gain.
> The fix would be to allocate a {{ByteBuffer}} on the heap for {{inBuffer}} and wrap a byte array in a {{ByteBuffer}} for {{outBuffer}}. By the way, {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} instances, as required by the {{encrypt()}} method in {{Encryptor}}.
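A minimal sketch of what Fix 1 might look like, under stated assumptions: the class {{HeapBufferSketch}}, the method {{writeThrough()}}, and the pass-through {{fakeEncrypt()}} are hypothetical stand-ins, not Hadoop code, and no real encryption is performed. It only shows that heap buffers still satisfy a {{ByteBuffer}}-based interface while letting the output bytes be written straight from the backing array, without a temporary copy.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration of heap-backed buffers; not the actual patch.
class HeapBufferSketch {
    static final int BUFFER_SIZE = 128 * 1024;

    // Heap-allocated input buffer, reclaimed by ordinary garbage collection.
    final ByteBuffer inBuffer = ByteBuffer.allocate(BUFFER_SIZE);
    // Output buffer that wraps a plain byte array.
    final byte[] outArray = new byte[BUFFER_SIZE];
    final ByteBuffer outBuffer = ByteBuffer.wrap(outArray);

    // Stand-in for an encrypt(ByteBuffer, ByteBuffer) call: copies bytes unchanged.
    static void fakeEncrypt(ByteBuffer in, ByteBuffer out) {
        out.put(in);
    }

    // Pushes data (at most BUFFER_SIZE bytes) through the buffers to the sink.
    void writeThrough(byte[] data, OutputStream out) throws IOException {
        inBuffer.clear();
        inBuffer.put(data);
        inBuffer.flip();
        outBuffer.clear();
        fakeEncrypt(inBuffer, outBuffer);
        outBuffer.flip();
        // The backing array is directly accessible, so no temporary copy is needed.
        out.write(outArray, outBuffer.arrayOffset() + outBuffer.position(),
                  outBuffer.remaining());
    }
}
```

Because both buffers live on the heap, nothing needs to be freed explicitly when {{close()}} is skipped; the buffers become garbage along with the stream object.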
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can add a new constructor to {{CryptoOutputStream}} that takes a boolean flag {{ownOutputStream}} indicating whether the underlying stream is owned by {{CryptoOutputStream}}. If it is true, calling {{close()}} will close the underlying stream. Otherwise, calling {{close()}} will free only the direct byte buffers and leave the underlying stream open.
> The scope of changes for this fix will be somewhat wider. We need to modify {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} as well to pass the ownership flag mentioned above.
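The ownership idea in Fix 2 can be sketched roughly as below. The classes {{OwnedStreamSketch}} and {{CloseTrackingSink}} are hypothetical and exist only for illustration; encryption is omitted, and the sketch merely drops its buffer references at the point where real code would explicitly free the direct memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical wrapper showing close() semantics driven by an ownership flag.
class OwnedStreamSketch extends OutputStream {
    private final OutputStream out;
    private final boolean ownOutputStream;
    private ByteBuffer inBuffer = ByteBuffer.allocateDirect(128 * 1024);
    private ByteBuffer outBuffer = ByteBuffer.allocateDirect(128 * 1024);
    private boolean closed;

    OwnedStreamSketch(OutputStream out, boolean ownOutputStream) {
        this.out = out;
        this.ownOutputStream = ownOutputStream;
    }

    @Override
    public void write(int b) throws IOException {
        if (closed) {
            throw new IOException("Stream closed");
        }
        out.write(b); // encryption omitted in this sketch
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        out.flush();
        // Buffers are always released, whoever owns the underlying stream.
        inBuffer = null;
        outBuffer = null;
        if (ownOutputStream) {
            out.close(); // only an owner closes the underlying stream
        }
    }

    boolean buffersReleased() {
        return inBuffer == null && outBuffer == null;
    }
}

// Helper that records whether close() was called on the shared sink.
class CloseTrackingSink extends ByteArrayOutputStream {
    boolean closed;

    @Override
    public void close() {
        closed = true;
    }
}
```

A caller that shares one physical stream across partitions would pass {{false}} and call {{close()}} on each logical stream, which releases the buffers while leaving the shared stream usable for the next partition.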
> I can post a patch for either of the above. I welcome any other ideas from developers to fix this issue.




