hadoop-hdfs-issues mailing list archives

From "genericqa (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13752) fs.Path stores file path in java.net.URI causes big memory waste
Date Wed, 22 Aug 2018 16:11:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589074#comment-16589074 ]

genericqa commented on HDFS-13752:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 37s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 55s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 11 new + 11 unchanged - 12 fixed = 22 total (was 23) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 26s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 99m 41s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13752 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936662/HDFS-13752.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux fcedd32295b2 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8184739 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/24835/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-common.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/24835/testReport/ |
| Max. process+thread count | 1347 (vs. ulimit of 10000) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24835/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> fs.Path stores file path in java.net.URI causes big memory waste
> ----------------------------------------------------------------
>
>                 Key: HDFS-13752
>                 URL: https://issues.apache.org/jira/browse/HDFS-13752
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.7.6
>         Environment: Hive 2.1.1 and hadoop 2.7.6 
>            Reporter: Barnabas Maidics
>            Priority: Major
>         Attachments: HDFS-13752.001.patch, HDFS-13752.002.patch, HDFS-13752.003.patch, Screen Shot 2018-07-20 at 11.12.38.png, heapdump-100000partitions.html, measurement.pdf
>
>
> I was looking at HiveServer2 memory usage, and a large share of it was due to org.apache.hadoop.fs.Path, which stores file paths in a java.net.URI object. The URI implementation stores the same string in 3 different objects (see the attached image). In Hive, when there are many partitions, this causes high memory usage. In my particular case, 42% of memory was used by java.net.URI; keeping one copy of the string instead of three could reduce that to about 14%.
> I wonder whether the community is open to replacing it with a more memory-efficient implementation, and what other things should be considered here. It could be a huge memory improvement for Hadoop and for Hive as well.
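
To illustrate the kind of alternative the description asks about, here is a minimal sketch. It assumes a hypothetical CompactPath class, which is not taken from the attached patches and is not part of Hadoop. The idea is to keep each path component once and build a java.net.URI only on demand, so that a long-lived object does not retain the URI's internal copies of the string. How much memory this saves in practice would have to be measured.

```java
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Hypothetical illustration only: not the class from the HDFS-13752 patches
 * and not an existing Hadoop API.
 */
final class CompactPath {
    private final String scheme;    // e.g. "hdfs"; may be null
    private final String authority; // e.g. "nn:8020"; may be null
    private final String path;      // e.g. "/warehouse/t/p=1"

    CompactPath(String scheme, String authority, String path) {
        this.scheme = scheme;
        this.authority = authority;
        this.path = path;
    }

    /** Builds a URI on demand; the result is not cached by this object. */
    URI toUri() {
        try {
            // Standard five-argument java.net.URI constructor:
            // (scheme, authority, path, query, fragment).
            return new URI(scheme, authority, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    @Override
    public String toString() {
        return toUri().toString();
    }
}
```

The trade-off in this sketch is CPU for memory: every call to toUri() re-parses the components, so callers that need the URI repeatedly would pay for it each time.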



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

