hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-15554) Improve JIT performance for Configuration parsing
Date Thu, 21 Jun 2018 22:17:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-15554:
---------------------------------
    Status: Patch Available  (was: Open)

The attached patch moves the config-parsing code into a new inner class made up of
smaller methods, which are easier for the JIT to compile. I also took the opportunity to make
a few micro-optimizations, such as avoiding construction of the confSources array in the common
case where the config file uses no "<source>" tags.
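As a rough illustration of the shape of that refactoring (not the code in hadoop-15554.patch), the sketch below splits per-element handling into short methods on a small inner class and allocates the list of sources only when a "<source>" tag is actually seen. All class, method, and field names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names are hypothetical, not taken from the patch. */
public class SplitParserSketch {

  /** One parsed property with the sources it was attributed to. */
  static final class Entry {
    final String name;
    final String value;
    final List<String> sources;

    Entry(String name, String value, List<String> sources) {
      this.name = name;
      this.value = value;
      this.sources = sources;
    }
  }

  /** Each parsing step is a short method, so the JIT can compile them independently. */
  static final class Parser {
    private final Map<String, Entry> out = new LinkedHashMap<>();
    private String name;
    private String value;
    private List<String> sources; // stays null unless a <source> tag appears

    void startProperty() {
      name = null;
      value = null;
      sources = null;
    }

    void handleName(String text) {
      name = text.trim();
    }

    void handleValue(String text) {
      value = text;
    }

    void handleSource(String text) {
      if (sources == null) {
        sources = new ArrayList<>(2); // allocate only in the uncommon case
      }
      sources.add(text);
    }

    void endProperty() {
      if (name == null) {
        return; // malformed property without a name: skip it
      }
      List<String> s = (sources == null) ? Collections.<String>emptyList() : sources;
      out.put(name, new Entry(name, value, s));
    }

    Map<String, Entry> result() {
      return out;
    }
  }

  public static void main(String[] args) {
    Parser p = new Parser();
    p.startProperty();
    p.handleName("fs.defaultFS");
    p.handleValue("hdfs://nn/");
    p.endProperty();
    Entry e = p.result().get("fs.defaultFS");
    System.out.println(e.name + "=" + e.value + " sources=" + e.sources);
  }
}
```

In the common case the shared empty list is reused and no per-property collection is allocated.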

I tested the improvement by running:

{code}
for x in $(seq 1 60); do
  java -XX:+CITime -cp hadoop-common-project/hadoop-common/target/hadoop-common-3.2.0-SNAPSHOT.jar:$CP \
         org.apache.hadoop.examples.ExampleDriver pi 1 1 2>&1  \
       | grep 'Total comp'
done | tee /tmp/patch.txt
{code}

to measure the total compilation time in a simple LocalJobRunner MR job. I grepped out the
times and ran a t-test using R:

{code}
data:  d.orig and d.patched
t = 36.511, df = 110.1, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.7329980 0.8171354
sample estimates:
mean of x mean of y
 3.508300  2.733233
{code}

So the patch saves roughly 730-820ms (about 775ms on average) of CPU time spent by the JIT per run.
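For anyone who wants to repeat the comparison without R, here is a minimal Java sketch of the statistic and degrees of freedom for a Welch two-sample t-test. R's t.test uses the Welch variant by default, and the non-integer df above is consistent with that. The sample values in main are made up for illustration and are not the measurements from this issue. The sketch does not compute the p-value or confidence interval.

```java
/** Minimal Welch two-sample t-test sketch; the data in main is illustrative only. */
public class WelchTTest {

  static double mean(double[] xs) {
    double sum = 0.0;
    for (double x : xs) {
      sum += x;
    }
    return sum / xs.length;
  }

  /** Unbiased sample variance (divides by n - 1). */
  static double variance(double[] xs) {
    double m = mean(xs);
    double ss = 0.0;
    for (double x : xs) {
      ss += (x - m) * (x - m);
    }
    return ss / (xs.length - 1);
  }

  /** t = (mean(x) - mean(y)) / sqrt(var(x)/nx + var(y)/ny). */
  static double tStatistic(double[] x, double[] y) {
    double vx = variance(x) / x.length;
    double vy = variance(y) / y.length;
    return (mean(x) - mean(y)) / Math.sqrt(vx + vy);
  }

  /** Welch-Satterthwaite approximation of the degrees of freedom. */
  static double degreesOfFreedom(double[] x, double[] y) {
    double vx = variance(x) / x.length;
    double vy = variance(y) / y.length;
    double numerator = (vx + vy) * (vx + vy);
    double denominator = vx * vx / (x.length - 1) + vy * vy / (y.length - 1);
    return numerator / denominator;
  }

  public static void main(String[] args) {
    // Made-up compile times in seconds, standing in for the grepped CITime totals.
    double[] orig = {3.51, 3.49, 3.55, 3.48};
    double[] patched = {2.74, 2.71, 2.76, 2.72};
    System.out.println("t  = " + tStatistic(orig, patched));
    System.out.println("df = " + degreesOfFreedom(orig, patched));
    System.out.println("mean difference = " + (mean(orig) - mean(patched)));
  }
}
```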

To test throughput, I used the ConfTest.java program from HADOOP-14216.

{code}
orig:

duration: 20745 count: 3561000

real    0m21.104s
user    0m29.296s
sys     0m1.903s

patch:

duration: 21810 count: 3561000

real    0m22.304s
user    0m27.013s
sys     0m2.547s
{code}

So throughput is about the same: user time drops by about 8% while real time rises by about 6%.
Close enough to call "not a regression".

I also tried 'fs -ls hdfs://nn/' under 'perf stat -r10':

{code}
orig:


       5295.930635      task-clock (msec)         #    3.454 CPUs utilized            ( +-  3.56% )
            10,977      context-switches          #    0.002 M/sec                    ( +-  0.37% )
               613      cpu-migrations            #    0.116 K/sec                    ( +-  2.28% )
            86,804      page-faults               #    0.016 M/sec                    ( +-  0.12% )
    14,823,251,627      cycles                    #    2.799 GHz                      ( +-  3.61% )
    11,367,265,626      instructions              #    0.77  insn per cycle           ( +-  1.81% )
     2,503,093,507      branches                  #  472.645 M/sec                    ( +-  3.26% )
        67,066,880      branch-misses             #    2.68% of all branches          ( +-  0.23% )

       1.533354188 seconds time elapsed                                          ( +-  0.54% )

patch:

       5173.366209      task-clock (msec)         #    3.384 CPUs utilized            ( +-  3.60% )
            11,160      context-switches          #    0.002 M/sec                    ( +-  1.32% )
               630      cpu-migrations            #    0.122 K/sec                    ( +-  2.82% )
            87,732      page-faults               #    0.017 M/sec                    ( +-  0.18% )
    14,495,009,185      cycles                    #    2.802 GHz                      ( +-  3.55% )
    11,485,553,655      instructions              #    0.79  insn per cycle           ( +-  1.80% )
     2,487,385,519      branches                  #  480.806 M/sec                    ( +-  3.34% )
        68,583,976      branch-misses             #    2.76% of all branches          ( +-  0.25% )

       1.528788291 seconds time elapsed                                          ( +-  0.62% )
{code}

And 'yarn application -list' under the same 'perf stat -r10', against an RM running no applications:

{code}
orig:
       2150.752819      task-clock (msec)         #    2.101 CPUs utilized            ( +-  0.89% )
             9,179      context-switches          #    0.004 M/sec                    ( +-  0.66% )
               476      cpu-migrations            #    0.221 K/sec                    ( +-  3.20% )
            46,036      page-faults               #    0.021 M/sec                    ( +-  0.13% )
     5,928,445,661      cycles                    #    2.756 GHz                      ( +-  0.98% )
     6,382,601,882      instructions              #    1.08  insn per cycle           ( +-  0.61% )
     1,153,880,261      branches                  #  536.501 M/sec                    ( +-  0.60% )
        47,370,186      branch-misses             #    4.11% of all branches          ( +-  0.65% )

       1.023657616 seconds time elapsed                                          ( +-  0.59% )


patch:

       2106.716373      task-clock (msec)         #    2.091 CPUs utilized            ( +-  0.70% )
             9,113      context-switches          #    0.004 M/sec                    ( +-  0.62% )
               451      cpu-migrations            #    0.214 K/sec                    ( +-  1.46% )
            47,218      page-faults               #    0.022 M/sec                    ( +-  0.09% )
     5,769,853,936      cycles                    #    2.739 GHz                      ( +-  0.73% )
     6,320,641,188      instructions              #    1.10  insn per cycle           ( +-  0.31% )
     1,141,174,880      branches                  #  541.684 M/sec                    ( +-  0.31% )
        46,945,771      branch-misses             #    4.11% of all branches          ( +-  0.40% )

       1.007474613 seconds time elapsed                                          ( +-  0.50% )
{code}

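So there seems to be a slight saving in cycles for both applications: about 2.2% for 'fs -ls' (within the reported +-3.6% run-to-run variation) and about 2.7% for 'yarn application -list'.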

> Improve JIT performance for Configuration parsing
> -------------------------------------------------
>
>                 Key: HADOOP-15554
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15554
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf, performance
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: hadoop-15554.patch
>
>
> In investigating a performance regression for small tasks between Hadoop 2 and Hadoop
> 3, we found that the amount of time spent in JIT was significantly higher. Using jitwatch
> we were able to determine that, due to a combination of switching from DOM to SAX style parsing
> and just having more configuration key/value pairs, Configuration.loadResource is now getting
> compiled with the C2 compiler and taking quite some time. Breaking that very large function
> up into several smaller ones and eliminating some redundant bits of code improves the JIT
> performance measurably.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

