hadoop-common-dev mailing list archives

From "SammiChen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-15499) Performance severe drop when running RawErasureCoderBenchmark with NativeRSRawErasureCoder
Date Tue, 29 May 2018 07:52:00 GMT
SammiChen created HADOOP-15499:
----------------------------------

             Summary: Performance severe drop when running RawErasureCoderBenchmark with NativeRSRawErasureCoder
                 Key: HADOOP-15499
                 URL: https://issues.apache.org/jira/browse/HADOOP-15499
             Project: Hadoop Common
          Issue Type: Improvement
    Affects Versions: 3.0.2, 3.0.1, 3.0.0
            Reporter: SammiChen
            Assignee: SammiChen


RawErasureCoderBenchmark is a micro-benchmark that tests EC codec encoding/decoding
performance.

With 50 concurrent threads, the native ISA-L coder has lower throughput than with 1 thread.
This is abnormal.

 

bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3 1 1024 1024
Using 126MB buffer.
ISA-L coder encode 1008MB data, with chunk size 1024KB
Total time: 0.19 s.
Total throughput: 5390.37 MB/s
Threads statistics:
1 threads in total.
Min: 0.18 s, Max: 0.18 s, Avg: 0.18 s, 90th Percentile: 0.18 s.

 

bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3 50 1024 10240
Using 120MB buffer.
ISA-L coder encode 54000MB data, with chunk size 10240KB
Total time: 11.58 s.
Total throughput: 4662 MB/s
Threads statistics:
50 threads in total.
Min: 0.55 s, Max: 11.5 s, Avg: 6.32 s, 90th Percentile: 10.45 s.

 

RawErasureCoderBenchmark shares a single coder among all concurrent threads, while
NativeRSRawEncoder and NativeRSRawDecoder declare their doEncode and doDecode functions
with the synchronized keyword. As a result, the 50 concurrent threads are forced to call
the shared coder's encode/decode function one at a time.
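
The serialization described above can be reproduced with a small stand-alone sketch. The classes below are hypothetical stand-ins, not the Hadoop coders: SynchronizedCoder only mimics a coder whose doEncode is synchronized, and the XOR loop is a placeholder for the real encoding work. The sketch records how many threads are ever inside doEncode at the same time when one instance is shared.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedCoderDemo {
    /** Hypothetical stand-in for a coder with a synchronized encode method. */
    static class SynchronizedCoder {
        private final AtomicInteger inside = new AtomicInteger();
        private final AtomicInteger maxInside = new AtomicInteger();

        synchronized void doEncode(byte[] chunk) {
            // Track how many callers are inside this method right now.
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            for (int i = 0; i < chunk.length; i++) {
                chunk[i] ^= 0x5A; // placeholder for real encoding work
            }
            inside.decrementAndGet();
        }

        int maxConcurrentCallers() {
            return maxInside.get();
        }
    }

    /** Runs the given number of threads against ONE shared coder. */
    static int runSharedCoder(int threads, int callsPerThread) {
        SynchronizedCoder shared = new SynchronizedCoder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                byte[] chunk = new byte[1024];
                for (int i = 0; i < callsPerThread; i++) {
                    shared.doEncode(chunk);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.maxConcurrentCallers();
    }

    public static void main(String[] args) {
        // The monitor admits one caller at a time, so this prints 1.
        System.out.println("max concurrent callers: " + runSharedCoder(50, 100));
    }
}
```

However many threads are started, the reported maximum stays at 1, which is why adding threads to the benchmark cannot raise throughput while the coder is shared.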

 

To resolve the issue, there are two approaches.
 # Refactor RawErasureCoderBenchmark to use a dedicated coder for each concurrent thread.
 # Refactor NativeRSRawEncoder and NativeRSRawDecoder for better concurrency. The
synchronized keyword protects the private variable nativeCoder from being checked in
doEncode/doDecode while it is being modified in release. A ReentrantReadWriteLock would
increase concurrency, since doEncode/doDecode can be called many times without changing
the nativeCoder state.
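
A minimal sketch of the second approach, under stated assumptions: RwLockCoder is a hypothetical class, the long field only imitates a native handle, and the encoding work is passed in as a Runnable for illustration. It is not the actual patch. doEncode takes the read lock, so several threads can be inside it together; release takes the write lock, so it excludes all encoders while the handle changes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RwLockCoderDemo {
    /** Hypothetical coder; the long field stands in for the native handle. */
    static class RwLockCoder {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private long nativeCoder = 1L; // assumption: non-zero means initialized

        /** Read lock: many threads may encode at the same time. */
        void doEncode(Runnable work) {
            lock.readLock().lock();
            try {
                if (nativeCoder == 0) {
                    throw new IllegalStateException("coder already released");
                }
                work.run();
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Write lock: waits for in-flight encodes, then clears the handle. */
        void release() {
            lock.writeLock().lock();
            try {
                nativeCoder = 0;
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    /** Returns true if two threads were inside doEncode at the same time. */
    static boolean readersOverlap() {
        RwLockCoder coder = new RwLockCoder();
        CountDownLatch bothInside = new CountDownLatch(2);
        boolean[] met = new boolean[2];
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            threads[i] = new Thread(() -> coder.doEncode(() -> {
                bothInside.countDown();
                try {
                    // Succeeds only if the other thread also entered doEncode.
                    met[id] = bothInside.await(5, TimeUnit.SECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return met[0] && met[1];
    }

    /** Returns true if encoding after release is rejected. */
    static boolean encodeFailsAfterRelease() {
        RwLockCoder coder = new RwLockCoder();
        coder.release();
        try {
            coder.doEncode(() -> { });
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("readers overlap: " + readersOverlap());
        System.out.println("encode fails after release: " + encodeFailsAfterRelease());
    }
}
```

With a synchronized doEncode the two threads in readersOverlap could never meet inside the method; with the read lock they can, while release still cannot run in the middle of an encode.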

 I prefer approach 2 and will upload a patch later. 

 