hadoop-common-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-10681) Compressor inner methods are all synchronized - within a tight loop
Date Wed, 11 Jun 2014 17:43:01 GMT
Gopal V created HADOOP-10681:
--------------------------------

             Summary: Compressor inner methods are all synchronized - within a tight loop
                 Key: HADOOP-10681
                 URL: https://issues.apache.org/jira/browse/HADOOP-10681
             Project: Hadoop Common
          Issue Type: Bug
          Components: performance
    Affects Versions: 2.4.0, 2.2.0, 2.5.0
            Reporter: Gopal V
         Attachments: compress-cmpxchg-small.png, perf-top-spill-merge.png

The current implementation of SnappyCompressor spends more time in the Java loop that copies
from the user buffer into the direct buffer allocated to the compressor impl than it spends
compressing the buffers.

!perf-top-spill-merge.png!

The bottleneck was found to be Java monitor code inside SnappyCompressor.

The JIT neatly inlines the methods into the parent caller (BlockCompressorStream::write),
but unfortunately the inlining does not flatten out the synchronized blocks.

!compress-cmpxchg-small.png!

The loop writes small byte[] buffers, one per IFile key+value pair.

I counted approximately 6 monitor enter/exit blocks per k-v pair written.
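
To illustrate the pattern, here is a minimal sketch, not the actual Hadoop code. ToyCompressor, writePerRecord and writeBatched are hypothetical names, and the three synchronized calls per write are an arbitrary stand-in for the roughly 6 monitor enter/exit blocks counted above. The sketch counts source-level entries into synchronized methods when small records are handed over one at a time, and compares that with staging the records in a local buffer first. The staging variant is only one possible way to reduce the count and is shown as an assumption, not as the fix for this issue.

```java
public class MonitorCountSketch {

    /** Hypothetical stand-in for a compressor whose methods are all synchronized. */
    static final class ToyCompressor {
        private final byte[] buffer = new byte[64 * 1024];
        private int used = 0;
        long monitorEntries = 0; // bumped once per synchronized method entry

        synchronized boolean finished() {
            monitorEntries++;
            return false; // the toy never finishes
        }

        synchronized boolean needsInput() {
            monitorEntries++;
            return used < buffer.length;
        }

        synchronized void setInput(byte[] b, int off, int len) {
            monitorEntries++;
            // Copy as much as fits; the toy silently drops the rest.
            int n = Math.min(len, buffer.length - used);
            System.arraycopy(b, off, buffer, used, n);
            used += n;
        }
    }

    /** One small record per call: every record pays for three monitor entries. */
    static long writePerRecord(ToyCompressor c, int records, int recordSize) {
        byte[] rec = new byte[recordSize];
        for (int i = 0; i < records; i++) {
            if (!c.finished() && c.needsInput()) {
                c.setInput(rec, 0, rec.length);
            }
        }
        return c.monitorEntries;
    }

    /** Stage records in an unsynchronized local buffer; assumes recordSize <= stagingSize. */
    static long writeBatched(ToyCompressor c, int records, int recordSize, int stagingSize) {
        byte[] rec = new byte[recordSize];
        byte[] staging = new byte[stagingSize];
        int filled = 0;
        for (int i = 0; i < records; i++) {
            if (filled + rec.length > staging.length) {
                flush(c, staging, filled);
                filled = 0;
            }
            System.arraycopy(rec, 0, staging, filled, rec.length);
            filled += rec.length;
        }
        if (filled > 0) {
            flush(c, staging, filled);
        }
        return c.monitorEntries;
    }

    private static void flush(ToyCompressor c, byte[] buf, int len) {
        if (!c.finished() && c.needsInput()) {
            c.setInput(buf, 0, len);
        }
    }

    public static void main(String[] args) {
        long perRecord = writePerRecord(new ToyCompressor(), 1000, 16);
        long batched = writeBatched(new ToyCompressor(), 1000, 16, 4096);
        System.out.println("per-record monitor entries: " + perRecord);
        System.out.println("batched monitor entries:    " + batched);
    }
}
```

The counter only tracks how often a synchronized method is entered in the source; it says nothing about what the JIT actually emits, so profiling as in the attached screenshots is still needed to confirm any gain.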



--
This message was sent by Atlassian JIRA
(v6.2#6252)
