harmony-commits mailing list archives

From "Sergey Kuksenko (JIRA)" <j...@apache.org>
Subject [jira] Created: (HARMONY-3995) [drlvm][threading][performance] Performance improvement for uncontended synchronization.
Date Tue, 29 May 2007 12:48:16 GMT
[drlvm][threading][performance] Performance improvement for uncontended synchronization.

                 Key: HARMONY-3995
                 URL: https://issues.apache.org/jira/browse/HARMONY-3995
             Project: Harmony
          Issue Type: Improvement
          Components: DRLVM
            Reporter: Sergey Kuksenko

It is a fact that even simple atomic instructions (lock cmpxchg, etc.) have a big influence
on performance, especially on multiprocessor systems. DRLVM uses a reservation lock scheme
for uncontended synchronization. In the case of local (from a single thread) and uncontended
synchronization, all monitor_enter and monitor_exit primitives are executed without atomic
instructions. In the case of non-local (from several threads) but still uncontended synchronization,
DRLVM uses a thin-lock scheme (with atomic instructions). Lock unreservation is a rather expensive
operation because the owner thread has to be stopped. That is why DRLVM performs unreservation
only once, when switching the lock to a thin lock. On the other hand, there is a common situation
which is not covered by the current scheme: transferred locality, where after several
synchronizations from one thread the data are transferred to another thread and locality (access
from one thread) continues in the new thread.
The attached patch provides an improvement for the case of transferred locality. The following
heuristic is used:
- If at the moment of unreservation the owner thread is already stopped, then the lock will
be unreserved but won't be switched to the thin lock state. The lock stays in reservation mode
and will be reserved for the next thread that tries to acquire it. In other words, if unreservation
costs nothing (the thread is already stopped: in wait, sleep, terminated, ... state), then DRLVM
unreserves the lock but keeps it available for future reservations.
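
A minimal sketch of the heuristic above, written as a toy state model in Java. This is not
DRLVM code and not the attached patch: the class, enum, field, and method names are all
hypothetical, and the model only tracks which mode the lock ends up in. It assumes a single
caller and does not stop or synchronize any real threads.

```java
/** Toy model of the lock modes described above. Hypothetical names; not DRLVM code. */
final class SoftUnreservationModel {
    enum Mode { UNRESERVED, RESERVED, THIN }

    /** Stand-in for a VM thread; 'stopped' covers wait, sleep, terminated, ... */
    static final class VmThread {
        final String name;
        boolean stopped;
        VmThread(String name) { this.name = name; }
    }

    private final boolean softUnreservation; // models -XX:thread.soft_unreservation
    private Mode mode = Mode.UNRESERVED;
    private VmThread owner;
    int expensiveUnreservations;             // unreservations that had to stop the owner

    SoftUnreservationModel(boolean softUnreservation) {
        this.softUnreservation = softUnreservation;
    }

    /** Returns the mode the lock is in after 'current' acquires it without contention. */
    Mode acquire(VmThread current) {
        switch (mode) {
            case UNRESERVED:
                // First user reserves the lock; later enters need no atomic instruction.
                mode = Mode.RESERVED;
                owner = current;
                break;
            case RESERVED:
                if (owner != current) {
                    unreserve(current);
                }
                break;
            case THIN:
                // Thin lock: every enter/exit pays for an atomic instruction.
                break;
        }
        return mode;
    }

    private void unreserve(VmThread next) {
        if (softUnreservation && owner.stopped) {
            // Owner is already stopped, so unreservation is free:
            // stay in reservation mode and reserve for the next thread.
            owner = next;
        } else {
            // Owner would have to be stopped: pay once, then fall back to a thin lock.
            expensiveUnreservations++;
            owner = null;
            mode = Mode.THIN;
        }
    }
}
```

In this model, a lock first used by thread A and then by thread B stays RESERVED (now for B)
only if the soft option is on and A is already stopped; otherwise it becomes THIN.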
There are a number of applications where it gives a performance boost. I've also attached a
microbenchmark which shows the real performance boost of the patch. On the other hand, we
need to do additional investigation into where the patch gives a boost. That is why the patch
doesn't change the current unreservation. The patch introduces a new option
"-XX:thread.soft_unreservation", which is turned off by default. Turning it on enables the new
(soft) unreservation scheme.

Some details about the attached microbenchmark. Here I emulate the following scenario:
- the main thread creates a bunch of data (objects with synchronized access)
- the main thread splits all the data among 4 "processing" threads
- the main thread starts the 4 processing threads and waits for their results.
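
The scenario above could be sketched roughly as follows. This is not the attached synchTest.jar,
whose source is not shown here: the class and method names, the object counts, and the choice
to have the main thread touch each object once during setup are all assumptions made for
illustration. It prints raw operations per second rather than the scaled number used in the
report.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical re-creation of the scenario; not the attached benchmark. */
final class SynchTestSketch {
    /** A piece of data whose update is optionally synchronized. */
    static final class Counter {
        private long value;
        void touch(boolean sync) {
            if (sync) {
                synchronized (this) { value++; }
            } else {
                value++;
            }
        }
        long get() { return value; }
    }

    /** Creates the data on the calling thread, hands one slice to each worker, returns total ops. */
    static long run(int threads, int objectsPerThread, int rounds, boolean useSync) {
        List<List<Counter>> slices = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            List<Counter> slice = new ArrayList<>();
            for (int i = 0; i < objectsPerThread; i++) {
                Counter c = new Counter();
                c.touch(useSync); // assumption: the creating thread uses the object first
                slice.add(c);
            }
            slices.add(slice);
        }
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            List<Counter> slice = slices.get(t); // each worker sees only its own slice
            workers[t] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    for (Counter c : slice) c.touch(useSync);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join(); // main thread waits for the results
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long total = 0;
        for (List<Counter> slice : slices) {
            for (Counter c : slice) total += c.get();
        }
        return total;
    }

    public static void main(String[] args) {
        for (boolean sync : new boolean[] { true, false }) {
            long start = System.nanoTime();
            long ops = run(4, 1000, 1000, sync);
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%s ops/sec = %.0f%n",
                    sync ? "synchronized" : "non-synchronized", ops / seconds);
        }
    }
}
```

Every object is locked first by the main thread and afterwards only by one worker, which is
the transferred-locality pattern the patch targets. A single short run like this has no
warm-up phase, so its numbers are only indicative.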

The reported number is the number of synchronized operations per second divided by 10
(higher is better).
For example:
synchronized OPS     = 7147     - here we have ~71470 synch ops per second.
non-synchronized OPS = 19891    - the speed of the same operations without any synchronization.
The ratio between synchronized and non-synchronized OPS shows the degradation caused by
synchronization (even uncontended).
Here are some measurements for Sun 1.6 and DRLVM on the microbenchmark:
1. Sun1.6
CMDLINE:  java -server -jar synchTest.jar

Measure phase; threads(4); time(180)
synchronized OPS     = 7886
non-synchronized OPS = 59907

2.1 java -XX:thread.soft_unreservation=false -Xem:server -jar synchTest.jar

synchronized OPS     = 7939
non-synchronized OPS = 50985

2.2 java -XX:thread.soft_unreservation=true -Xem:server -jar synchTest.jar

synchronized OPS     = 25735
non-synchronized OPS = 50998

Thus turning the option on gives DRLVM a speedup of about 3.2x on synchronized operations.
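
For reference, the arithmetic behind these figures can be reproduced with a few lines of Java.
The numbers are the ones reported above; the class and method names are only illustrative.

```java
/** Recomputes the ratios from the reported OPS numbers. Hypothetical helper. */
final class SyncRatios {
    static double ratio(long numerator, long denominator) {
        return (double) numerator / denominator;
    }

    public static void main(String[] args) {
        // synchronized / non-synchronized: closer to 1.0 means less degradation
        System.out.printf("Sun 1.6:         %.3f%n", ratio(7886, 59907));
        System.out.printf("DRLVM, soft off: %.3f%n", ratio(7939, 50985));
        System.out.printf("DRLVM, soft on:  %.3f%n", ratio(25735, 50998));
        // synchronized OPS with the option on vs. off
        System.out.printf("DRLVM speedup:   %.2f%n", ratio(25735, 7939));
    }
}
```

With the option on, synchronized code runs at roughly half the speed of the non-synchronized
code in this benchmark, compared with about 0.16 with the option off.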

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
