hawq-dev mailing list archives

From "Ming LI (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1030) User hang due to poor spin-lock/LWLock performance under high concurrency
Date Mon, 29 Aug 2016 08:40:20 GMT
Ming LI created HAWQ-1030:
-----------------------------

             Summary: User hang due to poor spin-lock/LWLock performance under high concurrency
                 Key: HAWQ-1030
                 URL: https://issues.apache.org/jira/browse/HAWQ-1030
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Core
            Reporter: Ming LI
            Assignee: Lei Chang


Some clients have recently reported apparent hangs with their applications. In all cases the
symptoms were the same:

* All sessions appear to be hung in LWLockAcquire or LWLockRelease, specifically in s_lock.
* There is a high number of concurrent sessions (close to 100).
* The system is not actually hung; processing normally resumes after some period of time,
once all sessions have completed their locking work.

The PostgreSQL developer community has found several performance issues under high concurrency
(> 32 sessions) in the spin-lock mechanism that HAWQ inherited. This was ultimately
corrected in PostgreSQL 9.5 by replacing the spin-lock mechanism, which appears to provide
a significant boost to query performance.
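
For readers unfamiliar with the term, the sketch below shows a generic
test-and-test-and-set spin lock written with C11 atomics. It only illustrates the kind of
busy-wait lock that can become a bottleneck when many sessions contend for it. It is not
the code in s_lock.c and not the change made by either commit referenced in this issue.
All names (demo_spinlock, demo_spin_try, demo_spin_lock, demo_spin_unlock) are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only; not HAWQ's slock_t. */
typedef struct {
    atomic_int held;            /* 0 = free, 1 = held */
} demo_spinlock;

static void demo_spin_init(demo_spinlock *l)
{
    atomic_init(&l->held, 0);
}

/* Make one acquisition attempt; return true if the lock was obtained. */
static bool demo_spin_try(demo_spinlock *l)
{
    /* Plain read first: if the lock is visibly held, skip the atomic
     * exchange so that waiters do not keep writing to the cache line. */
    if (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
        return false;

    /* The exchange returns the previous value; 0 means we took the lock. */
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Busy-wait until the lock is obtained. Real implementations usually add
 * a delay or backoff here; this sketch simply retries. */
static void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_try(l)) {
        /* spin */
    }
}

static void demo_spin_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

With close to 100 sessions spinning in a loop like demo_spin_lock, each waiter burns CPU
while making no progress, which would be consistent with the symptom of sessions that
look hung but eventually resume.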

The actual fix is in commit: ab5194e6f617a9a9e7aadb3dd1cee948a42d0755

A one-line commit to s_lock.c could also help address this and would be easy enough to cherry-pick:
b03d196be055450c7260749f17347c2d066b4254



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)