hadoop-common-issues mailing list archives

From "Chris Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-11238) Group Cache should not cause namenode pause
Date Fri, 05 Dec 2014 19:41:13 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-11238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Li updated HADOOP-11238:
------------------------------
    Description: 
This patch addresses an issue where the namenode pauses during group resolution by allowing
only a single group resolution query on expiry. There are two scenarios:
1. When there is not yet a value in the cache, all threads that make a request block while
a single thread fetches the value.
2. When there is already a value in the cache and it has expired, the new value is fetched
in the background while other threads continue to use the old value.
Both scenarios are handled by Guava's cache.
Negative caching is a feature built into the groups cache. Since Guava's caches don't support
different expiration times, we have a separate negative cache which masks the Guava cache:
if an element exists in the negative cache and hasn't expired, we return it.

In total, the logic for fetching a group is:
1. If the username exists in the static cache, return the value (this was already present)
2. If the username exists in the negative cache and the negative cache entry has not expired,
raise an exception as usual
3. Otherwise, defer to the Guava cache (see the two scenarios above)
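
The lookup order above might be sketched as follows. This is an illustration under assumptions,
not the patched Groups class: GroupLookupSketch and its fields are hypothetical names, a plain
Function stands in for the Guava cache, the exception message is made up, and recording an
empty result in the negative cache is an assumed way of populating it.

```java
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of the three-step lookup order. */
class GroupLookupSketch {
    private final Map<String, List<String>> staticCache;
    // user -> time at which the "no groups" result was recorded
    private final ConcurrentHashMap<String, Long> negativeCache = new ConcurrentHashMap<>();
    private final long negativeTtlMillis;
    private final Function<String, List<String>> mainCache; // stands in for the Guava cache
    private final LongSupplier clock;

    GroupLookupSketch(Map<String, List<String>> staticCache, long negativeTtlMillis,
                      Function<String, List<String>> mainCache, LongSupplier clock) {
        this.staticCache = staticCache;
        this.negativeTtlMillis = negativeTtlMillis;
        this.mainCache = mainCache;
        this.clock = clock;
    }

    List<String> getGroups(String user) throws IOException {
        // Step 1: the static cache wins outright.
        List<String> fixed = staticCache.get(user);
        if (fixed != null) {
            return fixed;
        }

        // Step 2: an unexpired negative entry masks the main cache.
        Long recordedAt = negativeCache.get(user);
        if (recordedAt != null) {
            if (clock.getAsLong() - recordedAt < negativeTtlMillis) {
                throw new IOException("no groups for user " + user + " (negative cache)");
            }
            negativeCache.remove(user, recordedAt); // expired, so fall through
        }

        // Step 3: defer to the main cache.
        List<String> groups = mainCache.apply(user);
        if (groups.isEmpty()) {
            // Assumption: an empty result is what populates the negative cache.
            negativeCache.put(user, clock.getAsLong());
            throw new IOException("no groups for user " + user);
        }
        return groups;
    }
}
```

Keeping the negative entries in their own map is what lets them expire on a different schedule
from the positive entries.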


Original Issue Below:
----------------------------
Our namenode pauses for 12-60 seconds several times every hour. During these pauses, no new
requests can come in.

Around the time of pauses, we have log messages such as:
2014-10-22 13:24:22,688 WARN org.apache.hadoop.security.Groups: Potential performance problem:
getGroups(user=xxxxx) took 34507 milliseconds.

The current theory is:
1. Groups has a cache that is refreshed periodically. Each entry has a cache expiry.
2. When a cache entry expires, multiple threads can see this expiration, and we then have a
thundering herd effect where all these threads hit the wire and overwhelm our LDAP servers
(we are using ShellBasedUnixGroupsMapping with sssd; how this happens has yet to be established)
3. Group resolution queries begin to take longer. I've observed them taking 1.2 seconds instead
of the usual 0.01-0.03 seconds when measuring in the shell with `time groups myself`
4. If there is mutual exclusion somewhere along this path, a 1-second pause could lead to
a 60-second pause as all the threads compete for the resource. The exact cause hasn't been
established

Potential solutions include:
1. Increasing the group cache time, which will make the issue less frequent
2. Rolling evictions of the cache, so that we prevent the large spike in LDAP queries
3. Gating the cache refresh so that only one thread is responsible for refreshing the cache


  was:
This patch prevents the namenode from pausing during group cache expiry when getGroups takes
a long time by returning the previous value



> Group Cache should not cause namenode pause
> -------------------------------------------
>
>                 Key: HADOOP-11238
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11238
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Chris Li
>            Assignee: Chris Li
>            Priority: Minor
>         Attachments: HADOOP-11238.patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
