hadoop-hdfs-issues mailing list archives

From "Rushabh S Shah (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.
Date Sat, 28 Oct 2017 21:59:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rushabh S Shah updated HDFS-12667:
----------------------------------
    Status: Patch Available  (was: Open)

Submitting again; hopefully Jenkins will run on the latest patch.

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-12667
>                 URL: https://issues.apache.org/jira/browse/HDFS-12667
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption, kms
>    Affects Versions: 3.0.0-alpha4
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-12667-001.patch
>
>
> There are a couple of issues in KMSClientProvider#ValueQueue.
> 1.
>  {code:title=ValueQueue.java|borderStyle=solid}
>   private final LoadingCache<String, LinkedBlockingQueue<E>> keyQueues;
>   // Striped rwlocks based on key name to synchronize the queue from
>   // the sync'ed rw-thread and the background async refill thread.
>   private final List<ReadWriteLock> lockArray =
>       new ArrayList<>(LOCK_ARRAY_SIZE);
> {code}
> It hashes the key name into 16 buckets.
> In the code chunk below,
>  {code:title=ValueQueue.java|borderStyle=solid}
> public List<E> getAtMost(String keyName, int num) throws IOException,
>       ExecutionException {
>      ...
>      ...
>         readLock(keyName);
>         E val = keyQueue.poll();
>         readUnlock(keyName);
>      ...
>   }
>   private void submitRefillTask(final String keyName,
>       final Queue<E> keyQueue) throws InterruptedException {
>               ...
>               ...
>               // The write lock is held while the key is being asynchronously
>               // fetched, so read requests for every key that hashes to this
>               // bucket are effectively blocked.
>               writeLock(keyName);
>               try {
>                 if (keyQueue.size() < threshold && !isCanceled()) {
>                   refiller.fillQueueForKey(name, keyQueue,
>                       cacheSize - keyQueue.size());
>                 }
>              ...
>               } finally {
>                 writeUnlock(keyName);
>               }
>             }
>   }
> {code}
> According to the code above, if two keys (say key1 and key2) hash to the same
> bucket (one of the 16), then while key1 is being asynchronously refetched, all
> get requests for key2 will be blocked.
> 2. Because of the striped rw locks, the refill of keys, which is meant to be
> asynchronous, is effectively synchronous with respect to the other handler threads.
> I understand that the locks were added so that we don't kick off multiple
> asynchronous refill threads for the same key.
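
The collision described in the report can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the Hadoop `ValueQueue` class: `StripedLockDemo`, `bucket`, `lockFor`, and `readerProceedsDuringRefill` are illustrative names, and the bucket function (hash code masked to 16 stripes) is an assumption about how a key name maps to a lock. A background thread holds the write lock for one key, standing in for a slow refill, while the calling thread tries to take the read lock for another key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration of striped read/write locks; not the Hadoop class. */
final class StripedLockDemo {
  static final int LOCK_ARRAY_SIZE = 16;

  private final List<ReadWriteLock> lockArray = new ArrayList<>(LOCK_ARRAY_SIZE);

  StripedLockDemo() {
    for (int i = 0; i < LOCK_ARRAY_SIZE; i++) {
      lockArray.add(new ReentrantReadWriteLock());
    }
  }

  /** Assumed bucket function: the key's hash code masked to the stripe count. */
  static int bucket(String keyName) {
    return keyName.hashCode() & (LOCK_ARRAY_SIZE - 1);
  }

  ReadWriteLock lockFor(String keyName) {
    return lockArray.get(bucket(keyName));
  }

  /**
   * Holds the write lock for refillKey on a background thread (simulating a
   * slow refill) and reports whether a reader of readKey can acquire its
   * read lock within 100 ms while that write lock is held.
   */
  boolean readerProceedsDuringRefill(String refillKey, String readKey)
      throws InterruptedException {
    CountDownLatch writeLockHeld = new CountDownLatch(1);
    CountDownLatch finishRefill = new CountDownLatch(1);

    Thread refill = new Thread(() -> {
      lockFor(refillKey).writeLock().lock();
      try {
        writeLockHeld.countDown();
        finishRefill.await(); // stands in for the remote fetch
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        lockFor(refillKey).writeLock().unlock();
      }
    });
    refill.start();
    writeLockHeld.await();

    boolean acquired =
        lockFor(readKey).readLock().tryLock(100, TimeUnit.MILLISECONDS);
    if (acquired) {
      lockFor(readKey).readLock().unlock();
    }

    finishRefill.countDown();
    refill.join();
    return acquired;
  }
}
```

Under the assumed bucket function, "key1" and "keyA" fall into the same stripe (their hash codes differ by exactly 16), so `readerProceedsDuringRefill("key1", "keyA")` returns false, whereas "key1" and "key2" land in different stripes and the reader proceeds. The reader is stalled for as long as the write lock is held, which in the reported code spans the whole fetch.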



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

