hadoop-common-issues mailing list archives

From "rcuso (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-17079) Optimize UGI#getGroups by adding UGI#getGroupsSet
Date Tue, 01 Sep 2020 20:57:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-17079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188817#comment-17188817 ]

rcuso commented on HADOOP-17079:

Hey Xiaoyu, tests look real good.  I ran the command below to generate RPC activity using an
account mapped to zero groups, and then another mapped to 50k groups.  Before the patch,
the runs took 11 minutes and 99 minutes, respectively.  With the patch, both take 11 minutes.

hdfs org.apache.hadoop.fs.slive.SliveTest -Dmapreduce.job.queuename=reserve -baseDir /user/USER1/
-resFile /tmp/h1.USER1 -ls 99 -maps 5000 -ops 5000

We have many contrasting graphs related to RPC throughput, latency, and lock length, though
I'd have to get clearance to attach them here.

Thanks - Rob

> Optimize UGI#getGroups by adding UGI#getGroupsSet
> -------------------------------------------------
>                 Key: HADOOP-17079
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17079
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>            Priority: Major
>             Fix For: 3.4.0
>         Attachments: HADOOP-17079.002.patch, HADOOP-17079.003.patch, HADOOP-17079.004.patch,
> HADOOP-17079.005.patch, HADOOP-17079.006.patch, HADOOP-17079.007.patch
>
> UGI#getGroups has been optimized with HADOOP-13442 by avoiding the List->Set->List
> conversion. However, the returned list is not optimized for contains() lookups, especially when
> the user's group membership list is huge (thousands+). This ticket is opened to add a
> UGI#getGroupsSet and use Set#contains() instead of List#contains() to speed up large group
> lookups while minimizing List->Set conversions in the Groups#getGroups() call.
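
For readers who want to see the lookup difference the description refers to, here is a minimal sketch in plain Java. It does not use Hadoop at all: the class GroupLookupSketch, its helper methods, and the synthetic group names are all hypothetical and stand in for whatever a real group mapping would return. It only illustrates that a membership check against a List walks the elements one by one, while the same check against a hash-based Set is a hash lookup.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not code from the HADOOP-17079 patch.
class GroupLookupSketch {

    // Builds n synthetic group names: "group0" .. "group(n-1)".
    static List<String> makeGroups(int n) {
        List<String> groups = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            groups.add("group" + i);
        }
        return groups;
    }

    // List#contains compares against elements in order, so the cost grows
    // with the number of groups the user belongs to.
    static boolean isMemberViaList(List<String> groups, String target) {
        return groups.contains(target);
    }

    // Set#contains on a hash-based set is a hash lookup, so the cost is
    // roughly independent of the number of groups.
    static boolean isMemberViaSet(Set<String> groups, String target) {
        return groups.contains(target);
    }

    public static void main(String[] args) {
        // 50k groups, matching the large account mentioned in the comment above.
        List<String> asList = makeGroups(50_000);
        // LinkedHashSet keeps insertion order while giving hash-based lookups.
        Set<String> asSet = new LinkedHashSet<>(asList);

        String target = "group49999"; // last element: worst case for the list scan
        System.out.println(isMemberViaList(asList, target));
        System.out.println(isMemberViaSet(asSet, target));
    }
}
```

The Set is built once here and reused for every check; converting a List to a Set on each call would give back much of the gain, which is presumably why the description also stresses minimizing List->Set conversions.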
