hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9439) JniBasedUnixGroupsMapping: fix some crash bugs
Date Tue, 14 May 2013 18:43:16 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657348#comment-13657348
] 

Todd Lipcon commented on HADOOP-9439:
-------------------------------------

{code}
+  static String empty[] = new String[0];
{code}

Should probably be static final, right? And rename to {{NO_GROUPS}} or {{EMPTY_STIRNG_ARRAY}}
or something a little more descriptive

----

{code}
+  native static String[] getGroupsForUser(String username, String empty[],
+        boolean reentrant);
{code}

I don't get why {{empty}} is a parameter here... aside from it being a premature optimization
(is allocating an empty array that bad?), it seems like the JNI code could just grab this
field itself and return a reference, rather than taking it as a parameter.

----
{code}
+      groups = getGroupsForUser(user, empty, useReentrant );
{code}
Style nit: extra space here before ')'

----
{code}
+      snprintf(buf, sizeof(buf), "getgrouplist error %d", ret);
+      THROW(env, "java/lang/RuntimeException", buf);
{code}

Can we use strerror here on the return, assuming it's probably a standard errnum?

----
{code}
+    ret = hadoop_group_info_fetch(ginfo, uinfo->gids[i]);
+    if (!ret) {
{code}
Here if the group info lookup fails for a particular gid, you end up swallowing the error
without any warnings, etc. Maybe instead we should just return the numeric gid as a group?
This is what the 'groups' shell command does:

{code}
todd@todd-w510:~$ groups todd
todd : groups: cannot find name for group ID 1050
1050 adm dialout cdrom audio video plugdev lpadmin admin sambashare mrtest wireshark
{code}

----
- I noticed you use different locks for getgrnam and getpwnam. But when I was working on HADOOP-7156,
I found that the buggy implementations were racey in other parts of their code -- ie a concurrent
getgrnam and a concurrent getpwnam might race with eachother and cause either to fail. I know
that you added this function to deal with a crash we saw on a production cluster - did you
verify that with the same buggy underlying implementation that the workaround prevents the
issue?

We may have to actually share this lock and configuration with the lock used in NativeIO.c.

Additionally, when I did that patch, I found it simpler to continue to use the reentrant functions
wrapped with a lock -- less repeated code, etc. 
                
> JniBasedUnixGroupsMapping: fix some crash bugs
> ----------------------------------------------
>
>                 Key: HADOOP-9439
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9439
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: native
>    Affects Versions: 2.0.4-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HADOOP-9439.001.patch, HDFS-4640.002.patch
>
>
> JniBasedUnixGroupsMapping has some issues.
> * sometimes on error paths variables are freed prior to being initialized
> * re-allocate buffers less frequently (can reuse the same buffer for multiple calls to
getgrnam)
> * allow non-reentrant functions to be used, to work around client bugs
> * don't throw IOException from JNI functions if the JNI functions do not declare this
checked exception.
> * don't bail out if only one group name among all the ones associated with a user can't
be looked up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message