hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John George (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1782) FSNamesystem.startFileInternal(..) throws NullPointerException
Date Mon, 28 Mar 2011 22:26:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012258#comment-13012258
] 

John George commented on HDFS-1782:
-----------------------------------

This was caused because of a null pointer deference. The change was tested by the same person
who filed this bug and it passed. 

I thought about adding a test to run multiple balancers to as part of the unit test, but the
timing of balancers does not guarantee that this will always test the change that was put
in. Adding a "mock" test was a possibility, but it seems to me that it requires source code
change to be able to "mock" and test the particular change.


    [exec] BUILD SUCCESSFUL
     [exec] Total time: 49 seconds
     [exec] 
     [exec] 
     [exec] 
     [exec] 
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified
tests.
     [exec]                         Please justify why no new tests are needed for this patch.
     [exec]                         Also please list what manual steps were performed to verify
this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler
warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9)
warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of
release audit warnings.
     [exec] 
     [exec]     +1 system test framework.  The patch passed system test framework compile.
     [exec] 


> FSNamesystem.startFileInternal(..) throws NullPointerException
> --------------------------------------------------------------
>
>                 Key: HDFS-1782
>                 URL: https://issues.apache.org/jira/browse/HDFS-1782
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: John George
>            Assignee: John George
>             Fix For: 0.22.0
>
>         Attachments: HDFS-1782.patch
>
>
> I'm observing when there is one balancer running trying to run another one results in
> "Java.lang.NullPointerException" error. I was hoping to see message "Another balancer
is running. 
> Exiting....  Exiting ...". This is a reproducible issue.
> Details
> ========
> 1) Cluster ->elrond
> [hdfs@gsbl90568 smilli]$ hadoop version
> Hadoop 0.22.0.1102280202
> Subversion git://hadoopre5.corp.sk1.yahoo.com/home/y/var/builds/thread2/workspace/Cloud-HadoopCOMMON-0.22-Secondary
-r
> c7c9a21d7289e29f0133452acf8b761e455a84b5
> Compiled by hadoopqa on Mon Feb 28 02:12:38 PST 2011
> From source with checksum 9ecbc6f17e8847a1cddca2282dbd9b31
> [hdfs@gsbl90568 smilli]$
> 2) Run first balancer
> [hdfs@gsbl90565 smilli]$ hdfs balancer
> 11/03/09 16:33:56 INFO balancer.Balancer: namenodes = [gsbl90565.blue.ygrid.yahoo.com/98.137.97.57:8020,
> gsbl90569.blue.ygrid.yahoo.com/98.137.97.53:8020]
> 11/03/09 16:33:56 INFO balancer.Balancer: p         = Balancer.Parameters[BalancingPolicy.Node,
threshold=10.0]
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being
Moved
> 11/03/09 16:33:57 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use
mapreduce.task.attempt.id
> 11/03/09 16:33:57 INFO balancer.Balancer: Block token params received from NN: keyUpdateInterval=600
min(s),
> tokenLifetime=600 min(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO balancer.Balancer: Balancer will update its block keys every 150
minute(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO balancer.Balancer: Block token params received from NN: keyUpdateInterval=600
min(s),
> tokenLifetime=600 min(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO balancer.Balancer: Balancer will update its block keys every 150
minute(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.62:1004
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.58:1004
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.60:1004
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.59:1004
> 11/03/09 16:33:57 INFO balancer.Balancer: 1 over-utilized: [Source[98.137.97.62:1004,
utilization=24.152507825759344]]
> 11/03/09 16:33:57 INFO balancer.Balancer: 0 underutilized: []
> 11/03/09 16:33:57 INFO balancer.Balancer: Need to move 207.98 GB to make the cluster
balanced.
> 11/03/09 16:33:57 INFO balancer.Balancer: Decided to move 10 GB bytes from 98.137.97.62:1004
to 98.137.97.58:1004
> 11/03/09 16:33:57 INFO balancer.Balancer: Will move 10 GB in this iteration
> Mar 9, 2011 4:33:57 PM            0                 0 KB           207.98 GB        
     10 GB
> .
> .
> .
> 11/03/09 16:34:36 INFO balancer.Balancer: Moving block -63570336576981940 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:34:39 INFO balancer.Balancer: Moving block 2379736326585824737 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:21 INFO balancer.Balancer: Moving block 8884583953927078028 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:24 INFO balancer.Balancer: Moving block -135758138424743964 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:27 INFO balancer.Balancer: Moving block -4598153351946352185 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:33 INFO balancer.Balancer: Moving block 2966087210491094643 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:42 INFO balancer.Balancer: Moving block -5573983508500804184 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:58 INFO balancer.Balancer: Moving block -6222779741597113957 from 98.137.97.62:1004
to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 3) Run another balancer observe
> [hdfs@gsbl90568 smilli]$ hdfs balancer
> 11/03/09 16:34:32 INFO balancer.Balancer: namenodes = [gsbl90565.blue.ygrid.yahoo.com/98.137.97.57:8020,
> gsbl90569.blue.ygrid.yahoo.com/98.137.97.53:8020]
> 11/03/09 16:34:32 INFO balancer.Balancer: p         = Balancer.Parameters[BalancingPolicy.Node,
threshold=10.0]
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being
Moved
> 11/03/09 16:34:33 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use
mapreduce.task.attempt.id
> 11/03/09 16:34:33 INFO balancer.Balancer: Block token params received from NN: keyUpdateInterval=600
min(s),
> tokenLifetime=600 min(s)
> 11/03/09 16:34:33 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:34:33 INFO balancer.Balancer: Balancer will update its block keys every 150
minute(s)
> 11/03/09 16:34:33 INFO block.BlockTokenSecretManager: Setting block keys
> java.io.IOException: java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1400)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1284)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:779)
>         at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:346)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1399)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1395)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1393)
> .  Exiting ...
> Balancing took 1.366 seconds
> [hdfs@gsbl90568 smilli]$
> Pls let me know if you need additional information.
> Thanks,
> Suma
> 		

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message