hbase-dev mailing list archives

From Gary Helmling <ghelml...@gmail.com>
Subject Re: HBaseTestingUtility Failing
Date Mon, 15 Jul 2013 17:25:22 GMT
Hi David,

I can't say for certain given the test output (would need to see the logs
generated by the datanodes), but there is a common issue running mini
cluster unit tests if the environment's umask is not set to 022.  This is
due to https://issues.apache.org/jira/browse/HDFS-2556 where the mini
cluster DNs fail fast if the dfs.data.dir permissions do not match what is
expected.

We ran into this issue setting up builds for our hRaven project (which uses
HBaseTestingUtility) with Travis CI.  Modifying our .travis.yml file by
adding the following line fixed the problem:

script: umask 0022 && mvn test


This should override the umask for the test process for your project.
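
A minimal shell sketch of why the umask matters, assuming (as described
above) that the mini cluster DataNodes expect their data directories to be
rwxr-xr-x.  The helper name dir_mode_under_umask and the directory name
data1 are illustrative only and are not part of Hadoop or HBase:

```shell
# Hypothetical helper: print the mode string of a directory created
# under the given umask. Runs in a subshell so the caller's umask is
# left unchanged.
dir_mode_under_umask() {
  (
    umask "$1"
    tmp=$(mktemp -d) || exit 1
    # mkdir requests mode 0777; the kernel clears the bits set in the umask
    mkdir "$tmp/data1"
    # keep only the 10-character type+mode field, e.g. drwxr-xr-x
    ls -ld "$tmp/data1" | cut -c1-10
    rm -rf "$tmp"
  )
}
```

On a typical Linux system, dir_mode_under_umask 0022 prints drwxr-xr-x and
dir_mode_under_umask 0002 prints drwxrwxr-x.  The second is group-writable,
which is the kind of mismatch the permission check would presumably reject.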

--gh


On Mon, Jul 15, 2013 at 10:17 AM, David Williams
<mobiusinversion@gmail.com> wrote:

> I doubt this has anything to do specifically with CentOS etc.  I am not
> privy to your setup, but the steps to reproduce are easy.
>
> 1.  Get a Linux distro such as CentOS.  Put it in a VM, say VMware
> Fusion on Mac.
> 2.  Install Java
> 3.  Try to start a mini cluster (will fail)
>
> The bottom line is there is an uncaught null pointer exception in the
> Testing Utility.  Is this considered to be of little priority?
>
> > Here is Linux OS:
>
> > Linux a.net 2.6.32-220.23.1.el6.YAHOO.20120713.x86_64 #1 SMP Fri Jul 13
> 11:40:51 CDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> Ted, what linux distribution is that?
>
>
>
>
>
> On Jul 15, 2013, at 9:49 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > I will be busy today, so expect delays in future responses.
> >
> > bq. Recall that this is not an issue on Mac OSX for some odd reason
> >
> > I didn't reproduce the issue you mentioned on either OSX or Linux.
> >
> > Maybe people who use CentOS for ci can comment on this matter.
> >
> > Cheers
> >
> > On Mon, Jul 15, 2013 at 8:01 AM, David Williams <
> mobiusinversion@gmail.com> wrote:
> > Hi Ted,
> >
> > So, what would you recommend in terms of getting a working
> HBaseTestingUtility or just unit testing in general?   It's pretty much a
> prerequisite for me to be able to introduce HBase into production.  We have
> a strict continuous integration process and I need to be able to build
> software for use with our HBase cluster in our CI environment including
> running all tests.  Our CI environment does not have its own test cluster
> and the chances of that becoming the case are slim.  Besides, the relative
> convenience of the testing utility is great.
> >
> > > check out
> http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1
> >
> > I don't know anything about the hadoop source, this branch, or what is
> needed to run their JUnit tests.  This is pretty much a non-starter for me.
> >
> > > On my Mac, I got:
> > > [junit] Running
> org.apache.hadoop.tools.distcp2.mapred.TestCopyCommitter
> > > [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 12.283 sec
> >
> > Recall that this is not an issue on Mac OSX for some odd reason, only on
> CentOS and Linux VMs such as those used by travis-ci.  You can see the
> failed build output here:
> >
> > https://travis-ci.org/mobiusinversion/hbase
> >
> > Recall also that I opened a travis-ci ticket and the travis support was
> able to reproduce success on Mac OSX and failure on Linux.
> >
> > https://github.com/travis-ci/travis-ci/issues/1240
> >
> > I started a bounty on SO:
> >
> >
> http://stackoverflow.com/questions/17625938/hbase-minidfscluster-java-fails-in-certain-environments
> >
> >
> > St.Ack - what do you think the relative priority of this issue might be?
> >
> >
> >
> >
> >
> >
> >
> > On Jul 14, 2013, at 10:47 AM, Ted Yu wrote:
> >
> >> Here is one way of figuring out whether hdfs experiences the same
> issue:
> >>
> >> check out
> http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1
> >> use the command to find out which tests create MiniDFSCluster: find .
> -name '*.java' -exec grep 'new MiniDFSCluster(' {} \; -print
> >> run one of the tests: ant test-core -Dtestcase=TestCopyCommitter
> >>
> >> See if there is similar exception.
> >>
> >> On my Mac, I got:
> >>
> >>     [junit] Running
> org.apache.hadoop.tools.distcp2.mapred.TestCopyCommitter
> >>     [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 12.283
> sec
> >>
> >> On Sun, Jul 14, 2013 at 9:58 AM, David Williams <
> mobiusinversion@gmail.com> wrote:
> >> Hi Ted,
> >>
> >> I'd be interested to know more about an hdfs issue.  I was going to look
> further into the line of reasoning you mentioned about the call to
> getAddress.
> >>
> >> MiniDFSCluster.java, line 426:
> >> String ipAddr = dn.getSelfAddr().getAddress().getHostAddress();
> >>
> >> So one of getSelfAddr or getAddress returned null.   Is there
> something new about hdfs or are the two related?
> >>
> >>
> >>
> >>
> >>
> >> On Jul 13, 2013, at 9:41 PM, Ted Yu wrote:
> >>
> >>> Most likely this is an hdfs issue.
> >>>
> >>> On Sat, Jul 13, 2013 at 4:08 PM, David Williams <
> mobiusinversion@gmail.com> wrote:
> >>> Hi Ted,
> >>>
> >>> I updated the dependencies and ran the tests again, and on my Mac OSX
> they pass and on CentOS I get the same error:
> >>>
> >>>
> >>> $ lein test
> >>> Retrieving org/apache/hbase/hbase/0.94.9/hbase-0.94.9.pom from central
> >>> Retrieving org/apache/hbase/hbase/0.94.9/hbase-0.94.9.jar from central
> >>> Retrieving org/apache/hbase/hbase/0.94.9/hbase-0.94.9-tests.jar from
> central
> >>>
> >>> lein test hbase.config-test
> >>>
> >>> lein test hbase.table-test
> >>> Starting DataNode 0 with dfs.data.dir:
> /home/dwilliams/Desktop/Repos/hbase/target/test-data/1140edc6-7242-40cd-8ed8-05847fb14949/dfscluster_1e40ce89-1986-450b-ba6d-983caa9aeb78/dfs/data/data1,/home/dwilliams/Desktop/Repos/hbase/target/test-data/1140edc6-7242-40cd-8ed8-05847fb14949/dfscluster_1e40ce89-1986-450b-ba6d-983caa9aeb78/dfs/data/data2
> >>>
> >>> lein test :only hbase.table-test/create-table
> >>>
> >>>
> >>> ERROR in (create-table) (MiniDFSCluster.java:426)
> >>> Uncaught exception, not in assertion.
> >>> expected: nil
> >>>   actual: java.lang.NullPointerException: null
> >>>  at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes
> (MiniDFSCluster.java:426)
> >>>     org.apache.hadoop.hdfs.MiniDFSCluster.<init>
> (MiniDFSCluster.java:284)
> >>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster
> (HBaseTestingUtility.java:451)
> >>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
> (HBaseTestingUtility.java:619)
> >>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
> (HBaseTestingUtility.java:575)
> >>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
> (HBaseTestingUtility.java:562)
> >>>     hbase.table_test$test_config.doInvoke (table_test.clj:10)
> >>>     clojure.lang.RestFn.invoke (RestFn.java:397)
> >>>     hbase.table_test/fn (table_test.clj:19)
> >>>
> >>> On Jul 13, 2013, at 1:11 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >>>
> >>>> Do you mind trying the following change to see if the problem
> persists for latest 0.94 release ?
> >>>>
> >>>> Thanks
> >>>>
> >>>> diff --git a/project.clj b/project.clj
> >>>> index 2554784..0d8be0e 100644
> >>>> --- a/project.clj
> >>>> +++ b/project.clj
> >>>> @@ -5,8 +5,8 @@
> >>>>         :dependencies [
> >>>>                 [org.clojure/clojure "1.5.1"]
> >>>>                 [org.apache.hadoop/hadoop-core "1.2.0"]
> >>>> -               [org.apache.hbase/hbase "0.94.6.1"]
> >>>> +               [org.apache.hbase/hbase "0.94.9"]
> >>>>                 [org.apache.hadoop/hadoop-test "1.2.0"]
> >>>> -               [org.apache.hbase/hbase "0.94.6.1" :classifier
> "tests"]]
> >>>> +               [org.apache.hbase/hbase "0.94.9" :classifier "tests"]]
> >>>>         :plugins [[lein-marginalia "0.7.1"]])
> >>>>
> >>>>
> >>>> On Fri, Jul 12, 2013 at 10:22 PM, David Williams <
> mobiusinversion@gmail.com> wrote:
> >>>> Hi Ted,
> >>>>
> >>>> In terms of versions, here are the jars I'm using, which come from
> Maven Central.
> >>>>
> >>>> org.apache.hadoop/hadoop-core "1.2.0"
> >>>> org.apache.hbase/hbase "0.94.6.1"
> >>>> org.apache.hadoop/hadoop-test "1.2.0"
> >>>> org.apache.hbase/hbase "0.94.6.1" :classifier "tests"
> >>>>
> >>>> The flag ':classifier "tests"' above is a specific instruction to
> Leiningen 2.0 to use the pomegranate library to handle Sonatype Aether
> and dynamic runtime modification of the classpath, which in this case is
> needed to import org.apache.hadoop.hbase HBaseTestingUtility.
> >>>>
> >>>> https://github.com/cemerick/pomegranate
> >>>>
> >>>>
> >>>> I just checked on address resolution, on my Mac OSX where the
> TestingUtility passes:
> >>>>
> >>>> user=> (import 'java.net.InetSocketAddress)
> >>>> java.net.InetSocketAddress
> >>>> user=> (def x (InetSocketAddress. 8000))
> >>>> #'user/x
> >>>> user=> (.getAddress x)
> >>>> #<Inet4Address 0.0.0.0/0.0.0.0>
> >>>> user=>
> >>>>
> >>>> Then I checked on a CentOS VM: the unit tests still fail, but the
> address resolution also worked in the repl and produced the same output as
> above.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Jul 12, 2013, at 9:30 PM, Ted Yu wrote:
> >>>>
> >>>>> I installed lein on Mac and Linux.
> >>>>>
> >>>>> I tried 'lein test' on both platforms and the test passed on both:
> >>>>>
> >>>>> lein test hbase.table-test
> >>>>> Starting DataNode 0 with dfs.data.dir:
> /homes/hortonzy/mobius/target/test-data/246828b9-1be9-4949-9bbc-b215b378fb67/dfscluster_9ed0bd88-d309-4fed-9823-3bbf86973ae4/dfs/data/data1,/homes/hortonzy/mobius/target/test-data/246828b9-1be9-4949-9bbc-b215b378fb67/dfscluster_9ed0bd88-d309-4fed-9823-3bbf86973ae4/dfs/data/data2
> >>>>> Cluster is active
> >>>>>
> >>>>> Ran 11 tests containing 14 assertions.
> >>>>> 0 failures, 0 errors.
> >>>>>
> >>>>> Here is Linux OS:
> >>>>>
> >>>>> Linux a.net 2.6.32-220.23.1.el6.YAHOO.20120713.x86_64 #1 SMP Fri
> Jul 13 11:40:51 CDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> >>>>>
> >>>>> Looking at MiniDFSCluster.java, line 426:
> >>>>>
> >>>>>       String ipAddr = dn.getSelfAddr().getAddress().getHostAddress();
> >>>>>
> >>>>> It seems dn.getSelfAddr().getAddress() returned null.
> >>>>>
> >>>>> According to:
> >>>>>
> http://docs.oracle.com/javase/7/docs/api/java/net/InetSocketAddress.html#getAddress()
> >>>>>
> >>>>> This would mean an address resolution problem.
> >>>>>
> >>>>> Can you check ?
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> On Fri, Jul 12, 2013 at 7:37 PM, David Williams <
> mobiusinversion@gmail.com> wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> I am having an issue starting a mini cluster for the
> HBaseTestingUtility.  In short, I can on Mac OSX, but cannot on Linux.   The
> error is cryptic and I don't know what to do next.
> >>>>>
> >>>>> I submitted a ticket with full details on StackOverflow,
> >>>>>
> >>>>>
> http://stackoverflow.com/questions/17625938/hbase-minidfscluster-java-fails-in-certain-environments
> >>>>>
> >>>>> But when I call .startMiniCluster  on an instance of
> HBaseTestingUtility , on Linux (CentOS x86_64), I receive this error:
> >>>>>
> >>>>>
> >>>>>
> >>>>> ERROR in (create-table) (MiniDFSCluster.java:426)
> >>>>> Uncaught exception, not in assertion.
> >>>>> expected: nil
> >>>>>   actual: java.lang.NullPointerException: null
> >>>>>  at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes
> (MiniDFSCluster.java:426)
> >>>>>     org.apache.hadoop.hdfs.MiniDFSCluster.<init>
> (MiniDFSCluster.java:284)
> >>>>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster
> (HBaseTestingUtility.java:444)
> >>>>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
> (HBaseTestingUtility.java:612)
> >>>>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
> (HBaseTestingUtility.java:568)
> >>>>>     org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
> (HBaseTestingUtility.java:555)
> >>>>>
> >>>>> I would appreciate help in finding out what's going on and how to
> set up my ENV to use the HBaseTestingUtility.
> >>>>>
> >>>>> Thanks
> >>>>> David
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
> >
>
>
