hbase-user mailing list archives

From "Michael G. Noll" <michael.g.n...@googlemail.com>
Subject Summarizing instructions to run HBase 0.90.2 on Hadoop 0.20.x, feedback appreciated
Date Tue, 12 Apr 2011 11:38:15 GMT
Hi all,

Like a few other people on this mailing list, I am currently working on
getting HBase up and running on Hadoop 0.20.2.  I think I have by now
read most of the relevant past discussions on this topic here, e.g.
St.Ack's thread on creating an append release [6], Mike Spreitzner's
recent attempt [3] at making HBase work on Hadoop 0.20.x that made it
into the HBase docs [2], or the recent discussion in [8] when 0.90.2
was about to be released last week.

St.Ack mentioned that Hadoop 0.22 might be the first release with append
support out of the box [7].  However, on our side we are stuck for the
time being on the production-ready 0.20.x branch, so waiting until
Hadoop 0.22 or rather 0.23 [9] (and HBase 0.92) are eventually released
is not an option. :-/

So in order to help myself and hopefully also other readers of this
mailing list, I will summarize the steps I have taken so far to understand
and build Hadoop 0.20-append for use with HBase 0.90.2 and the problems I
have run into.  I will also list the pending issues and roadblocks that I
haven't solved yet.

- I checked out branch-0.20-append [1] according to the HBase instructions
  at [2] and ran a successful build via "ant mvn-install" [5].
- I inspected the code history of the 0.20.2 release and of
  branch-0.20-append (git show-branch release-0.20.2 branch-0.20-append)
  and noticed that the append branch is based on release 0.20.2.  In other
  words, there is not a single commit in the 0.20.2 release that is not
  also in branch-0.20-append. Good!
- FWIW, I compared the Hadoop JAR file shipped with HBase 0.90.1/0.90.2
  (hadoop-core-0.20-append-r1056497.jar) with the one I built from the
  latest version of branch-0.20-append.  I noticed that the JAR file in
  HBase seems to be missing the latest commit for HDFS-1554 (SVN rev 1057313
  aka git commit df0d79cc). In git terms, the Hadoop JAR file shipped
  in HBase is based on HEAD^1 of branch-0.20-append.  Is there a reason
  for not including the latest commit?
- I also discovered (like Mike Spreitzner did [3]) that there is a
  BlockChannel.class file in HBase's Hadoop JAR file that seems to come
  "out of nowhere".  I haven't found it or a reference to it anywhere in
  the source code.  I decompiled the class [4], and it appears to be an
  innocent file, maybe used for debugging. A build artifact?
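
For anyone who wants to repeat the JAR comparison described above, here is
a minimal sketch in Python.  It is not the method used for this mail; the
function names and the two path arguments are made up for illustration.
It lists entries that exist in only one of the two JAR files (which is how
a stray file such as BlockChannel.class would show up) and entries whose
checksums differ.

```python
import sys
import zipfile


def jar_entries(path):
    """Map each file entry in the JAR to its CRC32 checksum."""
    with zipfile.ZipFile(path) as jar:
        return {info.filename: info.CRC
                for info in jar.infolist() if not info.is_dir()}


def diff_jars(shipped_path, built_path):
    """Return (only_in_shipped, only_in_built, changed) as sorted lists."""
    shipped = jar_entries(shipped_path)
    built = jar_entries(built_path)
    only_shipped = sorted(set(shipped) - set(built))
    only_built = sorted(set(built) - set(shipped))
    # A differing CRC only means the bytes differ; a different compiler
    # version can cause that even when the source code is identical.
    changed = sorted(name for name in set(shipped) & set(built)
                     if shipped[name] != built[name])
    return only_shipped, only_built, changed


if __name__ == "__main__":
    # Hypothetical usage: python jardiff.py <shipped.jar> <built.jar>
    only_shipped, only_built, changed = diff_jars(sys.argv[1], sys.argv[2])
    print("only in shipped JAR:", only_shipped)
    print("only in built JAR:  ", only_built)
    print("different contents: ", changed)
```

A checksum mismatch alone does not prove a source difference, so the
"changed" list is only a starting point for a closer look.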

Then I tried two different builds:

1) A first build to replicate and test the Hadoop JAR shipped with HBase
   0.90.{1,2}, using all commit history up to SVN rev 1056491 aka git
   e499be8.  The last commit is "HDFS-1555 ..." from 07-Jan-11.
   In git terms, this is a build based on HEAD^1.
2) A second build to create the current version of the Hadoop append
   branch, using all commit history up to SVN rev 1057313 aka git
   df0d79cc.  The last commit is "HDFS-1554 ..." from 10-Jan-11.
   In git terms, this is a build based on HEAD, i.e. the latest version
   of branch-0.20-append.

Here are my findings:

1) When I run "ant test" for the append branch version apparently used by
   HBase 0.90.{1,2}, I consistently run into a build error in
   TestFileAppend4, logged to
   build/test/TEST-org.apache.hadoop.hdfs.TestFileAppend4.txt.
   Details are available at [10].
2) When I run "ant test" for the latest version of the append branch, I
   get the same error as before. However, I sometimes -- not always -- get
   additional failures/errors for
    * TEST-org.apache.hadoop.hdfs.server.namenode.TestEditLogRace.txt [11]
    * TEST-org.apache.hadoop.hdfs.TestMultiThreadedSync.txt [12]
   both of which look like "general" errors to me.  Maybe a problem with
   the machine on which I'm running the build and the tests?
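
To tell consistent failures from intermittent ones across several "ant
test" runs, a small script can collect the failing suites from the report
files.  The following Python sketch assumes the reports are written by
Ant's plain JUnit formatter, i.e. that each TEST-*.txt file contains a
summary line of the form "Tests run: N, Failures: N, Errors: N, ...".
That format and the function names are assumptions, not something taken
from the Hadoop build itself.

```python
import re
import sys
from pathlib import Path

# Summary line as assumed for Ant's plain JUnit formatter.
SUMMARY = re.compile(r"Tests run: (\d+), Failures: (\d+), Errors: (\d+)")


def summarize_report(text):
    """Return (tests, failures, errors), or None if no summary line exists."""
    match = SUMMARY.search(text)
    return tuple(int(g) for g in match.groups()) if match else None


def failing_suites(report_dir):
    """Map report file name -> (tests, failures, errors) for failing suites."""
    failing = {}
    for path in sorted(Path(report_dir).glob("TEST-*.txt")):
        counts = summarize_report(path.read_text(errors="replace"))
        if counts and (counts[1] or counts[2]):
            failing[path.name] = counts
    return failing


if __name__ == "__main__":
    # Hypothetical usage: python failing.py build/test
    for name, (tests, failures, errors) in failing_suites(sys.argv[1]).items():
        print(f"{name}: {tests} run, {failures} failures, {errors} errors")
```

Saving the output of each run and comparing the sets of file names shows
which suites fail every time and which fail only occasionally.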

This leads me to two questions:

1. Are the test errors described above a known issue that can be ignored?
   Or did I miss something when building the append branch?
   From what I have read, my build process should have produced a Hadoop
   JAR file that is equivalent to the one shipped with HBase.  So any
   error during my tests should have surfaced for the HBase build, too.

2. Is there a way to test whether my custom build is "correct"?  In other
   words, how can I find out whether append/sync works properly, so that
   HBase does not lose data at some point?
   Unfortunately, I haven't found any instructions to intentionally
   create such a data-loss scenario for verifying whether Hadoop/HBase
   handles it properly.  St.Ack, for instance, only talks about some
   basic tests he did himself [13].
   I know someone already asked this question before without receiving
   a good answer but hey -- there's always hope. :-)


Any feedback or pointers would be greatly appreciated!

I'm happy to experiment and to report back.  Since St.Ack's suggestion
to make a quick, official "append-ready" release of Hadoop for HBase [6]
was not pursued (I do not want to restart a discussion here), at least I
would like to help the community with a set of easy-to-follow instructions
for other people to get HBase and Hadoop 0.20.x up and running.

Best,
Michael


PS: And congratulations on getting 0.90.2 out. Your work is really
appreciated! :-)


[1] http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/
[2] http://hbase.apache.org/book/notsoquick.html#hadoop
[3] http://search-hadoop.com/m/mfUkf2EEiaf
[4] http://pastebin.ubuntu.com/587699/
[5] http://wiki.apache.org/hadoop/GitAndHadoop
[6] http://www.mail-archive.com/general@hadoop.apache.org/msg02543.html
[7] http://www.mail-archive.com/user@hbase.apache.org/msg06772.html
[8] http://www.mail-archive.com/user@hbase.apache.org/msg07060.html
[9] http://www.mail-archive.com/common-dev@hadoop.apache.org/msg02785.html
[10] http://pastebin.ubuntu.com/593073/
[11] http://pastebin.ubuntu.com/593075/
[12] http://pastebin.ubuntu.com/593076/
[13] http://www.mail-archive.com/user@hbase.apache.org/msg07158.html
