hbase-user mailing list archives

From Marc Limotte <mslimo...@gmail.com>
Subject Re: hbase-user Digest 4 Dec 2009 20:53:21 -0000 Issue 763
Date Mon, 07 Dec 2009 18:37:21 GMT
As a data point, I'll describe how I'm doing my unit tests in a way that
bypasses the need to start up HBase.

I have two ways of using HBase: 1) as a Cascading <http://www.cascading.org/>
Sink (i.e. just loading rows of data in bulk), and 2) direct access for
lookups, deletes and updates to individual entries.

The first case is easy.  In the unit test, we just swap out the Cascading
HBase Sink (output tap) for a simple Lfs/SequenceFile sink (i.e. a file on
the local file system).  Then I can use standard Cascading methods to open
the Tap and verify its contents.

In the second case, all of our HBase access is encapsulated using a DAO
pattern.  It has typical methods like findByX, delete, update and so on.  To
test, we replace the DAO with a stub that keeps track of which delete and
update requests were made of it, and has canned data to return for the
findBy methods.
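
A minimal sketch of that stub approach, written in Java rather than the
Groovy our real tests use. Every name here (EntryDao, StubEntryDao,
EntryService, normalize) is hypothetical and made up for illustration; the
production DAO implementation that actually talks to HBase is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical DAO interface. In production, an implementation of this
// would wrap the HBase client; the logic under test never sees HBase.
interface EntryDao {
    String findById(String id);
    void update(String id, String value);
    void delete(String id);
}

// Stub for unit tests: returns canned data and records every request.
class StubEntryDao implements EntryDao {
    private final Map<String, String> canned = new HashMap<>();
    final List<String> updates = new ArrayList<>();
    final List<String> deletes = new ArrayList<>();

    // Registers canned data to be returned by findById.
    StubEntryDao with(String id, String value) {
        canned.put(id, value);
        return this;
    }

    @Override public String findById(String id) { return canned.get(id); }
    @Override public void update(String id, String value) { updates.add(id + "=" + value); }
    @Override public void delete(String id) { deletes.add(id); }
}

// Hypothetical logic under test; it depends only on the DAO interface.
class EntryService {
    private final EntryDao dao;

    EntryService(EntryDao dao) { this.dao = dao; }

    // Upper-cases a stored value, or deletes the entry if the value is empty.
    void normalize(String id) {
        String value = dao.findById(id);
        if (value == null) return;          // unknown id: nothing to do
        if (value.isEmpty()) dao.delete(id);
        else dao.update(id, value.toUpperCase());
    }
}
```

A test builds the stub with canned rows, runs the logic, and then inspects
the recorded updates and deletes; nothing needs to be started.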

This all seems to work pretty well. We can test all of our logic with really
fast tests that do not require HBase to be started up, and do not rely on
the state in HBase.  I have a few additional "Integration" tests that do
test out the actual HBase integration to make sure we're using the APIs
right.  Obviously, we run these less frequently.

Of course, this isn't really a Mock for HBase, as we stub out before the
HBase APIs.

One other note: this test code is all written in Groovy.

Marc Limotte



---------- Forwarded message ----------
> From: stack <stack@duboce.net>
> To: hbase-user@hadoop.apache.org
> Date: Fri, 4 Dec 2009 11:23:35 -0800
> Subject: Re: Please advise: best way to write a unit test for MapRed HBase
> case
> Dave:
>
> Sounds like a little mock MR framework.  Would it be worth developing
> further?  Does your little mock framework do multi-region tables?
>
> In general, it's about time we started up the conversation about what mock
> objects we'd want to make testing faster/easier.
>
> St.Ack
>
> On Thu, Dec 3, 2009 at 3:19 PM, Dave Latham <latham@davelink.net> wrote:
>
> > Hi Steve,
> >
> > I bumped into the same issue in wanting to test Map Reduce jobs that read
> > from and write to HBase tables more quickly.  What I ended up doing is
> > creating custom InputFormat and OutputFormat implementations that wrap the
> > TableInputFormat / TableOutputFormat and convert the hbase data to/from
> > object representations.  Then the map / reduce classes expect the objects
> > directly.  Then I have a test input and output format that provide those
> > objects directly from memory and when testing the jobs configure them to
> > use these test input formats.  The tests can then run faster.  Of course,
> > you need to be careful to test your input / output formats separately as
> > well as the object / hbase conversion code.
> >
> > I'm also interested in any techniques other people use to speed up tests
> > that need to interact with HBase.
> >
> > Dave
> >
> > On Thu, Dec 3, 2009 at 3:10 PM, Steve Kuo <kuosenhao@yahoo.com> wrote:
> >
> > > I have a class that does a regular map job and a TableReduce based
> > > reduce job.  This class works when called as the main class either from
> > > eclipse or on my pseudo cluster as long as hbase is up and running.  I
> > > would like to write a unit test for it and would like advice on the best
> > > way to proceed.
> > >
> > > The best I came up with after googling for "hbase unit test" is a page
> > > that suggests looking at org.apache.hadoop.hbase.TestTableMapReduce.  I
> > > was able to get this class to run after adding additional classes in:
> > >
> > > * org.apache.hadoop.hbase
> > > * org.apache.hadoop.hdfs
> > > * org.apache.hadoop.hdfs.server.datanode
> > > * org.apache.hadoop.mapred
> > > * org.apache.hadoop.net
> > > * jetty-6.1.14.jar
> > >
> > > After all this, the test worked but it was very slow as it had to start
> > > up a mini-cluster for dfs etc.  It seemed excessive that jetty was
> > > needed.
> > >
> > > Please advise on whether there is a simpler way to do unit tests.
> > >
> > > Thanks in advance.
> > >
> > >
> > >
> > >
> >
>
>
