hadoop-common-user mailing list archives

From David Hall <d...@cs.stanford.edu>
Subject Testing Mappers in Hadoop 0.20.0
Date Wed, 22 Jul 2009 21:49:15 GMT

I'm a student working with Apache Mahout for the Google Summer of
Code. We recently moved to Hadoop 0.20.0, and I have been porting my
code to the new API. Unfortunately, I (and the whole project team) seem
to have run into a problem when it comes to testing our Mappers.

Historically, we would create a Mapper in a unit test, and a special
"DummyOutputCollector", which was essentially a multimap dressed up to
conform to OutputCollector. In Hadoop 0.20.0, this isn't possible
anymore, because Mapper.map() no longer takes an OutputCollector. It
takes a Mapper.Context, which is an inner class of Mapper.

It's of course possible to dress up the Context in something else
(say, something just like an OutputCollector), and to specify that
Mahout Mappers should just delegate to a method that takes an
OutputCollector. But this does not seem very idiomatic.
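
To make that workaround concrete, here is a rough sketch of the
delegation idea in plain Java, with no Hadoop imports. Every name in it
(Emitter, DummyEmitter, TokenCountLogic) is made up for illustration and
is not part of Hadoop or Mahout. The assumption is that a real
Mapper.map(key, value, context) would call the same static method with
an Emitter that forwards to context.write(...).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the old OutputCollector; not a Hadoop type.
interface Emitter<K, V> {
    void emit(K key, V value);
}

// Multimap-backed collector for unit tests (the "DummyOutputCollector" idea).
class DummyEmitter<K, V> implements Emitter<K, V> {
    private final Map<K, List<V>> data = new LinkedHashMap<>();

    @Override
    public void emit(K key, V value) {
        data.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Returns every value emitted for the key, or an empty list if none.
    List<V> valuesFor(K key) {
        List<V> values = data.get(key);
        return values == null ? new ArrayList<>() : values;
    }
}

// Hypothetical map logic that never sees a Context, so a test can call it
// directly. A real Mapper would delegate to it from map(...).
class TokenCountLogic {
    static void map(String line, Emitter<String, Integer> out) {
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.emit(token, 1);
            }
        }
    }
}

class DelegationSketch {
    public static void main(String[] args) {
        DummyEmitter<String, Integer> out = new DummyEmitter<>();
        TokenCountLogic.map("a b a", out);
        System.out.println(out.valuesFor("a")); // prints [1, 1]
    }
}
```

This keeps the test free of any Context, at the cost of an extra layer
of indirection in every Mapper.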

All of this is to ask: what would be a "best practice" for testing
Mappers and Reducers in 0.20.0?

David Hall
