mahout-dev mailing list archives

From Shannon Quinn
Subject Re: M/R capturing line numbers in text files
Date Thu, 17 Jun 2010 03:38:01 GMT
Hi Drew,

Amazing! That fixed it! I figured it was something that simple; I guess 
that in refactoring my first attempt, which used the older API, one tiny 
detail (the reduce() parameter type: Iterable vs. Iterator? wow) fell 
through.

> The way it is written now, your reduce method would never be called and thus
> the reducer input is simply passed along as output (default Reducer
> behavior). This would account for the stack trace on line 32 of the output
> on your post -- you've set the output class of the Reducer to be
> VectorWritable, but since reduce isn't actually being called, it's getting
> the MatrixEntryWritable unchanged from the Map job.

I noticed a few weeks ago that "reduce()" wasn't abstract, but since 
then I somehow got it into my head that it was, and hence that the 
program wouldn't run without an implementation. I see that the default 
implementation is the identity function, but even so, it still seems to 
me like something that should be abstract.
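
For anyone reading along in the archives, here is a minimal sketch of 
the pitfall in plain Java. The classes below (BaseReducer, BrokenReducer, 
FixedReducer) are hypothetical stand-ins, not the Hadoop classes; they 
only mimic a base class whose reduce() is concrete and passes its input 
through. A subclass that declares reduce() with an Iterator parameter 
adds an overload instead of an override, so the framework-style caller 
still reaches the identity version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReduceOverloadDemo {

    // Hypothetical stand-in for a reducer base class: reduce() is concrete
    // and simply passes every (key, value) pair through unchanged.
    static class BaseReducer {
        protected final List<String> out = new ArrayList<>();

        void reduce(String key, Iterable<Integer> values) {
            for (Integer v : values) {
                out.add(key + "=" + v);
            }
        }

        // Plays the role of the framework: it always calls the
        // (String, Iterable) signature.
        List<String> run(String key, List<Integer> values) {
            out.clear();
            reduce(key, values);
            return new ArrayList<>(out);
        }
    }

    // Iterator instead of Iterable: this is a new overload, so run() never
    // calls it. Adding @Override here would be rejected by the compiler,
    // which is a cheap way to catch the mistake.
    static class BrokenReducer extends BaseReducer {
        void reduce(String key, Iterator<Integer> values) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            out.add(key + "=" + sum);
        }
    }

    // Matching signature: this really overrides the identity version.
    static class FixedReducer extends BaseReducer {
        @Override
        void reduce(String key, Iterable<Integer> values) {
            int sum = 0;
            for (Integer v : values) {
                sum += v;
            }
            out.add(key + "=" + sum);
        }
    }

    static List<String> runBroken() {
        return new BrokenReducer().run("a", Arrays.asList(1, 2, 3));
    }

    static List<String> runFixed() {
        return new FixedReducer().run("a", Arrays.asList(1, 2, 3));
    }

    public static void main(String[] args) {
        System.out.println(runBroken()); // prints [a=1, a=2, a=3]
        System.out.println(runFixed());  // prints [a=6]
    }
}
```

The broken variant compiles and runs without complaint, which is why the 
symptom only shows up later as a type mismatch on the output.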

> The detail level in the blog post is excellent. Unless I missed something I
> think the current patch in jira is an earlier version of the code than that
> referenced in your blog post? It is probably helpful to post updated patches
> to jira too when you're stuck. That way others can give it a try too.

Thank you! I'd like to keep both the blog and the Google repo I'm using 
to mirror my code as up-to-date as possible. I believe the jira patch 
was indeed an earlier version, so I went ahead and added an updated 
patch that includes the fix you suggested, and I will post updated 
patches in the future when I run into problems.

Thank you again! I figured it would be grossly simple (literally, change 
three letters in the method header), but I greatly appreciate the help.

