mahout-dev mailing list archives

From Drew Farris <drew.far...@gmail.com>
Subject Re: Re: M/R capturing line numbers in text files
Date Wed, 16 Jun 2010 23:54:38 GMT
Hi Shannon,

Could it be the method signature of your reduce method? The new API requires
that reduce take an Iterable instead of an Iterator.

As written, your method overloads reduce rather than overriding it, so it is
never called, and the reducer input is simply passed along as output (the
default Reducer behavior). This would account for the stack trace on line 32 of
the output in your post: you've set the output value class of the Reducer to
VectorWritable, but since your reduce isn't actually being called, the job is
getting the MatrixEntryWritable unchanged from the map phase.
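
To illustrate the mechanism, here is a minimal plain-Java sketch with no Hadoop
dependency. BaseReducer, IteratorReducer and IterableReducer are hypothetical
stand-ins I made up for this example, not Hadoop or Mahout classes; the only
assumption is that the base class behaves as described above (an identity
reduce that takes an Iterable).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReduceSignatureSketch {

    // Hypothetical stand-in for the new-API Reducer base class: its default
    // reduce() is an identity pass-through that emits every value unchanged.
    static class BaseReducer {
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            for (Integer v : values) {
                out.add(key + "=" + v);
            }
        }

        // The "framework" only ever calls the Iterable version of reduce().
        final void run(String key, List<Integer> values, List<String> out) {
            reduce(key, values, out);
        }
    }

    // Takes an Iterator: this is an overload, not an override, so run()
    // never reaches it and the identity reduce() above is used instead.
    static class IteratorReducer extends BaseReducer {
        protected void reduce(String key, Iterator<Integer> values, List<String> out) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            out.add(key + " sum=" + sum);
        }
    }

    // Takes an Iterable: a real override. With @Override in place, the
    // compiler would reject the Iterator signature used above.
    static class IterableReducer extends BaseReducer {
        @Override
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            out.add(key + " sum=" + sum);
        }
    }

    // Runs one reducer over a fixed key and three values, returning its output.
    public static List<String> runWith(BaseReducer reducer) {
        List<String> out = new ArrayList<>();
        reducer.run("row0", Arrays.asList(1, 2, 3), out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runWith(new IteratorReducer()));
        System.out.println(runWith(new IterableReducer()));
    }
}
```

The Iterator variant passes its three inputs straight through, while the
Iterable variant emits a single summed record. Annotating reduce with @Override
in your real Reducer should turn this kind of signature mismatch into a compile
error.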

The level of detail in the blog post is excellent. Unless I missed something, I
think the current patch in JIRA is an earlier version of the code than the one
referenced in your blog post. It would probably help to post updated patches
to JIRA when you're stuck, so that others can give them a try too.

HTH,

Drew

On Wed, Jun 16, 2010 at 4:51 PM, Shannon Quinn <squinn@gatech.edu> wrote:

> For any interested, I made a blog posting about this issue; perhaps it will
> help elucidate the problem.
>
>
> http://spectrallyclustered.wordpress.com/2010/06/16/sprint-1-getting-the-hang-of-mapreduce/
>
> Thanks again!
>
> Shannon
>
>
> -------- Original Message --------
> Subject:        Re: M/R capturing line numbers in text files
> Date:   Wed, 16 Jun 2010 09:53:51 -0400
> From:   Shannon Quinn <squinn@gatech.edu>
> To:     dev@mahout.apache.org
>
>
>
> Perfect. Thank you.
>
> Unfortunately, now I receive this exception:
>
> java.io.IOException: wrong value class:
> org.apache.mahout.math.hadoop.DistributedRowMatrix$MatrixEntryWritable
> is not class org.apache.mahout.math.VectorWritable
>
> My Mapper's value output and Reducer's input is a
> DRM.MatrixEntryWritable, and is specified as such in the Conf object.
> The Reducer's output is a VectorWritable. The stack trace doesn't
> mention any code of mine, so I'm not sure how to approach this.
>
>> The basic problem is that something has produced data that uses a long as an
>> ID and your mapper is expecting an int.  Have you posted your code as a
>> patch on the jira or a git link?
>>
> I attached a patch to my project's ticket on jira (363).
>
> Thanks again!
>
> Regards,
> Shannon
>
>
