mahout-dev mailing list archives

From Shannon Quinn
Subject M/R over two matrices, and computing the median
Date Fri, 30 Jul 2010 15:54:14 GMT
Hi all,

Two quick questions:

1) If I'm in a Mapper, and I'm trying to access two matrices of data (the
rows of one of them form the VectorWritables that are the input to the
Mapper; the other is a Path argument to the cache), how could I access the
same row in both matrices simultaneously? My first instinct is to use the
IntWritable key input and simply access that same row from the saved Path,
but I'm not sure how the SequenceFile index schemes are set up. For example,
if I have two DistributedRowMatrices, would the same key reference the same
row in both?

2) I looked through the Mahout math package and nothing stood out: is there
an easy way to compute the median value of a Vector?
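
For illustration, here is a minimal sketch of the median computation in
question 2, written in plain Java. It assumes the vector's values have already
been copied into a double[]; the class and method names are hypothetical and
are not part of Mahout.

```java
import java.util.Arrays;

// Hypothetical helper, not a Mahout class: computes the median of a
// dense array of values by sorting a copy of it.
final class MedianSketch {

    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("median of an empty array is undefined");
        }
        // Sort a copy so the caller's array is left unchanged.
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            // Odd length: the middle element is the median.
            return sorted[mid];
        }
        // Even length: average the two middle elements.
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

Sorting costs O(n log n) per vector; a selection algorithm would avoid the
full sort, but the sorted copy is the simplest version to check by hand.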


