mahout-dev mailing list archives

From Jeff Eastman
Subject Re: tf(idf)-ing the cluster output
Date Tue, 07 Feb 2012 16:32:14 GMT
Sure, love to hear more about your use case and pipeline. Can you 
describe the steps you are performing and how the results get utilized?


On 2/7/12 9:28 AM, Viktor Gal wrote:
> Hi,
> I'm using Mahout for computer vision, so my pipeline is a bit different from the text
> processing pipeline: after I've acquired the feature vectors I run a clustering, and once
> I have the cluster centers and have assigned the original feature vectors to clusters, I
> calculate the TF(IDF) vectors. This is quite a standard thing nowadays in computer vision...
> So I've implemented the part that creates TF(IDF) vectors from the cluster output, based
> on the DocumentVectorizer class. If anybody thinks it'd be good to have this tool in Mahout,
> let me know and I'll create an issue for it in JIRA and upload my patches there.
> cheers,
> viktor
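
For readers unfamiliar with the approach, the last step of the pipeline described above (turning cluster assignments into TF(IDF) vectors) might look roughly like the sketch below. It is plain Python, not Mahout code and not Viktor's patch. The function names, the input layout (one list of cluster ids per image) and the weighting idf = log(N / df) are all assumptions made for illustration; Mahout's own TF-IDF weighting may differ.

```python
import math
from collections import Counter


def tf_vectors(assignments, num_clusters):
    """Build one term-frequency vector per image.

    assignments: hypothetical input, one list per image holding the cluster id
    that each of the image's local feature vectors was assigned to.
    """
    vectors = []
    for cluster_ids in assignments:
        counts = Counter(cluster_ids)
        # Each cluster acts as a "term"; its count is the term frequency.
        vectors.append([counts.get(k, 0) for k in range(num_clusters)])
    return vectors


def tfidf_vectors(assignments, num_clusters):
    """Weight the TF vectors by idf = log(N / df) (an assumed weighting)."""
    tf = tf_vectors(assignments, num_clusters)
    n_docs = len(tf)
    # Document frequency: number of images in which each cluster occurs.
    df = [sum(1 for vec in tf if vec[k] > 0) for k in range(num_clusters)]
    # Clusters that never occur get weight 0 to avoid division by zero.
    idf = [math.log(n_docs / d) if d > 0 else 0.0 for d in df]
    return [[count * w for count, w in zip(vec, idf)] for vec in tf]
```

With this weighting, a cluster that occurs in every image gets weight 0, so only clusters that help distinguish images contribute to the final vectors.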
