mahout-user mailing list archives

From Daniele Volpi
Subject Decision Forest and text classification
Date Sun, 12 Feb 2012 17:03:33 GMT
Hi everyone,
I'd like to run the Decision Forest classifier on the 20 newsgroups dataset.
According to the documentation, the Mahout implementation accepts only
numerical or categorical attributes, so the only way to do it is to
transform the documents into fixed-length vectors (maybe using tf-idf as
numerical values), plus one cell for the label, and put them in CSV files. Is it
What are the simplest steps to do it?
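
For illustration, the transformation described above could be sketched roughly as follows. This is a minimal sketch in plain Python, not Mahout code: the helper names (tokenize, build_vocabulary, tfidf_rows, write_csv), the vocabulary size, the choice to keep the most frequent terms, and the placement of the label in the last column are all assumptions made for the example. Telling Mahout which column holds the label is a separate step that is not shown here.

```python
import csv
import io
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(docs, size):
    """Pick the `size` terms with the highest document frequency.

    Ties are broken alphabetically so the column order is deterministic.
    Returns the ordered vocabulary and the document-frequency counter.
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:size]], df


def tfidf_rows(texts, labels, vocab_size=1000):
    """Turn each document into a fixed-length tf-idf vector plus its label.

    Every row has exactly len(vocab) numeric cells followed by one label
    cell (label-last is an assumption of this sketch).
    """
    docs = [tokenize(t) for t in texts]
    vocab, df = build_vocabulary(docs, vocab_size)
    n_docs = len(docs)
    rows = []
    for tokens, label in zip(docs, labels):
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        vector = [(tf[term] / total) * math.log(n_docs / df[term])
                  for term in vocab]
        rows.append(vector + [label])
    return vocab, rows


def write_csv(rows, out):
    """Write one document per line, without a header row."""
    csv.writer(out).writerows(rows)


if __name__ == "__main__":
    # Toy stand-in for the 20 newsgroups documents (hypothetical data).
    texts = ["the cat sat", "the dog barked"]
    labels = ["rec.pets.cats", "rec.pets.dogs"]
    vocab, rows = tfidf_rows(texts, labels, vocab_size=3)
    buffer = io.StringIO()
    write_csv(rows, buffer)
    print(vocab)
    print(buffer.getvalue())
```

With the real dataset, the texts and labels would come from reading each newsgroup directory, and the same vocabulary would have to be reused for the training and test sets so that the columns line up.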

