mahout-user mailing list archives

From "Lancaster, Robert (Orbitz)" <>
Subject NaiveBayes and Classification of non-documents
Date Thu, 02 Jun 2011 14:40:25 GMT
I'm looking at the Mahout implementation of Naive Bayes for a classification task, but the
language around it appears to be document-centric.  Is it possible to use the Mahout
implementation of NB for a classification task that doesn't involve documents?

I have about 80 million records, each with a small number of features.  The ARFF header looks
like this (the numeric features could easily be nominalized if need be):

@RELATION        relation
@ATTRIBUTE      featurea    NUMERIC
@ATTRIBUTE      featureb    {1,2,3,4,5,6,7}
@ATTRIBUTE      featurec     {1,2,3,4,5,6,7}
@ATTRIBUTE      featured     NUMERIC
@ATTRIBUTE      featuree        NUMERIC
@ATTRIBUTE      featuref {0,1}
@ATTRIBUTE      target  {0,1}
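
To make "nominalized" concrete, here is a minimal sketch of how one such record could be binned and flattened into feature=value tokens, roughly the shape a document-oriented classifier works with. It does not use any Mahout API; the bin edges, the sample record, and the helper names (`nominalize`, `record_to_tokens`) are hypothetical and only for illustration.

```python
from bisect import bisect_right

# Hypothetical bin edges for the numeric features; real cut points
# would have to be chosen from the actual data.
BIN_EDGES = {
    "featurea": [10.0, 100.0],
    "featured": [0.5],
}


def nominalize(value, edges):
    """Return the index of the bin that `value` falls into."""
    return bisect_right(edges, value)


def record_to_tokens(record):
    """Turn one record (feature name -> value) into 'name=value' tokens."""
    tokens = []
    for name, value in record.items():
        if name in BIN_EDGES:
            # Numeric feature: replace the raw value with its bin label.
            value = "bin%d" % nominalize(value, BIN_EDGES[name])
        tokens.append("%s=%s" % (name, value))
    return tokens


# Hypothetical sample record using a few of the attributes above.
record = {"featurea": 42.0, "featureb": 3, "featurec": 7, "featured": 0.1}
print(" ".join(record_to_tokens(record)))
```

Each record then becomes a short bag of tokens, with the target value kept aside as the label.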
