mahout-user mailing list archives

From Zach Richardson <z...@raveldata.com>
Subject Re: mahout LDA Data Set
Date Sat, 24 Sep 2011 12:51:20 GMT
It sounds like you need to separate your articles into two groups, travel
and not travel.  Then I would train one of the classifiers on those two
groups, e.g. naive Bayes or logistic regression, rather than use LDA.
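
A minimal sketch of the travel / not-travel setup, in plain Python rather
than Mahout, just to show the shape of the approach. It is a small
multinomial naive Bayes with add-one smoothing; the training sentences, the
label names, and the helper names (tokenize, train, classify) are all made up
for illustration and are not Mahout's API.

```python
import math
from collections import Counter

# Hypothetical labelled examples; real training data would be your articles.
TRAINING_DOCS = [
    ("cheap flights and hotel deals for your beach vacation", "travel"),
    ("a guide to hiking trails and hostels in peru", "travel"),
    ("the central bank raised interest rates again", "not_travel"),
    ("the team won the championship game last night", "not_travel"),
]

def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()

def train(docs):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def classify(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        # Log prior from the share of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING_DOCS)
print(classify(model, "best hotel and flights for a beach holiday"))
```

With real data you would hold out some labelled articles to check accuracy
before trusting the predictions.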

I would recommend getting the Mahout in Action book, or looking at the
20 Newsgroups example in the Mahout code.

If you have very few training examples, you might also have good luck with
a linear SVM such as LIBLINEAR.

Zach


On Sat, Sep 24, 2011 at 2:03 AM, Biju Balakrishnan <bijubkbk@gmail.com> wrote:

> > Are your newspaper articles only travel, and you want to categorize them
> > into finer categories? Or do you have a ton of newspaper articles and want
> > to identify articles that are about travel within those articles?
> >
>
> I just have articles of all categories and just need to identify the travel
> articles.
>



-- 
Zach Richardson
Ravel, Co-founder
Austin, TX
zach@raveldata.com
512.825.6031
