mahout-user mailing list archives

From Mat Kelcey <matthew.kel...@gmail.com>
Subject Re: one vector or many vectors?
Date Thu, 01 Nov 2012 14:45:08 GMT
> I am using the SGD classifier to classify our articles. I want to
> train a new model, but there is a problem. I can give the learner one
> large article or several small articles, but I extract only one vector
> per article. Is there any difference, for the learner, between training
> on one vector and training on many vectors? Should I provide the learner
> with one large article or many small articles?


i'm not sure i understand your question, but i guess you're saying that
each article is a separate training example?

in terms of differing lengths you might want to try some different
normalisation approaches but i'd try without anything first.

http://nlp.stanford.edu/IR-book/html/htmledition/variant-tf-idf-functions-1.html
is a good place to start
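
To make the length point concrete, here is a minimal sketch of two common normalisation steps (sublinear tf scaling followed by L2 length normalisation), written in plain Python rather than against the Mahout API. The helper names (`term_counts`, `sublinear_tf`, `l2_normalise`) and the toy documents are hypothetical and only for illustration; whether these steps actually help with your articles is something to check empirically.

```python
import math
from collections import Counter


def term_counts(text):
    """Raw term frequencies from naive whitespace tokenisation (illustrative only)."""
    return Counter(text.lower().split())


def sublinear_tf(counts):
    """Sublinear tf scaling: replace each count c > 0 with 1 + log(c)."""
    return {term: 1.0 + math.log(c) for term, c in counts.items() if c > 0}


def l2_normalise(weights):
    """Scale a sparse vector to unit Euclidean length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0.0:
        return dict(weights)
    return {term: w / norm for term, w in weights.items()}


# Hypothetical toy data: a short article, and a long one with the same term mix.
short_doc = "mahout sgd classifier"
long_doc = " ".join([short_doc] * 50)

short_vec = l2_normalise(sublinear_tf(term_counts(short_doc)))
long_vec = l2_normalise(sublinear_tf(term_counts(long_doc)))

# After normalisation the two vectors are directly comparable even though
# the raw counts differ by a factor of 50.
for term in sorted(short_vec):
    print(term, round(short_vec[term], 4), round(long_vec[term], 4))
```

Without normalisation, the long article's vector would have far larger components than the short one's, so a single long example can produce a much bigger gradient step in an SGD learner than a short one. Trying the unnormalised vectors first, as suggested above, gives a baseline to compare against.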

mat
