mahout-dev mailing list archives

From "Suneel Marthi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1525) train/validateAdaptiveLogistic
Date Thu, 24 Apr 2014 22:19:16 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980395#comment-13980395
] 

Suneel Marthi commented on MAHOUT-1525:
---------------------------------------

You shouldn't be using 0.7; it has long been retired and is unsupported. Please upgrade to
0.9 or work off trunk.

> train/validateAdaptiveLogistic
> ------------------------------
>
>                 Key: MAHOUT-1525
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1525
>             Project: Mahout
>          Issue Type: Question
>          Components: Classification
>    Affects Versions: 0.7
>            Reporter: Richard Scharrer
>              Labels: adaptiveLogisticRegression, newbie
>
> Hi,
> I tried to use trainAdaptiveLogistic and validateAdaptiveLogistic on my data, which looks like this:
> category, id, var1, var2, ...var72 (all numeric)
> I used the following settings:
> mahout trainAdaptiveLogistic --input resource/trainingData \
> --output ./model \
> --target category --categories 9 \
> --predictors a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 ..... \
> --types numeric \
> --passes 100 \
> --showperf
> mahout validateAdaptiveLogistic --input resource/testData --model model --confusion --defaultCategory none
> The output of validateAdaptiveLogistic is:
> Log-likelihood:Min=-5.54, Max=-0.04, Mean=-1.58, Median=-1.33
> =======================================================
> Confusion Matrix
> -------------------------------------------------------
> a    	b    	d    	e    	f    	g    	h    	i    	<--Classified as
> 14   	0    	0    	0    	0    	0    	0    	0    	 |  14    	a     = projekt
> 0    	18   	0    	0    	0    	0    	0    	0    	 |  18    	b     = news/aktuelles/presse
> 0    	0    	24   	0    	0    	0    	0    	0    	 |  24    	d     = lehrveranstaltung
> 0    	0    	0    	19   	0    	0    	0    	0    	 |  19    	e     = publikation
> 0    	0    	0    	0    	20   	0    	0    	0    	 |  20    	f     = event
> 0    	0    	0    	0    	0    	14   	0    	0    	 |  14    	g     = mitarbeiter/person
> 0    	0    	0    	0    	0    	0    	44   	0    	 |  44    	h     = übersicht
> 0    	0    	0    	0    	0    	0    	0    	13   	 |  13    	i     = institut
> (In case you were wondering, the categories are in German.)
> My problem is that this is impossible. I always get a perfect classification, even with
> just a small amount of training data. It doesn't even matter how many features I use; I tried
> it with all 72 and with only one. Am I missing something?
> Regards,
> Richard
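
The "perfect classification" in the quoted report can be confirmed directly from the confusion matrix. The following is a minimal sketch in Python, not part of Mahout; the variable names are illustrative, and the counts are copied from the matrix quoted above, in which every off-diagonal cell is 0.

```python
# Labels and diagonal counts copied from the quoted confusion matrix.
labels = ["a", "b", "d", "e", "f", "g", "h", "i"]
diagonal = [14, 18, 24, 19, 20, 14, 44, 13]

# Rebuild the full matrix: rows are true categories, columns are predictions.
n = len(labels)
matrix = [[diagonal[i] if i == j else 0 for j in range(n)] for i in range(n)]

correct = sum(matrix[i][i] for i in range(n))   # correctly classified rows
total = sum(sum(row) for row in matrix)         # all test rows
accuracy = correct / total

print(f"{correct} of {total} test rows correct, accuracy = {accuracy:.2f}")
print(f"{n} categories appear in the matrix")
```

All 166 test rows fall on the diagonal, so the accuracy is 1.00. The matrix lists eight labels, while the training command passes --categories 9.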



--
This message was sent by Atlassian JIRA
(v6.2#6252)
