mahout-dev mailing list archives

From Raphael Cendrillon <cendrillon1...@gmail.com>
Subject Re: [jira] [Commented] (MAHOUT-904) SplitInput should support randomizing the input
Date Fri, 23 Dec 2011 16:34:28 GMT
Thanks Sean. Currently I'm thinking of reading out the current key class from the SequenceFile
and just propagating it through. Do you think that's reasonable?
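
For illustration only, a minimal sketch of the idea: shuffle the records, split them, and carry the key type through unchanged. This is not the MAHOUT-904 patch and does not use Hadoop. The in-memory list stands in for a SequenceFile, and the names RandomSplitSketch, Split, and randomSplit are hypothetical. With a real SequenceFile, the key class would presumably be read from the reader and passed to the writers rather than expressed as a generic type parameter.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class RandomSplitSketch {

    /** Holds the two output partitions; K and V are whatever types the input used. */
    public static final class Split<K, V> {
        public final List<Map.Entry<K, V>> train;
        public final List<Map.Entry<K, V>> test;

        Split(List<Map.Entry<K, V>> train, List<Map.Entry<K, V>> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the records with a seeded Random, then cuts the
     * shuffled list into a test partition and a training partition.
     * The key type K is never inspected or converted, only passed through.
     */
    public static <K, V> Split<K, V> randomSplit(
            List<Map.Entry<K, V>> records, double testFraction, long seed) {
        if (testFraction < 0.0 || testFraction > 1.0) {
            throw new IllegalArgumentException("testFraction must be in [0, 1]");
        }
        // Copy first so the caller's list keeps its original order.
        List<Map.Entry<K, V>> shuffled = new ArrayList<>(records);
        Collections.shuffle(shuffled, new Random(seed));

        int testSize = (int) Math.round(shuffled.size() * testFraction);
        List<Map.Entry<K, V>> test = new ArrayList<>(shuffled.subList(0, testSize));
        List<Map.Entry<K, V>> train =
                new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
        return new Split<>(train, test);
    }

    public static void main(String[] args) {
        // Hypothetical input: ten records keyed by Long.
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        for (long i = 0; i < 10; i++) {
            records.add(new AbstractMap.SimpleEntry<>(i, "doc-" + i));
        }
        Split<Long, String> split = randomSplit(records, 0.2, 42L);
        System.out.println("train=" + split.train.size() + " test=" + split.test.size());
    }
}
```

Seeding the Random keeps the split reproducible between runs, which helps when comparing models trained on the same train/test files.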

On Dec 23, 2011, at 4:52 AM, "Sean Owen (Commented) (JIRA)" <jira@apache.org> wrote:

> 
>    [ https://issues.apache.org/jira/browse/MAHOUT-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175408#comment-13175408 ]
> 
> Sean Owen commented on MAHOUT-904:
> ----------------------------------
> 
> (I don't know if this is a relevant comment, but we ought to be using VarIntWritable
> and VarLongWritable, not IntWritable and LongWritable, for better space savings.)
> 
>> SplitInput should support randomizing the input
>> -----------------------------------------------
>> 
>>                Key: MAHOUT-904
>>                URL: https://issues.apache.org/jira/browse/MAHOUT-904
>>            Project: Mahout
>>         Issue Type: Improvement
>>           Reporter: Grant Ingersoll
>>           Assignee: Raphael Cendrillon
>>             Labels: MAHOUT_INTRO_CONTRIBUTE
>>        Attachments: MAHOUT-904.patch, MAHOUT-904.patch, MAHOUT-904.patch, MAHOUT-904.patch,
>>                     MAHOUT-904.patch, MAHOUT-904.patch
>> 
>> 
>> For some learning tasks, we need the input to be randomized (SGD) instead of blocks
>> of labels all at once.  SplitInput is a useful tool for setting up train/test files but it
>> currently doesn't support randomizing the input.
> 
