mahout-dev mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] Commented: (MAHOUT-212) Need random sampler for use in reducers
Date Mon, 07 Dec 2009 18:29:18 GMT


Sean Owen commented on MAHOUT-212:

Yeah, test injection was the idea behind using RandomUtils: in test mode it returns a
generator that uses the same seed every time. The unit tests do (or should) set that mode
globally, so that results are deterministic. And yes, the returned generator is a
MersenneTwisterRNG, which just extends Random.
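
To illustrate the seed-injection idea, here is a minimal sketch. It is not Mahout's actual RandomUtils: the class name SeededRandoms, the TEST_SEED value, and both method names are hypothetical, and it returns a plain java.util.Random rather than a MersenneTwisterRNG.

```java
import java.util.Random;

// Hypothetical stand-in for a RandomUtils-style helper; not Mahout's code.
final class SeededRandoms {

    // Arbitrary fixed seed, an assumption made for this sketch.
    private static final long TEST_SEED = 42L;

    // Flipped once, globally, by the test setup.
    private static volatile boolean testMode = false;

    private SeededRandoms() {
    }

    // Package-private: only tests in the same package should call this.
    static void useTestSeed() {
        testMode = true;
    }

    // Returns a fixed-seed generator in test mode, a normally seeded one otherwise.
    static Random getRandom() {
        return testMode ? new Random(TEST_SEED) : new Random();
    }
}
```

Code under test asks the helper for its generator instead of constructing one, so a test that calls useTestSeed() first sees the same sequence on every run.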

Yes anything for testing should probably be package-private.

(I'd also suggest making the instance fields private here. I'm not sure there's a big case
for extension, or at least not one that wouldn't be better served by explicit getters.)

I don't care about the test naming convention.

Once this is in place I'll put my similar Iterator next to it.

> Need random sampler for use in reducers
> ---------------------------------------
>                 Key: MAHOUT-212
>                 URL:
>             Project: Mahout
>          Issue Type: Bug
>          Components: Utils
>    Affects Versions: 0.2
>            Reporter: Ted Dunning
>            Assignee: Sean Owen
>             Fix For: 0.3
>         Attachments: MAHOUT-212.patch
> For a variety of mining algorithms, it helps to have a uniform way to only process a
> sub-set of the records in a reducer.
> As such, I have written a simple generic sampler that filters an Iterator returning a
> fair sample of at most a specified size.
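
One common way to draw a fair sample of at most a fixed size from an Iterator of unknown length is reservoir sampling. The sketch below shows that technique only; it is not the code in MAHOUT-212.patch, and the class name ReservoirSampler and the sample method are hypothetical. The Random is passed in so that a fixed-seed generator can be injected in tests.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of reservoir sampling; not the attached patch.
final class ReservoirSampler {

    private ReservoirSampler() {
    }

    // Returns at most maxSize items; each input item is kept with equal probability.
    static <T> List<T> sample(Iterator<T> source, int maxSize, Random random) {
        if (maxSize < 0) {
            throw new IllegalArgumentException("maxSize must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(maxSize);
        long seen = 0;
        while (source.hasNext()) {
            T item = source.next();
            seen++;
            if (reservoir.size() < maxSize) {
                // Fill phase: the first maxSize items are always kept.
                reservoir.add(item);
            } else {
                // Pick a slot uniformly in [0, seen); the new item replaces an
                // existing one only if the slot falls inside the reservoir,
                // which happens with probability maxSize / seen.
                long slot = (long) (random.nextDouble() * seen);
                if (slot < maxSize) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }
}
```

In a reducer, the values iterator would be passed as source. Memory use is bounded by maxSize no matter how many records arrive, and an input shorter than maxSize is returned whole, in order.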

