datafu-dev mailing list archives

From "jian wang (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (DATAFU-21) Probability weighted sampling without reservoir
Date Sun, 16 Feb 2014 13:05:20 GMT

     [ https://issues.apache.org/jira/browse/DATAFU-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jian wang reassigned DATAFU-21:
-------------------------------

    Assignee: jian wang

> Probability weighted sampling without reservoir
> -----------------------------------------------
>
>                 Key: DATAFU-21
>                 URL: https://issues.apache.org/jira/browse/DATAFU-21
>             Project: DataFu
>          Issue Type: New Feature
>         Environment: Mac OS, Linux
>            Reporter: jian wang
>            Assignee: jian wang
>
> This issue tracks the investigation into a weighted sampler that does not use an
> internal reservoir.
> At present, SimpleRandomSample implements a good acceptance-rejection sampling
> algorithm for probability-based random sampling. The weighted sampler could reuse
> SimpleRandomSample with a slight modification.
> The modification is this: SimpleRandomSample currently generates a uniform random
> number in (0, 1) as the random variable used to accept or reject an item. The weighted
> sampler could instead generate this random variable from the item's weight, so that it
> still lies in (0, 1) and each item's random variable remains independent of the others.
> The correctness of this approach, and how to implement it efficiently, still need
> further thought and experimentation.
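
One possible reading of the idea above, as a minimal Python sketch rather than the DataFu implementation: draw a uniform number u per item and transform it to u^(1/w), which still lies in (0, 1) and is independent across items. The transform (borrowed from weighted-key sampling), the function names, and the Bernoulli-style threshold 1 - p are all assumptions made for illustration; the issue does not specify them, and SimpleRandomSample's exact-size acceptance-rejection logic is not reproduced here.

```python
import random


def weighted_key(weight, rng):
    """Hypothetical helper: map a uniform draw to a weight-dependent key.

    Larger weights push the key toward 1. The key stays in [0, 1].
    """
    if weight <= 0:
        raise ValueError("weight must be positive")
    u = rng.random()  # uniform in [0.0, 1.0)
    return u ** (1.0 / weight)


def weighted_bernoulli_sample(items, p, seed=None):
    """Keep each (item, weight) pair independently, with no reservoir.

    An item is accepted when its key exceeds 1 - p, which happens with
    probability 1 - (1 - p) ** weight. For weight == 1 this reduces to
    plain Bernoulli sampling with probability p.
    """
    rng = random.Random(seed)
    threshold = 1.0 - p
    return [item for item, w in items if weighted_key(w, rng) > threshold]


if __name__ == "__main__":
    data = [("light", 1.0)] * 1000 + [("heavy", 4.0)] * 1000
    sample = weighted_bernoulli_sample(data, p=0.1, seed=42)
    print(sample.count("light"), sample.count("heavy"))
```

Under these assumptions each item is decided in a single pass using only its own weight and one random draw, so no reservoir is kept. The sample size is random rather than fixed, which is one of the points that would still need the further analysis the issue calls for.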



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
