hadoop-mapreduce-user mailing list archives

From Jeff Zhang <zjf...@gmail.com>
Subject How to select random n records using mapreduce ?
Date Mon, 27 Jun 2011 07:11:23 GMT
Hi all,

I'd like to select N random records from a large amount of data using
Hadoop, and I wonder how I can achieve this. My current idea is to have
each mapper task select N / mapper_number records. Does anyone have
experience with this?
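
One caveat with the N / mapper_number idea: it gives a uniform sample only if every mapper sees roughly the same number of records. Below is a minimal sketch of a different technique, in plain Java without the Hadoop API: tag each record with a random number, have each mapper keep the N records with the smallest tags, and let a single reducer keep the N smallest overall. The names RandomSampleSketch, Tagged, SmallestN, mapSide and reduceSide are hypothetical, and records are assumed to be strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical sketch, not Hadoop API code: simulates the map and reduce
// sides of a "random tag, keep the N smallest" sampling job in memory.
class RandomSampleSketch {

    /** A record paired with the random tag it was assigned. */
    static final class Tagged {
        final double tag;
        final String record;

        Tagged(double tag, String record) {
            this.tag = tag;
            this.record = record;
        }
    }

    /** Bounded collection that keeps the n entries with the smallest tags. */
    static final class SmallestN {
        private final int n;
        // Max-heap on tag, so the largest kept tag is cheap to evict.
        private final PriorityQueue<Tagged> heap = new PriorityQueue<>(
                Comparator.comparingDouble((Tagged t) -> t.tag).reversed());

        SmallestN(int n) {
            this.n = n;
        }

        void offer(Tagged t) {
            if (n <= 0) {
                return;
            }
            if (heap.size() < n) {
                heap.add(t);
            } else if (t.tag < heap.peek().tag) {
                heap.poll();
                heap.add(t);
            }
        }

        List<Tagged> contents() {
            return new ArrayList<>(heap);
        }
    }

    /** Map side: tag every record in one split, emit at most n candidates. */
    static List<Tagged> mapSide(Iterable<String> split, int n, Random rng) {
        SmallestN keep = new SmallestN(n);
        for (String record : split) {
            keep.offer(new Tagged(rng.nextDouble(), record));
        }
        return keep.contents();
    }

    /** Reduce side (single reducer): keep the n smallest tags overall. */
    static List<String> reduceSide(List<List<Tagged>> mapperOutputs, int n) {
        SmallestN keep = new SmallestN(n);
        for (List<Tagged> output : mapperOutputs) {
            for (Tagged t : output) {
                keep.offer(t);
            }
        }
        List<String> sample = new ArrayList<>();
        for (Tagged t : keep.contents()) {
            sample.add(t.record);
        }
        return sample;
    }
}
```

Because every record gets an independent random tag, the N smallest tags overall do not depend on how the records are split across mappers, and each mapper emits at most N candidates. In a real job the map side would presumably emit its candidates at the end of the task, and the job would be configured with one reducer.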

Best Regards

Jeff Zhang
