hadoop-mapreduce-user mailing list archives

From Ravi Mutyala <r...@hortonworks.com>
Subject Re: Getting started recommendations
Date Fri, 11 Jan 2013 11:20:53 GMT
On Fri, Jan 11, 2013 at 4:29 AM, John Lilley <john.lilley@redpoint.net> wrote:

> Where would we find some “big data” files that people have used for
> testing purposes?


Some of the most commonly used 'Big Data' files for testing are the Global
Weather Data from NCDC (ftp://ftp.ncdc.noaa.gov/pub/data/gsod), the Enron
emails, and airline data. You can also browse
http://aws.amazon.com/datasets/ to see what interests you the most.
