hadoop-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: I need some raw big data
Date Fri, 07 Dec 2012 21:55:52 GMT
Another suggestion is Google Books Ngrams:

http://storage.googleapis.com/books/ngrams/books/datasetsv2.html


On Fri, Dec 7, 2012 at 7:57 AM, Phillip Rhodes <motley.crue.fan@gmail.com> wrote:

> On Fri, Dec 7, 2012 at 10:48 AM, Harsh J <harsh@cloudera.com> wrote:
> >
> > On Fri, Dec 7, 2012 at 8:31 PM, Yin Steve <steveyin92@gmail.com> wrote:
> >>  Hello, I'm Steve who need some raw big data for studying mapreduce
> >> programming. Where can i find them? especially those about weblog, traffic
> >> info etc. My English is not so well, if you can give me a URL which directly
> >> help me download the big file, That'll be great.
> >> Waiting for your reply......
>
> Try some of the links off of this Quora thread:
>
>
> http://www.quora.com/Data/Where-can-I-find-large-datasets-for-modeling-confidence-during-the-financial-crisis-which-is-open-to-the-public
>
> You might also try googling "Enron corpus". Or check out CommonCrawl.org.
>
>
> Phil
>
