hadoop-common-user mailing list archives

From Dave Viner <davevi...@gmail.com>
Subject Re: Data for Testing in Hadoop
Date Tue, 04 Jan 2011 07:37:46 GMT
How about http://aws.amazon.com/datasets?_encoding=UTF8&jiveRedirect=1 ?

Just the first one (WestburyLab USENET corpus) is 40GB.  I suspect you can
find different formats and data sizes there.

Dave Viner


On Mon, Jan 3, 2011 at 11:31 PM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:

> Dear all,
>
> Designing the architecture is very important for Hadoop production
> clusters.
>
> We are researching how to run Hadoop on individual nodes and in a cloud
> environment (VMs).
>
> For this, I require some data for testing. Could anyone send me links to
> datasets of different sizes (10 GB, 20 GB, 30 GB, 50 GB)?
> I would be grateful for your help.
>
>
> Thanks & Regards
>
> Adarsh Sharma
>
>
