hadoop-common-user mailing list archives

From Dave Viner <davevi...@gmail.com>
Subject Re: Data for Testing in Hadoop
Date Wed, 05 Jan 2011 03:45:32 GMT
Also, Amazon offers free public data sets at:

http://aws.amazon.com/datasets?_encoding=UTF8&jiveRedirect=1




On Tue, Jan 4, 2011 at 7:28 PM, Lance Norskog <goksron@gmail.com> wrote:

> https://cwiki.apache.org/confluence/display/MAHOUT/Collections
>
> All the collections you can imagine.
>
> On Tue, Jan 4, 2011 at 12:28 AM, Harsh J <qwertymaniac@gmail.com> wrote:
> > You can use MR to generate the data itself. Check out GridMix in
> > Hadoop, or PigMix from Pig, for examples of general load tests.
> >
> > On Tue, Jan 4, 2011 at 1:01 PM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
> >> Dear all,
> >>
> >> Designing the architecture is very important for Hadoop in production
> >> clusters.
> >>
> >> We are researching how to run Hadoop in the cloud, on individual nodes
> >> and in a cloud environment (VMs).
> >>
> >> For this, I require some data for testing. Would anyone send me some
> >> links for data of different sizes (10 GB, 20 GB, 30 GB, 50 GB)?
> >> I shall be grateful for this kindness.
> >>
> >>
> >> Thanks & Regards
> >>
> >> Adarsh Sharma
> >>
> >>
> >
> >
> >
> > --
> > Harsh J
> > www.harshj.com
> >
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
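
For anyone who only needs files of a given size before trying GridMix or
PigMix, here is a minimal sketch of generating synthetic text data locally.
It is not MapReduce-based and is not how GridMix works; it is a
single-process stand-in. The function names (generate_records,
write_test_file), the 100-byte line length, and the random lowercase and
digit content are all assumptions made for illustration.

```python
import random
import string


def generate_records(target_bytes, seed=0, line_len=100):
    """Yield newline-terminated ASCII lines until at least target_bytes are produced.

    Hypothetical helper: each line is line_len bytes including the newline.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    alphabet = string.ascii_lowercase + string.digits
    produced = 0
    while produced < target_bytes:
        line = "".join(rng.choices(alphabet, k=line_len - 1)) + "\n"
        produced += len(line)
        yield line


def write_test_file(path, target_bytes, seed=0):
    """Write synthetic records to path and return the number of bytes written."""
    written = 0
    # newline="\n" keeps one byte per line ending on every platform
    with open(path, "w", encoding="ascii", newline="\n") as out:
        for line in generate_records(target_bytes, seed=seed):
            out.write(line)
            written += len(line)
    return written
```

Calling write_test_file("sample.txt", 10 * 1024**3) would aim for roughly
10 GB, rounded up to a whole line. For the larger sizes, a single process
like this will be slow, which is where generating the data with MR itself
becomes attractive.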
