hadoop-user mailing list archives

From "Adaryl \"Bob\" Wakefield, MBA" <adaryl.wakefi...@hotmail.com>
Subject Re: Huge text file for Hadoop Mapreduce
Date Mon, 07 Jul 2014 23:32:52 GMT
http://www.cs.cmu.edu/~./enron/

Not sure of the uncompressed size, but I'm pretty sure it’s over a gig.

B.

From: navaz 
Sent: Monday, July 07, 2014 6:22 PM
To: user@hadoop.apache.org 
Subject: Huge text file for Hadoop Mapreduce

Hi

I am running basic word count MapReduce code. I have downloaded a file, Gettysburg.txt, which
is 1486 bytes. I have 3 datanodes and the replication factor is set to 3. The data is copied
to all 3 datanodes, but only one map task is running. All other nodes are idle.
I think this is because I have only one block of data, so only a single task runs. I would
like to download a bigger file, say 1 GB, to test the network shuffle performance.
Could you please suggest where I can download a huge text file?
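
The reasoning above (one block of data, therefore one map task) can be illustrated with a rough estimate. This is only a sketch, not Hadoop's actual split logic: it assumes one map task per HDFS block, and the function name `estimate_map_tasks` and the 128 MB and 64 MB block sizes are assumptions chosen for illustration. The real count depends on the cluster's configured block size and on the job's InputFormat.

```python
def estimate_map_tasks(file_size_bytes, block_size_bytes=128 * 1024 * 1024):
    """Rough estimate of map tasks for one file.

    Assumption: one map task per input split and one split per HDFS block.
    The block size default here is illustrative, not read from any cluster.
    """
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division: a partially filled last block still needs a task.
    return -(-file_size_bytes // block_size_bytes)


gettysburg_bytes = 1486       # file size quoted in the message
one_gib_bytes = 1024 ** 3     # the "say 1 GB" file being asked about

print(estimate_map_tasks(gettysburg_bytes))                  # tiny file, single block
print(estimate_map_tasks(one_gib_bytes))                     # assumed 128 MB blocks
print(estimate_map_tasks(one_gib_bytes, 64 * 1024 * 1024))   # assumed 64 MB blocks
```

Under these assumptions the 1486-byte file fits in a single block, so only one map task is scheduled no matter how many datanodes hold a replica. A file of around 1 GB spans several blocks and should give the other nodes work to do.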

Thanks & Regards

Abdul Navaz

 
