hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: HDFS data transfer!
Date Fri, 12 Jun 2009 18:24:24 GMT
What's your replication factor?  What aggregate I/O rates do you see  
in Ganglia?  Is the I/O spiky, or has it plateaued?

We can hit close to network rate (1Gbps) per node locally, and have  
pretty similar hardware.
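
To put the figures in this thread side by side, a quick unit conversion helps. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes), ignores protocol overhead and replication traffic, and the function names are made up for illustration.

```python
def gb_per_minute_to_mbit_per_s(gb_per_minute):
    """Convert a rate in GB/minute to Mbit/s, assuming 1 GB = 10**9 bytes."""
    bytes_per_second = gb_per_minute * 10**9 / 60
    return bytes_per_second * 8 / 10**6


def seconds_to_transfer(size_gb, link_gbit_per_s):
    """Ideal time in seconds to move size_gb over a link, ignoring all overhead."""
    return size_gb * 8 / link_gbit_per_s


# Rate reported further down in the thread: about 1 GB/minute.
observed_mbit = gb_per_minute_to_mbit_per_s(1.0)
print(f"1 GB/minute is roughly {observed_mbit:.0f} Mbit/s")

# Lower bound for a 5 GB file over a single 1 Gbit/s link.
print(f"5 GB at 1 Gbit/s needs at least {seconds_to_transfer(5, 1.0):.0f} s")
```

Under these assumptions, 1 GB/minute works out to about 133 Mbit/s, well below a 1 Gbps link, which is why the replication factor and the Ganglia I/O plots are worth checking.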

Brian

On Jun 12, 2009, at 9:03 AM, Scott wrote:

> I ran the put command on 3 of the nodes simultaneously to copy files  
> that were local on those machines into the hdfs.
>
> Brian Bockelman wrote:
>> What'd you do for the tests?  Was it a single stream or a multiple  
>> stream test?
>>
>> Brian
>>
>> On Jun 12, 2009, at 6:48 AM, Scott wrote:
>>
>>> So is ~ 1GB/minute transfer rate a reasonable performance  
>>> benchmark?  Our test cluster consists of 4 quad core xeon machines  
>>> with 2 non-raided drives each.  My initial tests show a transfer  
>>> rate of around 1GB/minute, and that was slower than I expected it  
>>> to be.
>>>
>>> Thanks,
>>> Scott
>>>
>>>
>>> Brian Bockelman wrote:
>>>> Hey Sugandha,
>>>>
>>>> Transfer rates depend on the quality/quantity of your hardware  
>>>> and the quality of your client disk that is generating the data.   
>>>> I usually say that you should expect near-hardware-bottleneck  
>>>> speeds for an otherwise idle cluster.
>>>>
>>>> There should be no "make it fast" required (though you should  
>>>> review the logs for errors if it's going slow).  I would expect  
>>>> a 5GB file to take around 3-5 minutes to write on our cluster,  
>>>> but it's a well-tuned and operational cluster.
>>>>
>>>> As Todd (I think) mentioned before, we can't help any when you  
>>>> say "I want to make it faster".  You need to provide diagnostic  
>>>> information - logs, Ganglia plots, stack traces, something - that  
>>>> folks can look at.
>>>>
>>>> Brian
>>>>
>>>> On Jun 10, 2009, at 2:25 AM, Sugandha Naolekar wrote:
>>>>
>>>>> But what if I want to make it fast? I want to place the data in
>>>>> HDFS and replicate it in a fraction of a second. Is that possible,
>>>>> and how?
>>>>>
>>>>> On Wed, Jun 10, 2009 at 2:47 PM, kartik saxena <kartik.sxn@gmail.com> wrote:
>>>>>
>>>>>> I would suppose about 2-3 hours. It took me some 2 days to load
>>>>>> a 160 Gb file.
>>>>>> Secura
>>>>>>
>>>>>> On Wed, Jun 10, 2009 at 11:56 AM, Sugandha Naolekar
>>>>>> <sugandha.n87@gmail.com> wrote:
>>>>>>
>>>>>>> Hello!
>>>>>>>
>>>>>>> If I try to transfer a 5GB VDI file from a remote host (not a
>>>>>>> part of the Hadoop cluster) into HDFS, and get it back, how much
>>>>>>> time is it supposed to take?
>>>>>>>
>>>>>>> No map-reduce involved. Simply writing files in and out of HDFS
>>>>>>> through simple Java code (using the APIs).
>>>>>>>
>>>>>>> -- 
>>>>>>> Regards!
>>>>>>> Sugandha
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Regards!
>>>>> Sugandha
>>>>
>>

