hadoop-hdfs-user mailing list archives

From Rajat Goel <rajatgoe...@gmail.com>
Subject Re: Profiling HDFS
Date Wed, 29 Dec 2010 10:57:17 GMT
I open a new file every 5 minutes. I write to a file for 5 minutes, then
close it and open a new one for writing. My block size is 256 MB and the
replication factor is 2.

This is my test scenario: I am using a cluster of 6 machines (1 namenode, 5
datanodes). On each datanode, I am running two threads (one writing to HDFS
at 10 MB/s and the other reading from HDFS at 20 MB/s). I shut down one of the
datanodes manually and see that the write threads on the live datanodes can no
longer write to HDFS at 10 MB/s; the write speed drops. The problem is that
writes on the live datanodes are affected by a datanode going dead.
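
To put numbers on this workload, here is a minimal sketch of the arithmetic
for a single writer, using only the figures given above (10 MB/s, a new file
every 5 minutes, 256 MB blocks, replication factor 2). The function name
`file_profile` is made up for illustration; it is not part of any Hadoop API.

```python
import math

# Figures taken from the test description; nothing here talks to HDFS.
BLOCK_SIZE_MB = 256        # configured block size
REPLICATION = 2            # replication factor
WRITE_RATE_MB_S = 10       # target rate of one writer thread
FILE_WINDOW_S = 5 * 60     # a new file is opened every 5 minutes


def file_profile(rate_mb_s, window_s, block_mb, replication):
    """Return (file size in MB, blocks per file, MB stored across replicas)."""
    file_mb = rate_mb_s * window_s
    blocks = math.ceil(file_mb / block_mb)   # the last block may be partial
    stored_mb = file_mb * replication
    return file_mb, blocks, stored_mb


file_mb, blocks, stored_mb = file_profile(
    WRITE_RATE_MB_S, FILE_WINDOW_S, BLOCK_SIZE_MB, REPLICATION)
print(f"{file_mb} MB per file, {blocks} blocks, {stored_mb} MB stored")
# prints: 3000 MB per file, 12 blocks, 6000 MB stored
```

At 10 MB/s a 256 MB block fills in about 25.6 seconds, so each writer starts
a new block roughly that often while a file is open.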

I suspect that this may be because the live nodes are trying to replicate
their blocks to the dead datanode. I see java.io exceptions on the terminals
of the live datanodes reporting a bad ack from the dead machine.
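
One way to check this suspicion is to record the achieved write rate per time
window and line it up with the moment the datanode is shut down. The sketch
below is only an illustration: it writes to an in-memory stream as a stand-in
for an HDFS output stream, and the names `timed_write` and
`throughput_by_window` are hypothetical helpers, not an existing profiler.

```python
import io
import time


def timed_write(stream, chunk, total_chunks, clock=time.monotonic):
    """Write chunks to stream; return (elapsed_s, cumulative_bytes, error) samples."""
    samples = []
    start = clock()
    written = 0
    for _ in range(total_chunks):
        error = None
        try:
            stream.write(chunk)
            written += len(chunk)
        except OSError as exc:
            # Assumption: a real client would surface pipeline failures here.
            error = repr(exc)
        samples.append((clock() - start, written, error))
    return samples


def throughput_by_window(samples, window_s):
    """Group samples into fixed windows and return MB/s for each window index."""
    bytes_per_window = {}
    prev_bytes = 0
    for elapsed, total_bytes, _error in samples:
        window = int(elapsed // window_s)
        delta = total_bytes - prev_bytes
        bytes_per_window[window] = bytes_per_window.get(window, 0) + delta
        prev_bytes = total_bytes
    return {w: b / (1024 * 1024) / window_s
            for w, b in sorted(bytes_per_window.items())}


# Stand-in run against an in-memory buffer instead of HDFS.
samples = timed_write(io.BytesIO(), b"x" * 1024, 10)
rates = throughput_by_window(samples, window_s=1)
```

Windows in which the rate drops, or in which samples carry an error, can then
be compared against the datanode shutdown time and the exceptions seen on the
live nodes.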

Can you please tell us exactly how writes and replication behave when a
datanode goes down?

Regards,
Rajat

On Wed, Dec 29, 2010 at 11:17 AM, Dhruba Borthakur <dhruba@gmail.com> wrote:

> How frequently do you open new files to write? Or do you continue to write
> to the same file(s) for the entire duration of the test? What is your block
> size? Can you please elaborate on your test workload?
>
>
> On Tue, Dec 28, 2010 at 9:45 PM, Rajat Goel <rajatgoel06@gmail.com> wrote:
>
>> Hi,
>>
>> I want to measure read/write rates to HDFS under various conditions, such
>> as under heavy load or when one datanode goes down. Is there a profiler
>> already available for this purpose?
>>
>> I am pushing data to HDFS at a high rate, reads are also happening in
>> parallel, and I suddenly reboot one datanode. I observe that I am no longer
>> able to write to HDFS (from the live datanodes) at the same high rate. This
>> lasts for a while (around 30 minutes), after which things go back to
>> normal. I want to find out why HDFS becomes slow, what the main
>> contributor to this latency is, and whether I can improve this behavior by
>> changing some configuration parameters.
>>
>> Thanks & Regards,
>> Rajat
>>
>
>
>
> --
> Connect to me at http://www.facebook.com/dhruba
>
