hadoop-hdfs-user mailing list archives

From Dhruba Borthakur <dhr...@gmail.com>
Subject Re: Profiling HDFS
Date Mon, 03 Jan 2011 05:36:57 GMT
When a datanode dies, any write pipeline that was using that datanode is
affected to some extent. The writer goes through an error-recovery
protocol that can introduce delays in the write pipeline. On the other
hand, write pipelines that do not include the dead datanode should
not be impacted at all.
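The rule above can be sketched in a few lines of Java. This is only an illustration of the membership test Dhruba describes, not HDFS client code; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: a write pipeline needs error recovery
// iff the dead datanode is one of its members. PipelineCheck is a
// hypothetical name, not an HDFS class.
public class PipelineCheck {
    static boolean needsRecovery(List<String> pipeline, String deadNode) {
        return pipeline.contains(deadNode);
    }

    public static void main(String[] args) {
        // Two pipelines with replication factor 2, as in the thread.
        List<String> p1 = Arrays.asList("dn1", "dn2");
        List<String> p2 = Arrays.asList("dn3", "dn4");
        String dead = "dn2";
        System.out.println(needsRecovery(p1, dead)); // true: this writer stalls during recovery
        System.out.println(needsRecovery(p2, dead)); // false: this writer is unaffected
    }
}
```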


On Wed, Dec 29, 2010 at 2:57 AM, Rajat Goel <rajatgoel06@gmail.com> wrote:

> I am opening a new file every 5 minutes: I keep writing to a file for 5
> minutes, then I close the current file and open a new file for writing. My
> block size is 256 MB. Replication factor is 2.
> This is my test scenario: I am using a cluster of 6 machines (1 namenode, 5
> datanodes). On each datanode, I am running two threads (one writing to HDFS
> @ 10 MB/s and the other reading from HDFS @ 20 MB/s). I shut down one of the
> datanodes manually and see that the write threads on the live datanodes are
> no longer able to write @ 10 MB/s to HDFS; write speed becomes slow. The
> problem is that writes on live datanodes get affected by a datanode going dead.
> I suspect this may be due to live nodes trying to replicate their blocks
> to the dead datanode. I see java.io exceptions on the terminals of the live
> datanodes reporting a bad ack from the dead machine.
> Can you please tell us how exactly writes and replication behave when
> a datanode goes down?
> Regards,
> Rajat
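The arithmetic behind this scenario is worth spelling out. The sketch below (a hypothetical class, not part of Hadoop) computes the per-datanode write load from the numbers quoted in the thread: each writer's stream is written `replication` times, and that total is spread across the live datanodes.

```java
// Back-of-the-envelope estimate for the quoted scenario. Numbers come
// from the thread (10 MB/s writers, replication 2, 5 datanodes);
// LoadEstimate is a hypothetical name, not an HDFS class.
public class LoadEstimate {
    // Per-datanode block-write load: writers * rate * replication,
    // spread evenly over the live datanodes.
    static double perNodeWriteMBs(int writers, double rateMBs,
                                  int replication, int liveNodes) {
        return writers * rateMBs * replication / liveNodes;
    }

    public static void main(String[] args) {
        // All 5 nodes up, 5 writers @ 10 MB/s, replication 2:
        System.out.println(perNodeWriteMBs(5, 10.0, 2, 5)); // 20.0 MB/s per node
        // One node dies; its writer thread dies with it, so 4 writers on 4 nodes:
        System.out.println(perNodeWriteMBs(4, 10.0, 2, 4)); // 20.0 MB/s per node
        // Steady-state load per node is unchanged, which suggests the observed
        // slowdown comes from pipeline error recovery plus the re-replication
        // traffic the namenode schedules for blocks that lost a replica,
        // rather than from raw write load alone.
    }
}
```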
> On Wed, Dec 29, 2010 at 11:17 AM, Dhruba Borthakur <dhruba@gmail.com> wrote:
>> How frequently do you open new files to write? Or do you continue to write
>> to the same file(s) for the entire duration of the test? What is your block
>> size? Can you please elaborate on your test workload?
>> On Tue, Dec 28, 2010 at 9:45 PM, Rajat Goel <rajatgoel06@gmail.com> wrote:
>>> Hi,
>>> I want to measure read/write rates to HDFS under various conditions, such
>>> as heavy load or one datanode going down. Is there a profiler already
>>> available for this purpose?
>>> I am pushing data at a high rate to HDFS, reads are happening in
>>> parallel, and I suddenly reboot one datanode. I observe that I am no longer
>>> able to write to HDFS (from the live datanodes) at the same high rate. This
>>> lasts for a while (around 30 minutes), after which things go back to
>>> normal again. I want to find out why HDFS becomes slow, what the main
>>> contributor to this latency is, and whether I can improve this behavior by
>>> changing some configuration parameters.
>>> Thanks & Regards,
>>> Rajat
>> --
>> Connect to me at http://www.facebook.com/dhruba

Connect to me at http://www.facebook.com/dhruba
