hadoop-hdfs-dev mailing list archives

From Hairong <kuang.hair...@gmail.com>
Subject Re: [Discuss] Merge federation branch HDFS-1052 into trunk
Date Wed, 27 Apr 2011 17:46:51 GMT
Nice performance data! The federation branch definitely adds code
complexity to HDFS, but this is a long-awaited feature that improves
HDFS scalability and is a step toward separating namespace management
from storage management. I am for merging this to trunk.

Hairong

On 4/27/11 10:02 AM, "suresh srinivas" <srini30005@gmail.com> wrote:

>I posted the TestDFSIO comparison with and without federation to
>HDFS-1052.
>Please let me know if it addresses your concern. I am also adding it here:
>
>TestDFSIO read tests
>*Without federation:*
>----- TestDFSIO ----- : read
>           Date & time: Wed Apr 27 02:04:24 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 43.62329251162561
>Average IO rate mb/sec: 44.619869232177734
> IO rate std deviation: 5.060306158158443
>    Test exec time sec: 959.943
>
>*With federation:*
>----- TestDFSIO ----- : read
>           Date & time: Wed Apr 27 02:43:10 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 45.657513857055456
>Average IO rate mb/sec: 46.72107696533203
> IO rate std deviation: 5.455125923399539
>    Test exec time sec: 924.922
>
>TestDFSIO write tests
>*Without federation:*
>----- TestDFSIO ----- : write
>           Date & time: Wed Apr 27 01:47:50 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 35.940755259031015
>Average IO rate mb/sec: 38.236236572265625
> IO rate std deviation: 5.929484960036511
>    Test exec time sec: 1266.624
>
>*With federation:*
>----- TestDFSIO ----- : write
>           Date & time: Wed Apr 27 02:27:12 PDT 2011
>       Number of files: 1000
>Total MBytes processed: 30000.0
>     Throughput mb/sec: 42.17884674597227
>Average IO rate mb/sec: 43.11423873901367
> IO rate std deviation: 5.357057259968647
>    Test exec time sec: 1135.298
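As a quick sanity check, the relative changes implied by the results above can be computed directly; a minimal Python sketch (all figures are copied from the TestDFSIO output posted above):

```python
# Relative change (percent) between the "without" and "with" federation runs
# above; a negative exec-time change means the federation run was faster.
def pct_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100.0

print(f"read exec time:   {pct_change(959.943, 924.922):+.1f}%")    # ~ -3.6%
print(f"write exec time:  {pct_change(1266.624, 1135.298):+.1f}%")  # ~ -10.4%
print(f"read throughput:  {pct_change(43.62329251162561, 45.657513857055456):+.1f}%")  # ~ +4.7%
print(f"write throughput: {pct_change(35.940755259031015, 42.17884674597227):+.1f}%")  # ~ +17.4%
```

So on this run federation was not only no slower, it came out ahead on every metric, though single runs of TestDFSIO carry noticeable variance (note the ~5 MB/sec standard deviations reported above).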
>
>
>On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas
><srini30005@gmail.com> wrote:
>
>> Konstantin,
>>
>> Could you provide me a link to how this was done for a big feature,
>> like say append, and how benchmark info was captured? I am planning
>> to run dfsio tests, btw.
>>
>> Regards,
>> Suresh
>>
>>
>> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas
>> <srini30005@gmail.com> wrote:
>>
>>> Konstantin,
>>>
>>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko <
>>> shv.hadoop@gmail.com> wrote:
>>>
>>>> Suresh, Sanjay.
>>>>
>>>> 1. I asked for benchmarks many times over the course of different
>>>> discussions on the topic.
>>>> I don't see any numbers attached to the jira, and I was getting the
>>>> same response Doug just got from you guys, which is "why would the
>>>> performance be worse?"
>>>> And this is not an argument for me.
>>>>
>>>
>>> We had done testing earlier and had found that performance had not
>>> degraded. We are waiting for our performance team to publish the
>>> official numbers so we can post them to the jira. Unfortunately they
>>> are busy qualifying 2xx releases currently. I will get the perf
>>> numbers and post them.
>>>
>>>
>>>>
>>>> 2. I assume that merging requires a vote. I am sure people who know
>>>> bylaws
>>>> better than I do will correct me if it is not true.
>>>> Did I miss the vote?
>>>>
>>>
>>>
>>> As regards voting, since I was not sure about the procedure, I had
>>> consulted Owen about it. He had indicated that voting is not
>>> necessary. If the right procedure is to call for a vote, I will do
>>> so. Owen, any comments?
>>>
>>>
>>>>
>>>> It feels like you are rushing this and are not doing what you
>>>> would expect others to do in the same position, and what has been
>>>> done in the past for such large projects.
>>>>
>>>
>>> I am not trying to rush here or skip any required procedure. I am
>>> just not sure what the procedure is. Any pointers to it are
>>> appreciated.
>>>
>>>
>>>>
>>>> Thanks,
>>>> --Konstantin
>>>>
>>>>
>>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cutting@apache.org>
>>>> wrote:
>>>>
>>>> > Suresh, Sanjay,
>>>> >
>>>> > Thank you very much for addressing my questions.
>>>> >
>>>> > Cheers,
>>>> >
>>>> > Doug
>>>> >
>>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
>>>> > > Doug,
>>>> > >
>>>> > >
>>>> > >> 1. Can you please describe the significant advantages this
>>>> > >> approach has over a symlink-based approach?
>>>> > >
>>>> > > Federation is complementary to the symlink approach. You could
>>>> > > choose to provide an integrated namespace using symlinks.
>>>> > > However, client-side mount tables seem a better approach for
>>>> > > many reasons:
>>>> > > # Unlike symbolic links, client-side mount tables can go
>>>> > > directly to the right namenode based on configuration. This
>>>> > > avoids unnecessary RPCs to the namenodes to discover the target
>>>> > > of a symlink.
>>>> > > # The unavailability of a namenode where a symbolic link is
>>>> > > configured does not affect reaching the symlink target.
>>>> > > # Symbolic links need not be configured on every namenode in
>>>> > > the cluster, and future changes to symlinks need not be
>>>> > > propagated to multiple namenodes: with client-side mount tables,
>>>> > > this information is in a central configuration.
>>>> > >
>>>> > > If a deployment still wants to use symbolic links, federation
>>>> > > does not preclude it.
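For illustration, a client-side mount table as later shipped in Hadoop's ViewFS (HDFS-1053) is configured roughly like this; the property names are real ViewFS configuration keys, but the hostnames and mount points below are hypothetical:

```xml
<!-- core-site.xml sketch: clients resolve paths through a central mount
     table, so /user and /data map to different namenodes without any
     symlink-resolution RPCs to a namenode. -->
<property>
  <name>fs.defaultFS</name>
  <value>viewfs:///</value>
</property>
<property>
  <!-- hypothetical mount point -> namenode mapping -->
  <name>fs.viewfs.mounttable.default.link./user</name>
  <value>hdfs://nn1.example.com:8020/user</value>
</property>
<property>
  <name>fs.viewfs.mounttable.default.link./data</name>
  <value>hdfs://nn2.example.com:8020/data</value>
</property>
```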
>>>> > >
>>>> > >> It seems to me that one could run multiple namenodes on
>>>> > >> separate boxes and run multiple datanode processes per storage
>>>> > >> box
>>>> > >
>>>> > > There are several advantages to using a single datanode:
>>>> > > # When you have a large number of namenodes (say 20), the cost
>>>> > > of running separate datanodes in terms of process resources
>>>> > > such as memory is huge.
>>>> > > # Disk I/O management and storage utilization are much better
>>>> > > with a single datanode, as it has a complete view of the
>>>> > > storage.
>>>> > > # In the approach you are proposing, you have several clusters
>>>> > > to manage. With federation, all datanodes are in a single
>>>> > > cluster, with a single configuration, which is operationally
>>>> > > easier to manage.
>>>> > >
>>>> > >> The patch modifies much of the logic of Hadoop's central
>>>> > >> component, upon which the performance and reliability of most
>>>> > >> other components of the ecosystem depend.
>>>> > > That is not true.
>>>> > >
>>>> > > # The namenode is mostly unchanged in this feature.
>>>> > > # Read/write pipelines are unchanged.
>>>> > > # The changes are mainly in the datanode:
>>>> > > #* The storage, FSDataset, directory and disk scanners now have
>>>> > > another level to incorporate the block pool ID into the
>>>> > > hierarchy. This is not a significant change that should cause
>>>> > > performance or stability concerns.
>>>> > > #* Datanodes use a separate thread per NN, just like the
>>>> > > existing thread that communicates with the NN.
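The per-NN thread model described above can be sketched schematically; this is an illustrative Python analogue (the class and function names here are hypothetical stand-ins, not the actual datanode code), showing one independent service thread per configured namenode inside a single datanode process:

```python
import threading

class BlockPoolService(threading.Thread):
    """Hypothetical stand-in for the datanode's per-namenode thread:
    each one registers with its namenode and heartbeats independently."""
    def __init__(self, nn_address, log):
        super().__init__()
        self.nn_address = nn_address
        self.log = log

    def run(self):
        # The real thread would register, then loop sending heartbeats and
        # block reports for its block pool; here we just record one step.
        self.log.append(f"registered with {self.nn_address}")

def start_datanode(namenode_addresses):
    """One service thread per namenode, all in a single datanode process."""
    log = []
    threads = [BlockPoolService(nn, log) for nn in namenode_addresses]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # one NN's thread stalling does not stop the others
    return log

print(sorted(start_datanode(["nn1:8020", "nn2:8020"])))
```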
>>>> > >
>>>> > >> Can you please tell me how this has been tested beyond unit
>>>> > >> tests?
>>>> > > As regards testing, we have passed 600+ tests. In Hadoop, these
>>>> > > tests are mostly integration tests and not pure unit tests.
>>>> > >
>>>> > > While these tests have been extensive, we have also been
>>>> > > testing this branch for the last 4 months, with QA validation
>>>> > > that reflects our production environment. We have found the
>>>> > > system to be stable and performing well, and have not found any
>>>> > > blockers with the branch so far.
>>>> > >
>>>> > > HDFS-1052 has been open for more than a year now. I had also
>>>> > > sent an email about this merge around 2 months ago. There are
>>>> > > 90 subtasks that have been worked on over the last couple of
>>>> > > months under HDFS-1052. Given that there was enough time to ask
>>>> > > these questions, your email a day before I am planning to merge
>>>> > > the branch into trunk seems late!
>>>> > >
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Suresh
>>>
>>>
>>
>>
>> --
>> Regards,
>> Suresh
>>
>>
>
>
>-- 
>Regards,
>Suresh


