hadoop-hdfs-dev mailing list archives

From Konstantin Shvachko <shv.had...@gmail.com>
Subject Re: [Discuss] Merge federation branch HDFS-1052 into trunk
Date Thu, 28 Apr 2011 05:18:11 GMT
Suresh,
Showing no degradation in performance on a one-node cluster is a good start
for benchmarking.
You still have a dev cluster to run benchmarks, don't you?
--Konstantin

On Wed, Apr 27, 2011 at 2:36 PM, suresh srinivas <srini30005@gmail.com> wrote:

> I ran these tests on my laptop. I would like to use this data to emphasize
> that there is no regression in performance. I am not sure that, with just
> the tests I ran, we could conclude there is a huge gain in performance with
> federation. When our performance test team runs tests at scale, we will get
> a clearer picture.
>
>
>
> On Wed, Apr 27, 2011 at 10:41 AM, Konstantin Boudnik <cos@boudnik.org>
> wrote:
>
> > Interesting... while the read performance has improved only marginally,
> > by <4% (still a good thing), the write performance shows a significantly
> > better improvement of >10%. A very interesting asymmetry, indeed.
> >
> > Suresh, what was the size of the cluster in the testing?
> >  Cos
> >
> > On Wed, Apr 27, 2011 at 10:02, suresh srinivas <srini30005@gmail.com>
> > wrote:
> > > I posted the TestDFSIO comparison with and without federation to
> > HDFS-1052.
> > > Please let me know if it addresses your concern. I am also adding it
> > here:
> > >
> > > TestDFSIO read tests
> > > *Without federation:*
> > > ----- TestDFSIO ----- : read
> > >           Date & time: Wed Apr 27 02:04:24 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 43.62329251162561
> > > Average IO rate mb/sec: 44.619869232177734
> > >  IO rate std deviation: 5.060306158158443
> > >    Test exec time sec: 959.943
> > >
> > > *With federation:*
> > > ----- TestDFSIO ----- : read
> > >           Date & time: Wed Apr 27 02:43:10 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 45.657513857055456
> > > Average IO rate mb/sec: 46.72107696533203
> > >  IO rate std deviation: 5.455125923399539
> > >    Test exec time sec: 924.922
> > >
> > > TestDFSIO write tests
> > > *Without federation:*
> > > ----- TestDFSIO ----- : write
> > >           Date & time: Wed Apr 27 01:47:50 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 35.940755259031015
> > > Average IO rate mb/sec: 38.236236572265625
> > >  IO rate std deviation: 5.929484960036511
> > >    Test exec time sec: 1266.624
> > >
> > > *With federation:*
> > > ----- TestDFSIO ----- : write
> > >           Date & time: Wed Apr 27 02:27:12 PDT 2011
> > >       Number of files: 1000
> > > Total MBytes processed: 30000.0
> > >     Throughput mb/sec: 42.17884674597227
> > > Average IO rate mb/sec: 43.11423873901367
> > >  IO rate std deviation: 5.357057259968647
> > >    Test exec time sec: 1135.298
> > >
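[Editor's note: the asymmetry Cos points out above can be recomputed directly from the posted figures. A minimal sketch (numbers copied from the TestDFSIO output above, rounded to three decimals; the percentages differ depending on whether one compares exec time or throughput):]

```python
# Relative changes between the posted TestDFSIO runs (trunk vs. federation).
def pct_change(before, after):
    """Percent change from 'before' to 'after'."""
    return (after - before) / before * 100.0

# Throughput in MB/s (higher is better).
read_tp  = pct_change(43.623, 45.658)     # read:  roughly +4.7%
write_tp = pct_change(35.941, 42.179)     # write: roughly +17.4%

# Test execution time in seconds (lower is better).
read_t  = pct_change(959.943, 924.922)    # read:  roughly -3.6%
write_t = pct_change(1266.624, 1135.298)  # write: roughly -10.4%

print(f"throughput: read {read_tp:+.1f}%, write {write_tp:+.1f}%")
print(f"exec time:  read {read_t:+.1f}%, write {write_t:+.1f}%")
```

Cos's "<4% read / >10% write" matches the exec-time deltas; by throughput the write-side gain in these runs is closer to 17%.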
> > >
> > > On Tue, Apr 26, 2011 at 11:55 PM, suresh srinivas <srini30005@gmail.com>
> > > wrote:
> > >
> > >> Konstantin,
> > >>
> > >> Could you provide me link to how this was done on a big feature, like
> > say
> > >> append and how benchmark info was captured? I am planning to run dfsio
> > >> tests, btw.
> > >>
> > >> Regards,
> > >> Suresh
> > >>
> > >>
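[Editor's note: for readers unfamiliar with the benchmark mentioned above, a run matching the figures posted earlier in the thread (1000 files, 30000 MB total, i.e. 30 MB per file) would look roughly like the sketch below. The test jar name and path vary by Hadoop version, so this only prints the commands instead of running them; a live cluster is required, and -write must run before -read so the files exist.]

```shell
# Sketch of TestDFSIO invocations matching the posted runs.
# The jar name ("hadoop-test.jar" here) is a version-dependent placeholder.
NR_FILES=1000
TOTAL_MB=30000
FILE_SIZE_MB=$((TOTAL_MB / NR_FILES))   # 30 MB per file

for op in write read; do                # write first, so the read has files
  echo hadoop jar hadoop-test.jar TestDFSIO \
       -${op} -nrFiles ${NR_FILES} -fileSize ${FILE_SIZE_MB}
done
```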
> > >> On Tue, Apr 26, 2011 at 11:34 PM, suresh srinivas <srini30005@gmail.com>
> > >> wrote:
> > >>
> > >>> Konstantin,
> > >>>
> > >>> On Tue, Apr 26, 2011 at 10:26 PM, Konstantin Shvachko
> > >>> <shv.hadoop@gmail.com> wrote:
> > >>>
> > >>>> Suresh, Sanjay.
> > >>>>
> > >>>> 1. I asked for benchmarks many times over the course of different
> > >>>> discussions on the topic.
> > >>>> I don't see any numbers attached to the jira, and I was getting the
> > >>>> same response Doug just got from you, guys, which is "why would the
> > >>>> performance be worse".
> > >>>> And this is not an argument for me.
> > >>>>
> > >>>
> > >>> We had done testing earlier and had found that performance had not
> > >>> degraded. We are waiting for our performance team to publish the
> > >>> official numbers so I can post them to the jira. Unfortunately, they
> > >>> are busy qualifying 2xx releases currently. I will get the perf
> > >>> numbers and post them.
> > >>>
> > >>>
> > >>>>
> > >>>> 2. I assume that merging requires a vote. I am sure people who know
> > >>>> bylaws better than I do will correct me if it is not true.
> > >>>> Did I miss the vote?
> > >>>>
> > >>>
> > >>>
> > >>> As regards voting, since I was not sure about the procedure, I had
> > >>> consulted Owen about it. He had indicated that voting is not
> > >>> necessary. If the right procedure is to call for a vote, I will do
> > >>> so. Owen, any comments?
> > >>>
> > >>>
> > >>>>
> > >>>> It feels like you are rushing this and are not doing what you would
> > >>>> expect others to do in the same position, and what has been done in
> > >>>> the past for such large projects.
> > >>>>
> > >>>
> > >>> I am not trying to rush here or to skip the required procedure. I am
> > >>> not sure what the procedure is. Any pointers to it are appreciated.
> > >>>
> > >>>
> > >>>>
> > >>>> Thanks,
> > >>>> --Konstantin
> > >>>>
> > >>>>
> > >>>> On Tue, Apr 26, 2011 at 9:43 PM, Doug Cutting <cutting@apache.org>
> > >>>> wrote:
> > >>>>
> > >>>> > Suresh, Sanjay,
> > >>>> >
> > >>>> > Thank you very much for addressing my questions.
> > >>>> >
> > >>>> > Cheers,
> > >>>> >
> > >>>> > Doug
> > >>>> >
> > >>>> > On 04/26/2011 10:29 AM, suresh srinivas wrote:
> > >>>> > > Doug,
> > >>>> > >
> > >>>> > >
> > >>>> > >> 1. Can you please describe the significant advantages this
> > >>>> > >> approach has over a symlink-based approach?
> > >>>> > >
> > >>>> > > Federation is complementary to the symlink approach. You could
> > >>>> > > choose to provide an integrated namespace using symlinks.
> > >>>> > > However, client side mount tables seem a better approach for
> > >>>> > > many reasons:
> > >>>> > > # Unlike symbolic links, client side mount tables can choose to
> > >>>> > > go to the right namenode based on configuration. This avoids
> > >>>> > > unnecessary RPCs to the namenodes to discover the target of a
> > >>>> > > symlink.
> > >>>> > > # The unavailability of the namenode where a symbolic link is
> > >>>> > > configured does not affect reaching the symlink target.
> > >>>> > > # Symbolic links need not be configured on every namenode in the
> > >>>> > > cluster, and future changes to symlinks need not be propagated
> > >>>> > > to multiple namenodes. With client side mount tables, this
> > >>>> > > information is in a central configuration.
> > >>>> > >
> > >>>> > > If a deployment still wants to use symbolic links, federation
> > >>>> > > does not preclude it.
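[Editor's note: for concreteness, a client side mount table of the kind described above is configured on the client roughly as below, using the ViewFileSystem mount-table properties developed alongside federation. The cluster name "ClusterX", namenode hosts, and mount points are made-up placeholders, and property names may differ slightly across Hadoop versions.]

```xml
<!-- Client-side mount table sketch (ViewFileSystem).
     All names below are hypothetical. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs://ClusterX</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.ClusterX.link./user</name>
    <value>hdfs://nn1.example.com:8020/user</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.ClusterX.link./data</name>
    <value>hdfs://nn2.example.com:8020/data</value>
  </property>
</configuration>
```

Because the client resolves mount points locally from this file, no namenode RPC is needed to pick the right namespace, which is the first advantage listed above.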
> > >>>> > >
> > >>>> > >> It seems to me that one could run multiple namenodes on
> > >>>> > >> separate boxes and run multiple datanode processes per storage
> > >>>> > >> box
> > >>>> > >
> > >>>> > > There are several advantages to using a single datanode:
> > >>>> > > # When you have a large number of namenodes (say 20), the cost
> > >>>> > > of running separate datanodes, in terms of process resources
> > >>>> > > such as memory, is huge.
> > >>>> > > # The disk I/O management and storage utilization using a
> > >>>> > > single datanode are much better, as it has a complete view of
> > >>>> > > the storage.
> > >>>> > > # In the approach you are proposing, you have several clusters
> > >>>> > > to manage. However, with federation, all datanodes are in a
> > >>>> > > single cluster, with a single configuration, and are
> > >>>> > > operationally easier to manage.
> > >>>> > >
> > >>>> > >> The patch modifies much of the logic of Hadoop's central
> > >>>> > >> component, upon which the performance and reliability of most
> > >>>> > >> other components of the ecosystem depend.
> > >>>> > >
> > >>>> > > That is not true.
> > >>>> > >
> > >>>> > > # The namenode is mostly unchanged in this feature.
> > >>>> > > # Read/write pipelines are unchanged.
> > >>>> > > # The changes are mainly in the datanode:
> > >>>> > > #* The storage, FSDataset, and directory and disk scanners now
> > >>>> > > have another level to incorporate the block pool ID into the
> > >>>> > > hierarchy. This is not a significant change that should cause
> > >>>> > > performance or stability concerns.
> > >>>> > > #* Datanodes use a separate thread per namenode, just like the
> > >>>> > > existing thread that communicates with the namenode.
> > >>>> > >
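[Editor's note: the "another level" mentioned above refers to the datanode storage directory gaining a per-block-pool layer. A rough sketch of the federated on-disk layout follows, with a made-up block pool ID; exact file names vary by release.]

```text
<dfs.data.dir>/current/
  VERSION
  BP-<random>-<nn-ip>-<creation-time>/   <- one subtree per block pool
    current/
      VERSION
      finalized/blk_..., blk_....meta    <- completed block replicas
      rbw/                               <- replicas being written
```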
> > >>>> > >> Can you please tell me how this has been tested beyond unit
> > >>>> > >> tests?
> > >>>> > >
> > >>>> > > As regards testing, we have passed 600+ tests. In Hadoop, these
> > >>>> > > tests are mostly integration tests and not pure unit tests.
> > >>>> > >
> > >>>> > > While these tests have been extensive, we have also been testing
> > >>>> > > this branch for the last 4 months, with QA validation that
> > >>>> > > reflects our production environment. We have found the system to
> > >>>> > > be stable and performing well, and have not found any blockers
> > >>>> > > with the branch so far.
> > >>>> > >
> > >>>> > > HDFS-1052 has been open for more than a year now. I had also
> > >>>> > > sent an email about this merge around 2 months ago. There are 90
> > >>>> > > subtasks that have been worked on over the last couple of months
> > >>>> > > under HDFS-1052. Given that there was enough time to ask these
> > >>>> > > questions, your email a day before I am planning to merge the
> > >>>> > > branch into trunk seems late!
> > >>>> > >
> > >>>> >
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Regards,
> > >>> Suresh
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Regards,
> > >> Suresh
> > >>
> > >>
> > >
> > >
> > > --
> > > Regards,
> > > Suresh
> > >
> >
>
>
>
> --
> Regards,
> Suresh
>
