hadoop-general mailing list archives

From "Tom White" <tom.e.wh...@gmail.com>
Subject Re: [VOTE] Should we create sub-projects for HDFS and Map/Reduce?
Date Fri, 08 Aug 2008 09:08:13 GMT
+1

I was initially concerned about the overhead of having to install
separate packages for each component, but in some ways it will make
things clearer. Folks on the user list often ask how to use HDFS by
itself, or even whether that is possible. Splitting the projects up
would make it clear that HDFS and MapReduce can each be used without
the other (although, of course, they are best used together).
Also, I can see some benefit in having separate configuration files
for HDFS and MapReduce, since it will make them smaller and more
manageable (something like hdfs-(default|site).xml and
mapreduce-(default|site).xml).
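
To sketch what that split could look like, hdfs-site.xml would carry
only the HDFS settings and mapreduce-site.xml only the MapReduce ones.
The property names below are just familiar examples from today's
hadoop-site.xml, not a proposed list, and the values are made up:

<?xml version="1.0"?>
<!-- hdfs-site.xml: HDFS-only overrides -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/var/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

<?xml version="1.0"?>
<!-- mapreduce-site.xml: MapReduce-only overrides -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:8021</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>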

It's not totally clear to me how Core fits into this. It's just a jar
file and doesn't have daemons, so it should be bundled with the
MapReduce and HDFS releases, shouldn't it?

Nigel Daley <ndaley@yahoo-inc.com> wrote:
> How will unit tests be divided?  For instance, will all three have to have
> MiniDFSCluster and other shared test infrastructure?

Today the tests for core, hdfs and mapred are under one source tree
because they are so tightly intertwined. I think the goal should be to
have independent unit tests for each module, as well as integration
tests that verify MapReduce works with HDFS. We should do this even
if we don't split the projects.
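
To make that concrete, here is a rough sketch of the kind of
integration test I have in mind, built on the existing mini-cluster
helpers. I'm writing the constructor signatures and property names
from memory, so treat them as illustrative rather than exact:

import junit.framework.TestCase;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.dfs.MiniDFSCluster;      // shared HDFS test helper
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MiniMRCluster;    // shared MapReduce test helper

public class TestMapReduceOnHdfs extends TestCase {

  public void testJobConfPointsAtDfsAndJobTracker() throws Exception {
    Configuration conf = new Configuration();
    // Bring up a two-datanode DFS and a two-tasktracker MR cluster on it.
    MiniDFSCluster dfs = new MiniDFSCluster(conf, 2, true, null);
    MiniMRCluster mr = null;
    try {
      FileSystem fs = dfs.getFileSystem();
      mr = new MiniMRCluster(2, fs.getUri().toString(), 1);

      // A real test would load input into DFS, submit a job through the
      // MR cluster's JobConf and assert on the output; here we only check
      // that the two mini clusters are wired together.
      JobConf job = mr.createJobConf();
      assertNotNull(job.get("mapred.job.tracker"));
      assertTrue(fs.mkdirs(new Path("/input")));
    } finally {
      if (mr != null) mr.shutdown();
      dfs.shutdown();
    }
  }
}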

Tom
