hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: What OS?
Date Sun, 16 Aug 2009 15:48:08 GMT
On Sun, Aug 16, 2009 at 7:50 AM, Bogdan M. Maryniuk <bogdan.maryniuk@gmail.com> wrote:
> On Sat, Aug 15, 2009 at 5:55 AM, Tom Wheeler<tomwheel@gmail.com> wrote:
>> I'd expect performance between either OS on the same hardware to be
>> pretty similar, but it's always hard to speculate on performance. The
>> best option would be for you to do a proof of concept with a couple of
>> machines so you can gauge what performance would be like based on the
>> actual jobs you'll be running.
> That's what I basically said before. :-)
> My few cents in this conversation: personally I go with Solaris instead
> of Linux for other reasons: ZFS, self-healing, zones, a better TCP/IP
> stack, better Sun Java, its overall stability, etc. Performance is not
> actually the primary point. I bet more on stability and manageability,
> which I find much more sophisticated on OpenSolaris than on Linux
> (although OpenSolaris has lots of quite ugly things too)...
> Although, recent changes in OpenSolaris (e.g. new memory management)
> only prove more and more that my decision to drop Linux was damn
> right. :-)
> --
> Kind regards, BM
> Things, that are stupid at the beginning, rarely ends up wisely.

Two more cents from me. Picking the project's target platform is often
a safe bet. For example, say you want to use the fuse-dfs front end.
If you choose the same platform as the majority of the community, you
can usually either find a binary package or be relatively confident
that the install will go smoothly.

Now, a quick retort to this thinking is "Hadoop is open source, so it
should build on every platform". That is true, with a wrinkle or two.
Suppose you want to start using the fuse front end for the DFS and
your OS is, say, FreeBSD. You are entering uncharted waters. You might
hit some minor incompatibility, like a difference between make and
GMake, and you might have to start patching scripts, patching code, or
opening a Jira and asking for help. It could be anything from a quick
fix to a tricky one. Whereas someone who installed on a better-tested
platform might have got it running out of the box and moved on to
bigger and better things, like actually using fuse-dfs.

A quick example of this: our cluster is CentOS 5. Someone hit me with a
requirement to be able to kick off jobs from a node running FreeBSD.
When I tried to kick off a job using the compression libraries, it
failed, most likely because I had to use a ported JVM that is not
exactly identical to the Sun JVM, or maybe because of something in the
native libraries. My quick fix was to turn off compression. I am
probably the ONLY person on the internet trying to do this. It could
take hours or days of research for me to figure out what is going on
here. (I do have better things to do.)

So even though you can probably run a cluster with FreeBSD or Windows
ME, you are definitely making more work for yourself, and you are on an
island if you have an issue.
