hadoop-hdfs-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: HDFS Corruption: How to Troubleshoot or Determine Root Cause?
Date Thu, 19 May 2011 00:08:16 GMT
On Wed, May 18, 2011 at 4:55 PM, Aaron Eng <aeng@maprtech.com> wrote:
>>Most of the contributors are big picture types who would look at "small"
>> usability issues like this and scoff about "newbies".
> P.S. This is speaking from the newbie perspective, it was not meant as a
> slight to contributors in any way.  Just a comment on the steep learning
> curve of picking up Hadoop.

Hi Aaron,

I'm sorry you feel this way about the Hadoop contributors. It's
definitely a mistake we've made in the past, but we are trying our
best to improve things. For the last two Wednesdays we have held
hackathons at the Cloudera offices and gotten lots of new people on
board, working mostly on small fixes like this.

If you have some specific issues you'd like to point out, please file
JIRAs. I'll be sure to take a look.

-Todd

>
>
> On Wed, May 18, 2011 at 4:54 PM, Aaron Eng <aeng@maprtech.com> wrote:
>>
>> Hey Tim,
>>
>> Hope everything is good with you.  Looks like you're having some fun with
>> hadoop.
>>
>> >Can anyone enlighten me? Why is dfs.*.dir default to /tmp a good idea?
>> It's not a good idea, it's just how it defaults.  You'll find hundreds or
>> probably thousands of these quirks as you work with Apache/Cloudera hadoop
>> distributions.  Never trust the defaults.
>>
>> > submitted a JIRA
>> That's the way to do it.
>>
>> >which appears to have been resolved ... but it does feel somewhat
>> > dissatisfying, since by the time you see the WARNING your cluster is already
>> > useless/dead.
>> And that's why, if it's relevant to you, your best bet is to resolve the
>> JIRA yourself.  Most of the contributors are big picture types who would
>> look at "small" usability issues like this and scoff about "newbies".  Of
>> course, by the time you're familiar enough with Hadoop and comfortable
>> enough to fix your own JIRAs, you might also join the ranks of jaded
>> contributors who scoff at usability issues logged by newbies.
>>
>> Case in point, I noted a while ago that when you run the namenode -format
>> command, it only accepts a capital Y (or lower case, can't remember), and it
>> fails silently if you give the wrong case.  I didn't particularly care
>> enough to fix it, having already learned my lesson.  You'll find lots of
>> these rough edges throughout Hadoop; it is not a user-friendly, out-of-the-box
>> enterprise-ready product.
>>
>>
>>
>> On Wed, May 18, 2011 at 4:41 PM, Time Less <timelessness@gmail.com> wrote:
>>>
>>> Can anyone enlighten me? Why is dfs.*.dir default to /tmp a good idea?
>>> I'd rather, in order of preference, have the following behaviours if
>>> dfs.*.dir are undefined:
>>>
>>> 1. Daemons log errors and fail to start at all,
>>> 2. Daemons start but default to /var/db/hadoop (or any persistent location),
>>>    meanwhile logging in huge screaming all-caps letters that it's picked a
>>>    default which may not be optimal,
>>> 3. Daemons start botnet and DDOS random government websites, wait 36 hours,
>>>    then phone the FBI and blame administrator for it*,
>>> 4. Daemons write "persistent" data into /tmp without any great fanfare,
>>>    allowing a sense of complacency in its victims, only to report at a random
>>>    time in the future that everything is corrupted beyond repair, ie current
>>>    behaviour.
>>>
>>> I submitted a JIRA (which appears to have been resolved, yay!) to at
>>> least add verbiage to the WARNING letting you know why you've irreversibly
>>> corrupted your cluster, but it does feel somewhat dissatisfying, since by
>>> the time you see the WARNING your cluster is already useless/dead.
>>>
>>>> It's not quite what you're asking for, but your NameNode's web interface
>>>> should provide a merged dump of all the relevant config settings,
>>>> including comments indicating the name of the config file where the
>>>> setting was defined, at the /conf path.
>>>
>>> Cool, though it looks like that's just the NameNode's config, right? Not
>>> the DataNode's config, which is the component corrupting data due to this
>>> default?
>>>
>>> --
>>> Tim Ellis
>>> Riot Games
>>> * Hello, FBI, #3 was a joke. I wish #4 was a joke, too.
>>>
>>
>
>



-- 
Todd Lipcon
Software Engineer, Cloudera
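
For anyone bitten by the dfs.*.dir default discussed above, one way to
catch it before starting the daemons is to scan the site config for
storage directories that are unset or live under /tmp. The following is
only a rough sketch, not part of Hadoop: the property names dfs.name.dir
and dfs.data.dir are assumed (they vary by release), the function names
and sample config are made up for illustration, and values written with
variable substitution such as ${hadoop.tmp.dir} are not expanded.

```python
import xml.etree.ElementTree as ET

# Assumed property names; adjust them for the release you run.
DFS_DIR_PROPERTIES = ("dfs.name.dir", "dfs.data.dir")


def read_properties(xml_text):
    """Return {name: value} from a Hadoop-style <configuration> document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def risky_dfs_dirs(xml_text, names=DFS_DIR_PROPERTIES):
    """List (property, reason) pairs for dirs that are unset or under /tmp."""
    props = read_properties(xml_text)
    problems = []
    for name in names:
        value = props.get(name)
        if not value:
            problems.append((name, "unset, so the default applies"))
            continue
        # A value may hold several comma-separated directories.
        for path in value.split(","):
            path = path.strip()
            if path == "/tmp" or path.startswith("/tmp/"):
                problems.append((name, "points at " + path))
    return problems


# Hypothetical site config used only to exercise the check.
SAMPLE = """<configuration>
  <property><name>dfs.name.dir</name><value>/var/db/hadoop/name</value></property>
  <property><name>dfs.data.dir</name><value>/data/1/dfs,/tmp/dfs/data</value></property>
</configuration>"""

for prop_name, reason in risky_dfs_dirs(SAMPLE):
    print(prop_name + ": " + reason)
```

Since the DataNodes carry their own copy of the config, a check like this
would need to run against the file on each node, not just the NameNode.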
