hadoop-common-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: pointing mapred.local.dir to a ramdisk
Date Mon, 03 Oct 2011 18:49:05 GMT
Raj,

I just tried this on my CDH3u1 VM, and the ramdisk worked the first
time, so it's possible you've hit a bug in CDH3b3 that was later
fixed. Can you enable debug logging in log4j.properties and then
repost your task tracker log? The debug output may include details
that will help.
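For example, a log4j.properties fragment along these lines should do it (the logger names below are an assumption taken from the classes in your stack trace; adjust as needed):

```properties
# Turn on debug logging for the classes that appear in the stack trace.
log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
log4j.logger.org.apache.hadoop.util.MRAsyncDiskService=DEBUG
```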

-Joey

On Mon, Oct 3, 2011 at 2:18 PM, Raj V <rajvish@yahoo.com> wrote:
> Edward
>
> I understand the size limitations, but for my experiment the ramdisk I have created is large enough.
> I think there will be substantial benefits to putting the intermediate map outputs on a ramdisk (size permitting, of course), but I can't provide any numbers to substantiate my claim given that I can't get it to run.
>
> -best regards
>
> Raj
>
>
>
>>________________________________
>>From: Edward Capriolo <edlinuxguru@gmail.com>
>>To: common-user@hadoop.apache.org
>>Cc: Raj V <rajvish@yahoo.com>
>>Sent: Monday, October 3, 2011 10:36 AM
>>Subject: Re: pointing mapred.local.dir to a ramdisk
>>
>>This directory can get very large, in many cases I doubt it would fit on a
>>ram disk.
>>
>>Also, RAM disks tend to help most with random read/write; since Hadoop does
>>mostly sequential I/O, you may not see a great benefit from the RAM disk.
>>
>>
>>
>>On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli <vinodkv@hortonworks.com> wrote:
>>
>>> This must be some kind of permissions problem.
>>>
>>> It will help if you can paste the corresponding source code for
>>> FileUtil.copy(); it's hard to track down with the different versions otherwise.
>>>
>>> Thanks,
>>> +Vinod
>>>
>>>
>>> On Mon, Oct 3, 2011 at 9:28 PM, Raj V <rajvish@yahoo.com> wrote:
>>>
>>> > Eric
>>> >
>>> > Yes. The owner is hdfs and group is hadoop, and the directory is group
>>> > writable (775). This is the exact same configuration I have when I use
>>> > real disks. But let me give it a try again to see if I overlooked something.
>>> > Thanks
>>> >
>>> > Raj
>>> >
>>> > >________________________________
>>> > >From: Eric Caspole <eric.caspole@amd.com>
>>> > >To: common-user@hadoop.apache.org
>>> > >Sent: Monday, October 3, 2011 8:44 AM
>>> > >Subject: Re: pointing mapred.local.dir to a ramdisk
>>> > >
>>> > >Are you sure you have chown'd/chmod'd the ramdisk directory to be
>>> > >writable by your hadoop user? I have played with this in the past and it
>>> > >should basically work.
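To make the suggestion above concrete, a minimal sketch (the /ramdisk1 mount point and 128MB size are from this thread; the tmpfs options and mapred:hadoop ownership are assumptions to adjust for your cluster):

```shell
# Needs root; run once per node. Mount point and size per this thread.
#   mount -t tmpfs -o size=128m tmpfs /ramdisk1
#   chown mapred:hadoop /ramdisk1   # use the user the TaskTracker runs as
#   chmod 775 /ramdisk1
# The permission check below runs against a throwaway directory so it
# can be tried without root; point it at /ramdisk1 on a real node.
dir=$(mktemp -d)
chmod 775 "$dir"
stat -c '%a' "$dir"    # prints 775 on Linux
rm -rf "$dir"
```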
>>> > >
>>> > >
>>> > >On Oct 3, 2011, at 10:37 AM, Raj V wrote:
>>> > >
>>> > >> Sending it to the hadoop mailing list - I think this is a hadoop-related
>>> > >> problem, not one specific to the Cloudera distribution.
>>> > >>
>>> > >> Raj
>>> > >>
>>> > >>
>>> > >> ----- Forwarded Message -----
>>> > >>> From: Raj V <rajvish@yahoo.com>
>>> > >>> To: CDH Users <cdh-user@cloudera.org>
>>> > >>> Sent: Friday, September 30, 2011 5:21 PM
>>> > >>> Subject: pointing mapred.local.dir to a ramdisk
>>> > >>>
>>> > >>>
>>> > >>> Hi all
>>> > >>>
>>> > >>>
>>> > >>> I have been trying some experiments to improve performance. One of
>>> > >>> the experiments involved pointing mapred.local.dir to a RAM disk. To
>>> > >>> this end I created a 128MB RAM disk (each of my map outputs is smaller
>>> > >>> than this), but I have not been able to get the task tracker to start.
>>> > >>>
>>> > >>>
>>> > >>> I am running CDH3B3 (hadoop-0.20.2+737) and here is the error message
>>> > >>> from the task tracker log.
>>> > >>>
>>> > >>>
>>> > >>> Tasktracker logs
>>> > >>>
>>> > >>>
>>> > >>> 2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>> > >>> 2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>> > >>> 2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>> > >>> 2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
>>> > >>> 2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
>>> > >>> 2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
>>> > >>> 2011-09-30 16:50:02,388 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
>>> > >>> 2011-09-30 16:50:02,400 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> > >>> 2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as mapred
>>> > >>> 2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
>>> > >>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
>>> > >>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>>> > >>>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
>>> > >>>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
>>> > >>>         at org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
>>> > >>>         at org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
>>> > >>>         at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
>>> > >>>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1351)
>>> > >>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)
>>> > >>>
>>> > >>>
>>> > >>> 2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>> > >>> /************************************************************
>>> > >>> SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5
>>> > >>>
>>> > >>>
>>> > >>> and here is my mapred-site.xml file
>>> > >>>
>>> > >>>
>>> > >>> <property>
>>> > >>>     <name>mapred.local.dir</name>
>>> > >>>     <value>/ramdisk1</value>
>>> > >>>   </property>
>>> > >>>
>>> > >>>
>>> > >>> If I have a regular directory on a regular drive, such as below, it
>>> > >>> works. If I don't mount the ramdisk, it works.
>>> > >>>
>>> > >>>
>>> > >>> <property>
>>> > >>>     <name>mapred.local.dir</name>
>>> > >>>     <value>/hadoop-dsk0/local,/hadoop-dsk1/local</value>
>>> > >>>   </property>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> The NullPointerException does not tell me what the error is or how
>>> > >>> to fix it.
>>> > >>>
>>> > >>>
>>> > >>> From the logs it looks like some disk-based operation failed, but I
>>> > >>> can't guess which. I must also confess that this is the first time I
>>> > >>> am using an ext2 file system.
>>> > >>>
>>> > >>>
>>> > >>> Any ideas?
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> Raj
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> >
>>>
>>
>>
>>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
