hadoop-common-user mailing list archives

From Raj V <rajv...@yahoo.com>
Subject Re: pointing mapred.local.dir to a ramdisk
Date Mon, 03 Oct 2011 18:18:07 GMT
Edward

I understand the size limitations, but for my experiment the ramdisk I have created
is large enough.
I think there will be substantial benefits from putting the intermediate map outputs on a ramdisk,
size permitting of course, but I can't provide any numbers to substantiate my claim, given
that I can't get it to run.

-best regards

Raj
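
The permission questions raised in the quoted replies below can be checked with a small standalone probe. This is a minimal sketch, not Hadoop code: the class name LocalDirCheck and the method findProblems are hypothetical, and the default path /ramdisk1 is taken from the quoted mapred-site.xml. It would need to run as the same user the TaskTracker runs as (the quoted log says "owner as mapred", while the directory is described as owned by hdfs with group hadoop). It also flags subdirectories that cannot be listed, on the assumption that a freshly created ext2 file system contains a root-only lost+found directory; whether that is what triggers the NullPointerException here is not established in this thread.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop.
class LocalDirCheck {

    /** Returns human-readable problems found for dir; an empty list means none were found. */
    static List<String> findProblems(File dir) {
        List<String> problems = new ArrayList<>();
        if (!dir.isDirectory()) {
            problems.add(dir + " is not an existing directory");
            return problems;
        }
        if (!dir.canWrite() || !dir.canExecute()) {
            problems.add(dir + " is not writable/searchable by user "
                    + System.getProperty("user.name"));
        }
        File[] children = dir.listFiles();
        if (children == null) {
            problems.add(dir + " cannot be listed");
            return problems;
        }
        for (File child : children) {
            // File.listFiles() returns null for a directory the current user cannot read.
            if (child.isDirectory() && child.listFiles() == null) {
                problems.add(child + " is a subdirectory that cannot be listed");
            }
        }
        // Create a file and rename it within the directory.
        try {
            File src = File.createTempFile("probe", ".tmp", dir);
            File dst = new File(dir, src.getName() + ".renamed");
            if (src.renameTo(dst)) {
                dst.delete();
            } else {
                problems.add("rename failed inside " + dir);
                src.delete();
            }
        } catch (IOException e) {
            problems.add("cannot create a file in " + dir + ": " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        // Default path is the value from the quoted mapred-site.xml.
        File dir = new File(args.length > 0 ? args[0] : "/ramdisk1");
        List<String> problems = findProblems(dir);
        if (problems.isEmpty()) {
            System.out.println("No problems found for " + dir);
        }
        for (String problem : problems) {
            System.out.println("PROBLEM: " + problem);
        }
    }
}
```

Running it once against the ramdisk mount and once against one of the working directories (for example /hadoop-dsk0/local) should show whether the two differ from the TaskTracker user's point of view.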



>________________________________
>From: Edward Capriolo <edlinuxguru@gmail.com>
>To: common-user@hadoop.apache.org
>Cc: Raj V <rajvish@yahoo.com>
>Sent: Monday, October 3, 2011 10:36 AM
>Subject: Re: pointing mapred.local.dir to a ramdisk
>
>This directory can get very large; in many cases I doubt it would fit on a
>RAM disk.
>
>Also, RAM disks tend to help most with random read/write. Since Hadoop is
>doing mostly linear IO, you may not see a great benefit from the RAM disk.
>
>
>
>On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli <vinodkv@hortonworks.com> wrote:
>
>> Must be related to some kind of permissions problems.
>>
>> It will help if you can paste the corresponding source code for
>> FileUtil.copy(). It is hard to track down across different versions otherwise.
>>
>> Thanks,
>> +Vinod
>>
>>
>> On Mon, Oct 3, 2011 at 9:28 PM, Raj V <rajvish@yahoo.com> wrote:
>>
>> > Eric
>> >
>> > Yes. The owner is hdfs and group is hadoop and the directory is group
>> > writable (775). This is the exact same configuration I have when I use real
>> > disks. But let me give it a try again to see if I overlooked something.
>> > Thanks
>> >
>> > Raj
>> >
>> > >________________________________
>> > >From: Eric Caspole <eric.caspole@amd.com>
>> > >To: common-user@hadoop.apache.org
>> > >Sent: Monday, October 3, 2011 8:44 AM
>> > >Subject: Re: pointing mapred.local.dir to a ramdisk
>> > >
>> > >Are you sure you have chown'd/chmod'd the ramdisk directory to be writeable
>> > >by your hadoop user? I have played with this in the past and it should
>> > >basically work.
>> > >
>> > >
>> > >On Oct 3, 2011, at 10:37 AM, Raj V wrote:
>> > >
>> > >> Sending it to the hadoop mailing list - I think this is a hadoop related
>> > >> problem and not related to Cloudera distribution.
>> > >>
>> > >> Raj
>> > >>
>> > >>
>> > >> ----- Forwarded Message -----
>> > >>> From: Raj V <rajvish@yahoo.com>
>> > >>> To: CDH Users <cdh-user@cloudera.org>
>> > >>> Sent: Friday, September 30, 2011 5:21 PM
>> > >>> Subject: pointing mapred.local.dir to a ramdisk
>> > >>>
>> > >>>
>> > >>> Hi all
>> > >>>
>> > >>>
>> > >>> I have been trying some experiments to improve performance. One of the
>> > >>> experiments involved pointing mapred.local.dir to a RAM disk. To this end I
>> > >>> created a 128MB RAM disk (each of my map outputs is smaller than this), but
>> > >>> I have not been able to get the task tracker to start.
>> > >>>
>> > >>>
>> > >>> I am running CDH3B3 (hadoop-0.20.2+737) and here is the error message
>> > >>> from the task tracker log.
>> > >>>
>> > >>>
>> > >>> Tasktracker logs
>> > >>>
>> > >>>
>> > >>> 2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>> > >>> 2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>> > >>> 2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>> > >>> 2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
>> > >>> 2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
>> > >>> 2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
>> > >>> 2011-09-30 16:50:02,388 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
>> > >>> 2011-09-30 16:50:02,400 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> > >>> 2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as mapred
>> > >>> 2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
>> > >>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
>> > >>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>> > >>>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
>> > >>>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
>> > >>>         at org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
>> > >>>         at org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
>> > >>>         at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
>> > >>>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1351)
>> > >>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)
>> > >>>
>> > >>>
>> > >>> 2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>> > >>> /************************************************************
>> > >>> SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5
>> > >>>
>> > >>>
>> > >>> and here is my mapred-site.xml file
>> > >>>
>> > >>>
>> > >>> <property>
>> > >>>   <name>mapred.local.dir</name>
>> > >>>   <value>/ramdisk1</value>
>> > >>> </property>
>> > >>>
>> > >>>
>> > >>> If I have a regular directory on a regular drive such as below - it
>> > >>> works. If I don't mount the ramdisk - it works.
>> > >>>
>> > >>>
>> > >>> <property>
>> > >>>   <name>mapred.local.dir</name>
>> > >>>   <value>/hadoop-dsk0/local,/hadoop-dsk1/local</value>
>> > >>> </property>
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>> The NullPointerException does not tell me what the error is or how to
>> > >>> fix it.
>> > >>>
>> > >>>
>> > >>> From the logs it looks like some disk-based operation failed, but I can't
>> > >>> guess which. I must also confess that this is the first time I am using an
>> > >>> ext2 file system.
>> > >>>
>> > >>>
>> > >>> Any ideas?
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>> Raj
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >
>> > >
>> > >
>> > >
>> > >
>> >
>>
>
>
>