hbase-user mailing list archives

From shashwat shriparv <dwivedishash...@gmail.com>
Subject Re: hbase map reduce is taking a lot of time
Date Tue, 17 Jul 2012 18:48:50 GMT
Hi Syed,

The problem is with disk space: MapReduce keeps its intermediate results on
the local disk, so check that you have enough free space there. Also make
sure the tmp directory has been cleared and is writable. Provide more space
and try again, or try with a smaller number of users and check whether it
works.
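
For example, here is a minimal sketch (paths are illustrative) of checking
how much space and write access the MapReduce local directories actually
have, and of the property that controls where spill files go:

    import java.io.File;
    import org.apache.hadoop.conf.Configuration;

    public class SpillSpaceCheck {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Map-side spill files land under mapred.local.dir, which
            // defaults to ${hadoop.tmp.dir}/mapred/local.
            String dirs = conf.get("mapred.local.dir", "/tmp/hadoop/mapred/local");
            for (String d : dirs.split(",")) {
                File f = new File(d.trim());
                System.out.printf("%s  free: %.2f GB  writable: %b%n",
                        d.trim(), f.getUsableSpace() / (1024.0 * 1024 * 1024),
                        f.canWrite());
            }
            // To move spills to a bigger disk, set mapred.local.dir in
            // mapred-site.xml on every TaskTracker node.
        }
    }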

Regards

∞
Shashwat Shriparv



On Tue, Jul 17, 2012 at 11:50 AM, syed kather <in.abdul@gmail.com> wrote:

> Team,
>      I wrote a MapReduce program. The scenario of my program is to emit
> <userid, seqid> pairs.
>
>    Total no. of users:  825
>    Total no. of seqids: 6583100
>
>    Number of <userid, seqid> pairs the program will emit: 825 * 6583100
>
>   I have an HBase table called ObjectSequence, which consists of 6583100 rows.
>
> I used TableMapper and TableReducer for my MapReduce program.
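
For reference, a minimal sketch of that kind of job setup. The class names,
the per-user loop, and the omitted reducer side are illustrative guesses at
the shape of the program, not the poster's actual code:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    public class UserSeqJob {

        // Emits one <userid, seqid> pair per (user, row) combination,
        // i.e. 825 * 6583100 pairs in total for the numbers above.
        static class UserSeqMapper extends TableMapper<Text, Text> {
            private static final int NUM_USERS = 825;

            @Override
            protected void map(ImmutableBytesWritable row, Result value,
                               Context context) throws IOException, InterruptedException {
                String seqId = Bytes.toString(row.get());
                for (int userId = 0; userId < NUM_USERS; userId++) {
                    context.write(new Text(String.valueOf(userId)), new Text(seqId));
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = new Job(conf, "user-seq");
            job.setJarByClass(UserSeqJob.class);

            Scan scan = new Scan();
            scan.setCaching(500);        // batch rows per RPC instead of one at a time
            scan.setCacheBlocks(false);  // don't churn the block cache from a full scan

            TableMapReduceUtil.initTableMapperJob(
                    "ObjectSequence", scan, UserSeqMapper.class,
                    Text.class, Text.class, job);
            // TableMapReduceUtil.initTableReducerJob(...) for the TableReducer
            // side would go here.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }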
>
>
>  Problem definition:
>
> Processor: i7
> Replication Factor: 1
> Live Datanodes: 3
>
>   Node      Last Contact  Admin State  Configured     Used  Non DFS    Remaining  Used   Remaining  Blocks
>                                        Capacity (GB)  (GB)  Used (GB)  (GB)       (%)    (%)
>   chethan   1             In Service   28.59          0.62  25.17      2.82       2.11   9.87       73
>   shashwat  2             In Service   28.98          0.87  22.01      6.13       3.00   21.04      69
>   syed      0             In Service   28.98          4.29  18.37      6.32       14.80  21.82      129
> When I run the balancer in Hadoop, I have seen that blocks are not equally
> distributed. Can I know what the reason for this may be?
>
>
>   Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
>   map     85.71%      7          0        1        6         0       3 / 1
>   reduce  28.57%      1          0        1        0         0       0 / 0
> I have seen that only 8 tasks are allocated in total. Is there any
> possibility to increase the number of map tasks?
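
With TableInputFormat, the number of map tasks equals the number of regions
in the input table, so 7 map tasks suggests ObjectSequence currently has 7
regions (the eighth task is the single reduce). One way to get more map
tasks is to split the table into more regions, e.g. (a hedged sketch; it
asks the master to split each region at its midpoint):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class SplitObjectSequence {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            // Each resulting region becomes one additional map task
            // the next time the TableMapper job runs.
            admin.split("ObjectSequence");
            admin.close();
        }
    }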
>
> Completed Tasks:
>
>   Task                             Complete  Status                    Start Time            Finish Time (Duration)                Counters
>   task_201207121836_0007_m_000001  100.00%   UserID: 777 SEQID:415794  12-Jul-2012 21:35:48  12-Jul-2012 21:36:12 (24sec)          16
>   task_201207121836_0007_m_000002  100.00%   UserID: 777 SEQID:422256  12-Jul-2012 21:35:50  12-Jul-2012 21:36:47 (57sec)          16
>   task_201207121836_0007_m_000003  100.00%   UserID: 777 SEQID:563544  12-Jul-2012 21:35:50  12-Jul-2012 22:00:08 (24mins, 17sec)  16
>   task_201207121836_0007_m_000004  100.00%   UserID: 777 SEQID:592918  12-Jul-2012 21:35:50  12-Jul-2012 21:42:09 (6mins, 18sec)   16
>   task_201207121836_0007_m_000005  100.00%   UserID: 777 SEQID:618121  12-Jul-2012 21:35:50  12-Jul-2012 21:44:34 (8mins, 43sec)   16
>   task_201207121836_0007_m_000006  100.00%   UserID: 777 SEQID:685810  12-Jul-2012 21:36:12  12-Jul-2012 21:44:18 (8mins, 6sec)    16
> Why is the last map task taking nearly 2 hours? Please give me some
> suggestions on how to optimize it.
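
Part of the answer is in the errors below: the spill file names
(output/spill712.out, output/spill934.out) show a single map task writing
many hundreds of sorted spill files to local disk, all of which must later
be merged before the task can finish. A hedged sketch of the classic
Hadoop 1.x knobs for reducing the spill count (values are illustrative and
must still fit inside the task heap set by mapred.child.java.opts):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class SpillTuning {
        public static Configuration tunedConf() {
            Configuration conf = HBaseConfiguration.create();
            conf.setInt("io.sort.mb", 200);                // larger sort buffer => fewer spills
            conf.setFloat("io.sort.spill.percent", 0.90f); // fill the buffer further before spilling
            conf.setInt("io.sort.factor", 50);             // merge more spill files per merge pass
            return conf;
        }
    }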
>
> Running task:
>
>   Task                             Complete  Status                   Start Time
>   task_201207121836_0007_m_000000  0.00%     UserID: 482 SEQID:99596  12-Jul-2012 21:35:48
>
> Errors:
>
> java.io.IOException: Spill failed
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
>         at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:416)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill712.out
>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
>         at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
>
> java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "/bin/ls": java.io.IOException: error=12, Cannot allocate memory
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:475)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
>         at org.apache.hadoop.util.Shell.run(Shell.java:182)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
>         at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
>         at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
>         at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:703)
>         at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:443)
>         at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
>         at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
>         at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:416)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:164)
>         at java.lang.ProcessImpl.start(ProcessImpl.java:81)
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:468)
>         ... 15 more
>
>         at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:468)
>         at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
>         at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
>         at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:416)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> java.io.IOException: Spill failed
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
>         at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:416)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill934.out
>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
>         at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
>
>
>
> I have seen this error for the last task. What may be the reason for this error?
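
Two distinct failures are visible above, and both fit the disk/memory
picture. The DiskChecker$DiskErrorException ("Could not find any valid local
directory for output/spill712.out") means none of the directories listed in
mapred.local.dir had room for the next spill file. The "error=12, Cannot
allocate memory" is the task JVM failing to fork /bin/ls while truncating
logs: fork() briefly needs as much virtual memory as the parent JVM, so
oversized heaps on a memory-tight node can hit this. A hedged sketch of
settings that relieve the pressure (these normally belong in mapred-site.xml
on each TaskTracker; they appear in code here only to name the properties,
and the values are guesses to be sized against the node's actual RAM):

    import org.apache.hadoop.conf.Configuration;

    public class MemoryTuning {
        public static void apply(Configuration conf) {
            // Fewer concurrent task JVMs per node => less total heap in use.
            conf.setInt("mapred.tasktracker.map.tasks.maximum", 2);
            // Smaller per-task heap => smaller fork() footprint for /bin/ls.
            conf.set("mapred.child.java.opts", "-Xmx400m");
        }
    }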
>
> NOTE: When I run an import of the HBase table, it takes 10 minutes.
>
>
> Team, please give suggestions on what can be done to solve these issues.
>
>
>             Thanks and Regards,
>         S SYED ABDUL KATHER
>



-- 


∞
Shashwat Shriparv
