hadoop-common-user mailing list archives

From Uygar BAYAR <uy...@beriltech.com>
Subject Re: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
Date Thu, 04 Oct 2007 06:42:25 GMT

Hi,
It is not the namenode; there is only a single segment. Before the parse
step, the fetch part was reduced by a factor of 10.
Here are the call stack and the files to be parsed. Sorry for the long log.

/user/nutch/sirketce/crawled/segments/20071002163239/content   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00000   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00000/data   <r 3>   334429747
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00000/index   <r 3>   14916
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00001   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00001/data   <r 3>   327920464
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00001/index   <r 3>   14930
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00002   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00002/data   <r 3>   329962280
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00002/index   <r 3>   14980
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00003   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00003/data   <r 3>   328364139
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00003/index   <r 3>   14724
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00004   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00004/data   <r 3>   327625845
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00004/index   <r 3>   14762
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00005   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00005/data   <r 3>   328455639
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00005/index   <r 3>   14889
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00006   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00006/data   <r 3>   331291187
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00006/index   <r 3>   14660
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00007   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00007/data   <r 3>   323871321
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00007/index   <r 3>   14681
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00008   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00008/data   <r 3>   327993727
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00008/index   <r 3>   14898
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00009   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00009/data   <r 3>   323695463
/user/nutch/sirketce/crawled/segments/20071002163239/content/part-00009/index   <r 3>   14656
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00000   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00000/data   <r 3>   8797532
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00000/index   <r 3>   14508
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00001   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00001/data   <r 3>   8759847
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00001/index   <r 3>   14527
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00002   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00002/data   <r 3>   8766600
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00002/index   <r 3>   14583
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00003   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00003/data   <r 3>   8787659
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00003/index   <r 3>   14313
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00004   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00004/data   <r 3>   8740838
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00004/index   <r 3>   14352
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00005   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00005/data   <r 3>   8736991
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00005/index   <r 3>   14476
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00006   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00006/data   <r 3>   8672715
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00006/index   <r 3>   14265
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00007   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00007/data   <r 3>   8695395
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00007/index   <r 3>   14301
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00008   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00008/data   <r 3>   8737508
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00008/index   <r 3>   14483
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00009   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00009/data   <r 3>   8705316
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_fetch/part-00009/index   <r 3>   14243
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate   <dir>
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00000   <r 3>   8396806
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00001   <r 3>   14459204
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00002   <r 3>   6889290
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00003   <r 3>   5811612
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00004   <r 3>   7906811
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00005   <r 3>   6508687
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00006   <r 3>   6424363
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00007   <r 3>   5835119
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00008   <r 3>   6605622
/user/nutch/sirketce/crawled/segments/20071002163239/crawl_generate/part-00009   <r 3>   5693232
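
To see how the bytes in a listing like the one above are spread across the segment's subdirectories (content, crawl_fetch, crawl_generate), the sizes can be totalled with a short script. This is only a rough sketch: the function name bytes_per_subdir and the regular expression are illustrative assumptions, not part of Hadoop or Nutch, and it assumes each file entry has the form "path <r N> size", possibly wrapped over two lines.

```python
import re
from collections import defaultdict

# Hypothetical helper, not a Hadoop/Nutch API.
# Matches "path <r N> size"; \s+ also spans a line break, so entries
# that were wrapped onto two lines are still recognised.
ENTRY = re.compile(r"(/\S+)\s+<r \d+>\s+(\d+)")


def bytes_per_subdir(listing, segment):
    """Sum file sizes per first-level subdirectory of `segment`."""
    totals = defaultdict(int)
    for path, size in ENTRY.findall(listing):
        if not path.startswith(segment + "/"):
            continue  # ignore files outside the segment
        relative = path[len(segment) + 1:]
        totals[relative.split("/", 1)[0]] += int(size)
    return dict(totals)
```

Called with the listing text and the segment path /user/nutch/sirketce/crawled/segments/20071002163239, it returns one byte total per subdirectory; <dir> entries carry no size and are skipped.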
 

task_0002_m_000070_0: log4j:ERROR setFile(null,true) call failed.
task_0002_m_000070_0: java.io.FileNotFoundException: /home/nutch/crawler1/logs (Is a directory)
task_0002_m_000070_0:   at java.io.FileOutputStream.openAppend(Native Method)
task_0002_m_000070_0:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
task_0002_m_000070_0:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
task_0002_m_000070_0:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
task_0002_m_000070_0:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
task_0002_m_000070_0:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
task_0002_m_000070_0:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
task_0002_m_000070_0:   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:132)
task_0002_m_000070_0:   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:96)
task_0002_m_000070_0:   at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:654)
task_0002_m_000070_0:   at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:612)
task_0002_m_000070_0:   at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:509)
task_0002_m_000070_0:   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:415)
task_0002_m_000070_0:   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:441)
task_0002_m_000070_0:   at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:468)
task_0002_m_000070_0:   at org.apache.log4j.LogManager.<clinit>(LogManager.java:122)
task_0002_m_000070_0:   at org.apache.log4j.Logger.getLogger(Logger.java:104)
task_0002_m_000070_0:   at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
task_0002_m_000070_0:   at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
task_0002_m_000070_0:   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
task_0002_m_000070_0:   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
task_0002_m_000070_0:   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
task_0002_m_000070_0:   at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
task_0002_m_000070_0:   at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
task_0002_m_000070_0:   at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
task_0002_m_000070_0:   at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:370)
task_0002_m_000070_0:   at org.apache.hadoop.mapred.TaskTracker.<clinit>(TaskTracker.java:84)
task_0002_m_000070_0:   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1685)
task_0002_m_000070_0: log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
task_0002_m_000070_1: log4j:ERROR setFile(null,true) call failed.
task_0002_m_000070_1: java.io.FileNotFoundException: /home/nutch/crawler1/logs (Is a directory)
task_0002_m_000070_1:   at java.io.FileOutputStream.openAppend(Native Method)
task_0002_m_000070_1:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
task_0002_m_000070_1:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
task_0002_m_000070_1:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
task_0002_m_000070_1:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
task_0002_m_000070_1:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
task_0002_m_000070_1:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
task_0002_m_000070_1:   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:132)
task_0002_m_000070_1:   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:96)
task_0002_m_000070_1:   at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:654)
task_0002_m_000070_1:   at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:612)
task_0002_m_000070_1:   at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:509)
task_0002_m_000070_1:   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:415)
task_0002_m_000070_1:   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:441)
task_0002_m_000070_1:   at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:468)
task_0002_m_000070_1:   at org.apache.log4j.LogManager.<clinit>(LogManager.java:122)
task_0002_m_000070_1:   at org.apache.log4j.Logger.getLogger(Logger.java:104)
task_0002_m_000070_1:   at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
task_0002_m_000070_1:   at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
task_0002_m_000070_1:   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
task_0002_m_000070_1:   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
task_0002_m_000070_1:   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
task_0002_m_000070_1:   at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
task_0002_m_000070_1:   at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
task_0002_m_000070_1:   at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
task_0002_m_000070_1:   at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:370)
task_0002_m_000070_1:   at org.apache.hadoop.mapred.TaskTracker.<clinit>(TaskTracker.java:84)
task_0002_m_000070_1:   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1685)
task_0002_m_000070_1: log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
task_0002_m_000070_2: log4j:ERROR setFile(null,true) call failed.
task_0002_m_000070_2: java.io.FileNotFoundException: /home/nutch/crawler1/logs (Is a directory)
task_0002_m_000070_2:   at java.io.FileOutputStream.openAppend(Native Method)
task_0002_m_000070_2:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
task_0002_m_000070_2:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
task_0002_m_000070_2:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
task_0002_m_000070_2:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
task_0002_m_000070_2:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
task_0002_m_000070_2:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
task_0002_m_000070_2:   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:132)
task_0002_m_000070_2:   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:96)
task_0002_m_000070_2:   at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:654)
task_0002_m_000070_2:   at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:612)
task_0002_m_000070_2:   at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:509)
task_0002_m_000070_2:   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:415)
task_0002_m_000070_2:   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:441)
task_0002_m_000070_2:   at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:468)
task_0002_m_000070_2:   at org.apache.log4j.LogManager.<clinit>(LogManager.java:122)
task_0002_m_000070_2:   at org.apache.log4j.Logger.getLogger(Logger.java:104)
task_0002_m_000070_2:   at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
task_0002_m_000070_2:   at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
task_0002_m_000070_2:   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
task_0002_m_000070_2:   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
task_0002_m_000070_2:   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
task_0002_m_000070_2:   at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
task_0002_m_000070_2:   at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
task_0002_m_000070_2:   at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
task_0002_m_000070_2:   at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:370)
task_0002_m_000070_2:   at org.apache.hadoop.mapred.TaskTracker.<clinit>(TaskTracker.java:84)
task_0002_m_000070_2:   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1685)
task_0002_m_000070_2: log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
        at org.apache.nutch.parse.ParseSegment.parse(ParseSegment.java:131)
        at org.apache.nutch.parse.ParseSegment.main(ParseSegment.java:149)
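
The output above repeats the same log4j stack for three attempts of one map task, so it can help to strip the frames and look only at the messages each attempt printed. In the log as pasted, those messages are the log4j setFile failure and the FileNotFoundException for /home/nutch/crawler1/logs; no OutOfMemoryError frame appears in it. The sketch below is an illustration only: messages_by_attempt and both regular expressions are hypothetical names of mine, and it assumes every task line starts with "task_<job>_<m|r>_<task>_<attempt>:" and that an unprefixed line is the wrapped tail of the line before it.

```python
import re

# Hypothetical helper for reading the pasted console output.
PREFIX = re.compile(r"^(task_\d+_[mr]_\d+_\d+):\s*(.*)$")
# A stack frame: "at" followed by a qualified method name and "(".
FRAME = re.compile(r"^at\s*[\w.$<>]+\(")


def messages_by_attempt(log_text):
    """Map each task attempt id to the non-frame messages it printed."""
    entries = []        # [attempt_id, text] pairs, wrapped lines rejoined
    open_entry = False  # True while unprefixed lines may continue an entry
    for raw in log_text.splitlines():
        line = raw.strip()
        match = PREFIX.match(line)
        if match:
            entries.append([match.group(1), match.group(2)])
            open_entry = True
        elif open_entry and line and not line.startswith("Exception in thread"):
            entries[-1][1] = (entries[-1][1] + " " + line).strip()
        else:
            open_entry = False  # blank line or the client-side exception
    result = {}
    for attempt, text in entries:
        if text and not FRAME.match(text):
            result.setdefault(attempt, []).append(text)
    return result
```

Run over the log above, this should leave three short message lists, one per attempt, which makes it easier to check whether the attempts differ at all.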


Konstantin Shvachko wrote:
> 
> Hi
> Could you also send a call stack? It is not clear which component is out 
> of memory.
> If it is the name-node, then you should check how many files, dirs, and 
> blocks there are at the time of failure.
> If your crawl generates a lot of small files, that could be the case.
> Let us know.
> --Konstantin
> 
> 
> Uygar BAYAR wrote:
> 
>>Hi,
>>We have a 4-machine cluster (dual-core 3.20GHz CPU, 2GB RAM, 400GB disk). We
>>use nutch 0.9 and hadoop 0.13.1. We are trying to crawl the web (60K sites)
>>to depth 5. When we reached the parse of the 4th segment, it failed with
>>java.lang.OutOfMemoryError: Requested array size exceeds VM limit on each
>>machine. Our segment size:
>>crawled/segments/20071002163239        3472754178
>>I tried several map/reduce configurations, but nothing changed
>>(400-50; 300-15; 50-15; 100-15; 200-35).
>>I also set the heap size in hadoop-env and the nutch script to 2000M.

-- 
View this message in context: http://www.nabble.com/java.lang.OutOfMemoryError%3A-Requested-array-size-exceeds-VM-limit-tf4562352.html#a13033518
Sent from the Hadoop Users mailing list archive at Nabble.com.

