nutch-user mailing list archives

From Chance Callahan <chance1calla...@gmail.com>
Subject Re: FATAL fetcher.Fetcher: Fetcher: java.lang.NullPointerException
Date Wed, 20 Jul 2011 18:47:00 GMT
I am now having a new issue:

2011-07-20 18:45:54,480 INFO   server Copying
/user/hdfs/nutch-1.4.jar->/tmp/jobsub-0pXrwu/work/tmp.jar
2011-07-20 18:45:54,852 INFO   server all_clusters:
[<hadoop.job_tracker.LiveJobTracker object at 0x94201ec>,
<hadoop.fs.hadoopfs.HadoopFileSystem object at 0x92ab7ec>]
2011-07-20 18:45:54,852 INFO   server Starting
['/usr/lib/hadoop/bin/hadoop', '--config',
'/etc/alternatives/hadoop-0.20-conf/', 'jar', 'tmp.jar',
'org.apache.nutch.crawl.Injector', 'inject', '/user/hdfs/urls',
'/user/hdfs/crawl/crawldb', '-conf',
'/user/hdfs/conf/nutch-site.xml'].  (Env: {'HADOOP_CLASSPATH':
'/usr/share/hue/apps/jobsub/src/jobsub/../../java-lib/trace.jar:/usr/share/hue/desktop/libs/hadoop/src/hadoop/../../static-group-mapping/java-lib/static-group-mapping-1.2.0.jar',
'HUE_JOBTRACE_LOG': '/tmp/jobsub-0pXrwu/jobs', 'HUE_JOBSUB_USER':
'hdfs', 'HADOOP_OPTS':
'-javaagent:/usr/share/hue/ext/thirdparty/java/aspectj-1.6.5/aspectjweaver.jar
-Dhue.suffix=-via-hue -Duser.name=hdfs', 'HUE_JOBSUB_GROUPS': 'hdfs',
'HADOOP_HOME': '/usr/lib/hadoop'})
2011-07-20 18:45:54,852 INFO   server Running:
/usr/lib/hadoop/bin/hadoop --config
/etc/alternatives/hadoop-0.20-conf/ jar tmp.jar
org.apache.nutch.crawl.Injector inject /user/hdfs/urls
/user/hdfs/crawl/crawldb -conf /user/hdfs/conf/nutch-site.xml
11/07/20 14:46:02 INFO crawl.Injector: Injector: starting at 2011-07-20 14:46:02
11/07/20 14:46:02 INFO crawl.Injector: Injector: crawlDb: inject
11/07/20 14:46:02 INFO crawl.Injector: Injector: urlDir: /user/hdfs/urls
11/07/20 14:46:02 INFO crawl.Injector: Injector: Converting injected
urls to crawl db entries.
11/07/20 14:46:03 INFO security.UgiFixer: Hue UGI fixer aspect loaded.
11/07/20 14:46:08 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
11/07/20 14:46:08 INFO security.JobClientTracer: Hue job submission
aspect loaded.
11/07/20 14:46:08 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/07/20 14:46:08 INFO mapred.FileInputFormat: Total input paths to process : 1
11/07/20 14:46:10 INFO mapred.JobClient: Running job: job_local_0001
11/07/20 14:46:11 INFO mapred.JobClient:  map 0% reduce 0%
11/07/20 14:46:12 INFO mapred.MapTask: numReduceTasks: 1
11/07/20 14:46:12 INFO mapred.MapTask: io.sort.mb = 100
11/07/20 14:46:13 INFO mapred.MapTask: data buffer = 79691776/99614720
11/07/20 14:46:13 INFO mapred.MapTask: record buffer = 262144/327680
11/07/20 14:46:13 WARN plugin.PluginRepository: Plugins: directory not
found: plugins
11/07/20 14:46:13 INFO plugin.PluginRepository: Plugin Auto-activation
mode: [true]
11/07/20 14:46:13 INFO plugin.PluginRepository: Registered Plugins:
11/07/20 14:46:13 INFO plugin.PluginRepository: 	NONE
11/07/20 14:46:13 INFO plugin.PluginRepository: Registered Extension-Points:
11/07/20 14:46:13 INFO plugin.PluginRepository: 	NONE
11/07/20 14:46:13 WARN mapred.LocalJobRunner: job_local_0001
java.lang.RuntimeException: Error in configuring object
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:386)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
	... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
	at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
	... 10 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
	... 13 more
Caused by: java.lang.RuntimeException: x point
org.apache.nutch.net.URLNormalizer not found.
	at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:122)
	at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
	... 18 more
11/07/20 14:46:14 INFO mapred.JobClient: Job complete: job_local_0001
11/07/20 14:46:14 INFO mapred.JobClient: Counters: 0
11/07/20 14:46:14 INFO mapred.JobClient: Job Failed: NA
11/07/20 14:46:14 FATAL crawl.Injector: Injector: java.io.IOException:
Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
	at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
	at org.apache.nutch.crawl.Injector.run(Injector.java:248)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.crawl.Injector.main(Injector.java:238)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

On Wed, Jul 20, 2011 at 3:41 AM, Julien Nioche
<lists.digitalpebble@gmail.com> wrote:
> This has been fixed recently. Check out 1.4 from SVN; it lives in a separate
> branch and is NOT in the trunk.
>
>
> On 20 July 2011 02:58, Chance Callahan <chance1callahan@gmail.com> wrote:
>
>> Whenever I start Nutch, I get the following error:
>>
>> 2011-07-20 01:40:49,744 INFO   server Copying
>> /user/hdfs/bin/nutch-1.2.jar->/tmp/jobsub-eAdLAn/work/tmp.jar
>> 2011-07-20 01:40:50,179 INFO   server all_clusters:
>> [<hadoop.job_tracker.LiveJobTracker object at 0x94237ec>,
>> <hadoop.fs.hadoopfs.HadoopFileSystem object at 0x92ab7ec>]
>> 2011-07-20 01:40:50,179 INFO   server Starting
>> ['/usr/lib/hadoop/bin/hadoop', '--config',
>> '/etc/alternatives/hadoop-0.20-conf/', 'jar', 'tmp.jar',
>> 'org.apache.nutch.fetcher.Fetcher', '-conf',
>> '/nutch-1.2/conf/nutch-site.xml',
>> '-Dplugin.folders=/nutch-1.2/plugins/',
>> '/nutch-1.2/urlsdir/seeds.txt', '-dir', 'crawldb/crawl'].  (Env:
>> {'HADOOP_CLASSPATH':
>>
>> '/usr/share/hue/apps/jobsub/src/jobsub/../../java-lib/trace.jar:/usr/share/hue/desktop/libs/hadoop/src/hadoop/../../static-group-mapping/java-lib/static-group-mapping-1.2.0.jar',
>> 'HUE_JOBTRACE_LOG': '/tmp/jobsub-eAdLAn/jobs', 'HUE_JOBSUB_USER':
>> 'hdfs', 'HADOOP_OPTS':
>>
>> '-javaagent:/usr/share/hue/ext/thirdparty/java/aspectj-1.6.5/aspectjweaver.jar
>> -Dhue.suffix=-via-hue -Duser.name=hdfs', 'HUE_JOBSUB_GROUPS': 'hdfs',
>> 'HADOOP_HOME': '/usr/lib/hadoop'})
>> 2011-07-20 01:40:50,179 INFO   server Running:
>> /usr/lib/hadoop/bin/hadoop --config
>> /etc/alternatives/hadoop-0.20-conf/ jar tmp.jar
>> org.apache.nutch.fetcher.Fetcher -conf /nutch-1.2/conf/nutch-site.xml
>> -Dplugin.folders=/nutch-1.2/plugins/ /nutch-1.2/urlsdir/seeds.txt -dir
>> crawldb/crawl
>> 11/07/19 21:40:57 WARN fetcher.Fetcher: Fetcher: Your
>> 'http.agent.name' value should be listed first in 'http.robots.agents'
>> property.
>> 11/07/19 21:40:57 INFO fetcher.Fetcher: Fetcher: starting at 2011-07-19
>> 21:40:57
>> 11/07/19 21:40:57 INFO fetcher.Fetcher: Fetcher: segment:
>> /nutch-1.2/urlsdir/seeds.txt
>> 11/07/19 21:40:58 INFO security.UgiFixer: Hue UGI fixer aspect loaded.
>> 11/07/19 21:41:03 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> processName=JobTracker, sessionId=
>> 11/07/19 21:41:03 INFO security.JobClientTracer: Hue job submission
>> aspect loaded.
>> 11/07/19 21:41:03 INFO util.NativeCodeLoader: Loaded the native-hadoop
>> library
>> 11/07/19 21:41:03 INFO mapred.JobClient: Cleaning up the staging area
>> file:/tmp/hadoop-hdfs/mapred/staging/hdfs1105342640/.staging/job_local_0001
>> 11/07/19 21:41:03 FATAL fetcher.Fetcher: Fetcher:
>> java.lang.NullPointerException
>>       at
>> org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:49)
>>       at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:874)
>>       at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>       at
>> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
>>       at
>> org.apache.hadoop.mapred.JobClient.submitJobInternal_aroundBody0(JobClient.java:807)
>>       at
>> org.apache.hadoop.mapred.JobClient$AjcClosure1.run(JobClient.java:1)
>>       at
>> org.apache.hadoop.security.JobClientTrace.ajc$around$org_apache_hadoop_security_JobClientTrace$1$b9879daproceed(JobClientTrace.aj:1)
>>       at
>> org.apache.hadoop.security.JobClientTrace.ajc$around$org_apache_hadoop_security_JobClientTrace$1$b9879da(JobClientTrace.aj:33)
>>       at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
>>       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
>>       at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1107)
>>       at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1145)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>       at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1116)
>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>       at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>       at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>       at java.lang.reflect.Method.invoke(Method.java:597)
>>       at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>>
>> Any ideas what it means and how to fix it?
>>
>> Thanks,
>> Chance Callahan
>> KD0MXN
>>
>
>
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
>
