nutch-dev mailing list archives

From "Julien Nioche (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)
Date Thu, 25 Aug 2011 06:30:29 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090811#comment-13090811
] 

Julien Nioche commented on NUTCH-937:
-------------------------------------

Markus,

The param plugin.folders is multivalued, so you can keep the existing value (plugins) as well
as add the new one:

<property>
  <name>plugin.folders</name>
  <value>${job.local.dir}/../jars/plugins,plugins</value>
</property>

I think it would be good to commit this (once tested, of course), as it will make Nutch easier
to use with CDH and other distributions.
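Since plugin.folders is multivalued, the job can list both the unpacked-jar location and the plain "plugins" directory and run in either environment. A toy sketch of that resolution logic (a hypothetical helper, not Nutch's actual PluginRepository code) looks like this:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class PluginFolders {
    // Hypothetical helper: split a comma-separated plugin.folders value and
    // keep only the entries that exist on disk. Nutch's real plugin loader
    // does more work; the multivalued parsing is the point illustrated here.
    static List<String> existingFolders(String pluginFolders) {
        List<String> found = new ArrayList<>();
        for (String folder : pluginFolders.split(",")) {
            String trimmed = folder.trim();
            if (new File(trimmed).isDirectory()) {
                found.add(trimmed);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // "." always exists; a missing first entry is simply skipped,
        // so listing both paths keeps the job working either way.
        System.out.println(existingFolders("/nonexistent/jars/plugins,."));
        // prints [.]
    }
}
```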

> When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce
will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-937
>                 URL: https://issues.apache.org/jira/browse/NUTCH-937
>             Project: Nutch
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 1.2
>         Environment: hadoop 0.21 or cloudera hadoop 0.20.2+737
>            Reporter: Claudio Martella
>            Assignee: Markus Jelsma
>             Fix For: 1.4, 2.0
>
>
> Jobs running on hadoop 0.21 or cloudera cdh 0.20.2+737 will fail because of missing
plugins, e.g.:
> 10/10/28 12:22:21 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 10/10/28 12:22:22 INFO mapred.FileInputFormat: Total input paths to process : 1
> 10/10/28 12:22:23 INFO mapred.JobClient: Running job: job_201010271826_0002
> 10/10/28 12:22:24 INFO mapred.JobClient:  map 0% reduce 0%
> 10/10/28 12:22:39 INFO mapred.JobClient: Task Id : attempt_201010271826_0002_m_000000_0, Status : FAILED
> java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:379)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:317)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>     at org.apache.hadoop.mapred.Child.main(Child.java:211)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>     ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 17 more
> Caused by: java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
>     at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:122)
>     at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
>     ... 22 more
> 10/10/28 12:22:40 INFO mapred.JobClient: Task Id : attempt_201010271826_0002_m_000001_0, Status : FAILED
> java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:379)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:317)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>     at org.apache.hadoop.mapred.Child.main(Child.java:211)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>     ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 17 more
> Caused by: java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
>     at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:122)
>     at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
>     ... 22 more
> The bug is due to MAPREDUCE-967 (part of hadoop 0.21 and cdh 0.20.2+737), which changes
the way MapReduce unpacks the job's jar: previously the whole jar was unpacked, but now only
classes/ and lib/ are, so Nutch's plugins/ directory goes missing.
> A workaround is to force unpacking of the plugins/ directory by setting the 'mapreduce.job.jar.unpack.pattern'
configuration to "(?:classes/|lib/|plugins/).*"
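The workaround described above would go into the cluster's mapred-site.xml (or be passed per job with -D); a minimal sketch, using the property name and pattern quoted in the description:

```
<!-- mapred-site.xml: restore unpacking of plugins/ from the job jar,
     in addition to the classes/ and lib/ defaults. -->
<property>
  <name>mapreduce.job.jar.unpack.pattern</name>
  <value>(?:classes/|lib/|plugins/).*</value>
</property>
```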

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
