hadoop-mapreduce-user mailing list archives

From Martin Häger <martin.ha...@byburt.com>
Subject Re: "Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable" using MultipleInputs (multiple mappers)
Date Fri, 12 Feb 2010 10:13:30 GMT
We do not get the type mismatch error (quoted below) when running in
pseudo-distributed mode. Instead, every map task attempt fails with
"java.lang.RuntimeException: readObject can't find class". Any ideas
what might be wrong?

mtah@thinkpad:~$ hadoop jar /tmp/classify.jar Classify
10/02/12 11:09:48 WARN mapred.JobClient: No job jar file set.  User
classes may not be found. See JobConf(Class) or
JobConf#setJar(String).
10/02/12 11:09:48 INFO input.FileInputFormat: Total input paths to process : 1
10/02/12 11:09:48 INFO input.FileInputFormat: Total input paths to process : 1
10/02/12 11:09:49 INFO mapred.JobClient: Running job: job_201002121044_0009
10/02/12 11:09:50 INFO mapred.JobClient:  map 0% reduce 0%
10/02/12 11:10:01 INFO mapred.JobClient: Task Id :
attempt_201002121044_0009_m_000000_0, Status : FAILED
java.lang.RuntimeException: readObject can't find class
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:121)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:549)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: Classify$TransformationActionMapper
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)
	... 6 more

10/02/12 11:10:01 INFO mapred.JobClient: Task Id :
attempt_201002121044_0009_m_000001_0, Status : FAILED
java.lang.RuntimeException: readObject can't find class
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:121)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:549)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: Classify$TransformationSessionMapper
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)
	... 6 more

10/02/12 11:10:07 INFO mapred.JobClient: Task Id :
attempt_201002121044_0009_m_000001_1, Status : FAILED
java.lang.RuntimeException: readObject can't find class
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:121)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:549)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: Classify$TransformationSessionMapper
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)
	... 6 more

10/02/12 11:10:07 INFO mapred.JobClient: Task Id :
attempt_201002121044_0009_m_000000_1, Status : FAILED
java.lang.RuntimeException: readObject can't find class
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:121)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:549)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: Classify$TransformationActionMapper
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)
	... 6 more

10/02/12 11:10:13 INFO mapred.JobClient: Task Id :
attempt_201002121044_0009_m_000000_2, Status : FAILED
java.lang.RuntimeException: readObject can't find class
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:121)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:549)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: Classify$TransformationActionMapper
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)
	... 6 more

10/02/12 11:10:13 INFO mapred.JobClient: Task Id :
attempt_201002121044_0009_m_000001_2, Status : FAILED
java.lang.RuntimeException: readObject can't find class
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:121)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:549)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: Classify$TransformationSessionMapper
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)
	... 6 more

10/02/12 11:10:22 INFO mapred.JobClient: Job complete: job_201002121044_0009
10/02/12 11:10:22 INFO mapred.JobClient: Counters: 3
10/02/12 11:10:22 INFO mapred.JobClient:   Job Counters
10/02/12 11:10:22 INFO mapred.JobClient:     Launched map tasks=8
10/02/12 11:10:22 INFO mapred.JobClient:     Data-local map tasks=8
10/02/12 11:10:22 INFO mapred.JobClient:     Failed map tasks=1

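A likely reading of the trace above: the "No job jar file set" warning
means no jar was shipped with the job, so the task JVMs cannot load
Classify$TransformationActionMapper or Classify$TransformationSessionMapper
when TaggedInputSplit reads the mapper class name back from the split.
The warning itself points at JobConf(Class) and JobConf#setJar(String);
with the new API the usual equivalent is job.setJarByClass(Classify.class)
in the driver. The sketch below is a stdlib-only model of the failure, not
Hadoop code: MissingJobJarDemo, its readClass helper, and the empty
URLClassLoader standing in for a task JVM are all hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class MissingJobJarDemo {

    /**
     * Resolves a class name that was read from serialized data, wrapping a
     * lookup failure the same way the trace shows (RuntimeException caused
     * by ClassNotFoundException). Hypothetical helper, not Hadoop's code.
     */
    static Class<?> readClass(String className, ClassLoader loader) {
        try {
            return Class.forName(className, true, loader);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException("readObject can't find class", e);
        }
    }

    public static void main(String[] args) {
        // A loader with no jars and no application parent: an assumed
        // stand-in for a task JVM that never received the job jar.
        ClassLoader taskLoader = new URLClassLoader(new URL[0], null);

        // Core classes still resolve, because they do not live in the job jar.
        System.out.println("resolved: " + readClass("java.lang.String", taskLoader).getName());

        // The user's mapper class does not resolve without the jar.
        try {
            readClass("Classify$TransformationActionMapper", taskLoader);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage() + " <- " + e.getCause());
        }
    }
}
```

The client side does not hit this because "hadoop jar" puts classify.jar
on the client's own classpath; only the task side depends on the job jar
being set.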

2010/2/11 Alex Kozlov <alexvk@cloudera.com>:
> Try job.setMapOutputKeyClass(JoinKey.class). -- Alex K
>
> On Thu, Feb 11, 2010 at 8:25 AM, E. Sammer <eric@lifeless.net> wrote:
>>
>> It looks like you're using the local job runner which does everything in a
>> single thread. In this case, yes, I think the mappers are run sequentially.
>> The local job runner is a different code path in Hadoop and is a known
>> issue. Have you tried your code in pseudo-distributed mode?
>>
>> HTH.
>>
>> On 2/11/10 11:14 AM, Martin Häger wrote:
>>>
>>> Hello,
>>>
>>> We're trying to do a reduce-side join by applying two different
>>> mappers (TransformationSessionMapper and TransformationActionMapper)
>>> to two different input files and joining them using
>>> TransformationReducer. See attached Classify.java for complete source.
>>>
>>> When running it, we get the following error. JoinKey is our own
>>> implementation that is used for performing secondary sort. Somehow
>>> TransformationActionMapper gets passed a JoinKey when it expects a
>>> LongWritable (TextInputFormat). Is Hadoop actually applying the
>>> mappers in sequence?
>>>
>>> $ hadoop jar /tmp/classify.jar Classify
>>> 10/02/11 16:40:16 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>>> processName=JobTracker, sessionId=
>>> 10/02/11 16:40:16 WARN mapred.JobClient: No job jar file set.  User
>>> classes may not be found. See JobConf(Class) or
>>> JobConf#setJar(String).
>>> 10/02/11 16:40:16 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics
>>> with processName=JobTracker, sessionId= - already initialized
>>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to
>>> process : 1
>>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to
>>> process : 1
>>> 10/02/11 16:40:16 INFO mapred.JobClient: Running job: job_local_0001
>>> 10/02/11 16:40:16 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics
>>> with processName=JobTracker, sessionId= - already initialized
>>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to
>>> process : 1
>>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to
>>> process : 1
>>> 10/02/11 16:40:16 INFO mapred.MapTask: io.sort.mb = 100
>>> 10/02/11 16:40:16 INFO mapred.MapTask: data buffer = 79691776/99614720
>>> 10/02/11 16:40:16 INFO mapred.MapTask: record buffer = 262144/327680
>>> 10/02/11 16:40:16 WARN mapred.LocalJobRunner: job_local_0001
>>> java.io.IOException: Type mismatch in key from map: expected
>>> org.apache.hadoop.io.LongWritable, recieved Classify$JoinKey
>>>        at
>>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:807)
>>>        at
>>> org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:504)
>>>        at
>>> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>        at Classify$TransformationActionMapper.map(Classify.java:161)
>>>        at Classify$TransformationActionMapper.map(Classify.java:1)
>>>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>        at
>>> org.apache.hadoop.mapreduce.lib.input.DelegatingMapper.run(DelegatingMapper.java:51)
>>>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>        at
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
>>> 10/02/11 16:40:17 INFO mapred.JobClient:  map 0% reduce 0%
>>> 10/02/11 16:40:17 INFO mapred.JobClient: Job complete: job_local_0001
>>> 10/02/11 16:40:17 INFO mapred.JobClient: Counters: 0
>>
>>
>> --
>> Eric Sammer
>> eric@lifeless.net
>> http://esammer.blogspot.com
>
>

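The "expected org.apache.hadoop.io.LongWritable" part of the original
error is consistent with the map output key class never having been set,
rather than with the mappers being applied in sequence. When it is unset,
Hadoop falls back to the job's output key class, whose default is
LongWritable, and the map-side collector rejects any key of a different
class. That is what job.setMapOutputKeyClass(JoinKey.class) addresses.
The sketch below is a simplified, stdlib-only model of that check, not
Hadoop's implementation: MapOutputKeyCheckDemo, JobSettings, collect, and
the empty LongWritable and JoinKey classes are hypothetical stand-ins.

```java
import java.io.IOException;

final class MapOutputKeyCheckDemo {

    // Hypothetical stand-ins for org.apache.hadoop.io.LongWritable and Classify.JoinKey.
    static final class LongWritable {}
    static final class JoinKey {}

    /** Simplified job settings: only the two key-class properties that matter here. */
    static final class JobSettings {
        private Class<?> outputKeyClass = LongWritable.class; // default when nothing is set
        private Class<?> mapOutputKeyClass;                   // null until set explicitly

        void setMapOutputKeyClass(Class<?> keyClass) {
            this.mapOutputKeyClass = keyClass;
        }

        // Falls back to the job output key class when no map output key class is set.
        Class<?> getMapOutputKeyClass() {
            return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
        }
    }

    /** Models the collector's check: the emitted key must be exactly the configured class. */
    static void collect(JobSettings job, Object key) throws IOException {
        Class<?> expected = job.getMapOutputKeyClass();
        if (key.getClass() != expected) {
            throw new IOException("Type mismatch in key from map: expected "
                    + expected.getName() + ", received " + key.getClass().getName());
        }
    }

    public static void main(String[] args) throws IOException {
        JobSettings job = new JobSettings();

        // Nothing configured: the expected key class is the LongWritable default.
        try {
            collect(job, new JoinKey());
        } catch (IOException e) {
            System.out.println("before: " + e.getMessage());
        }

        // Analogous to job.setMapOutputKeyClass(JoinKey.class) in the driver.
        job.setMapOutputKeyClass(JoinKey.class);
        collect(job, new JoinKey());
        System.out.println("after: key accepted");
    }
}
```

The same fallback applies to values, so a real driver would normally set
the map output value class alongside the key class.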