avro-dev mailing list archives

From "ey-chih chow (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-792) map reduce job for avro 1.5 generates ArrayIndexOutOfBoundsException
Date Wed, 13 Apr 2011 21:10:06 GMT

    [ https://issues.apache.org/jira/browse/AVRO-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019552#comment-13019552 ]

ey-chih chow commented on AVRO-792:
-----------------------------------

Thanks.  I tested the third patch in our environment.  Unfortunately, it did not fix
the problem.  The trace from our VM follows.

===============================================================================================================================

cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200 etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:18:14 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:18:14 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:18:15 INFO mapred.JobClient: Running job: job_201104081805_0001
11/04/12 10:18:16 INFO mapred.JobClient:  map 0% reduce 0%
11/04/12 10:18:28 INFO mapred.JobClient:  map 20% reduce 0%
11/04/12 10:18:29 INFO mapred.JobClient:  map 40% reduce 0%
11/04/12 10:18:35 INFO mapred.JobClient:  map 80% reduce 0%
11/04/12 10:18:39 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:18:43 INFO mapred.JobClient:  map 100% reduce 26%
11/04/12 10:18:46 INFO mapred.JobClient: Task Id : attempt_201104081805_0001_r_000000_0, Status : FAILED
11/04/12 10:18:47 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:18:57 INFO mapred.JobClient: Task Id : attempt_201104081805_0001_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
	at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
	at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
	at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
	at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
	at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:19:05 INFO mapred.JobClient:  map 100% reduce 26%
11/04/12 10:19:08 INFO mapred.JobClient: Task Id : attempt_201104081805_0001_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
	at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
	at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
	at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
	at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
	at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:19:10 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:19:22 INFO mapred.JobClient: Job complete: job_201104081805_0001
11/04/12 10:19:22 INFO mapred.JobClient: Counters: 31
11/04/12 10:19:22 INFO mapred.JobClient:   com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:19:22 INFO mapred.JobClient:     PLUS_EVENT=249
11/04/12 10:19:22 INFO mapred.JobClient:     REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient:     PC_REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient:   Job Counters 
11/04/12 10:19:22 INFO mapred.JobClient:     Launched reduce tasks=4
11/04/12 10:19:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=34290
11/04/12 10:19:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient:     Launched map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient:     Data-local map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient:     Failed reduce tasks=1
11/04/12 10:19:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=53407
11/04/12 10:19:22 INFO mapred.JobClient:   com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:19:22 INFO mapred.JobClient:     PLUS_SERVER=222
11/04/12 10:19:22 INFO mapred.JobClient:     PLUS_CLIENT=28
11/04/12 10:19:22 INFO mapred.JobClient:   FileSystemCounters
11/04/12 10:19:22 INFO mapred.JobClient:     HDFS_BYTES_READ=472855
11/04/12 10:19:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1164803
11/04/12 10:19:22 INFO mapred.JobClient:   com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NO_AFAM=133
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NULL_VALUE=109
11/04/12 10:19:22 INFO mapred.JobClient:     DISCARDED_EVENTS=1058
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NO_PUBL=112
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NO_ASKU=225
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_EMPTY_MAP=182
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_OTHER=45
11/04/12 10:19:22 INFO mapred.JobClient:   Map-Reduce Framework
11/04/12 10:19:22 INFO mapred.JobClient:     Combine output records=0
11/04/12 10:19:22 INFO mapred.JobClient:     Map input records=1281
11/04/12 10:19:22 INFO mapred.JobClient:     Spilled Records=205
11/04/12 10:19:22 INFO mapred.JobClient:     Map output bytes=41281
11/04/12 10:19:22 INFO mapred.JobClient:     Map input bytes=468793
11/04/12 10:19:22 INFO mapred.JobClient:     Combine input records=0
11/04/12 10:19:22 INFO mapred.JobClient:     Map output records=205
11/04/12 10:19:22 INFO mapred.JobClient:     SPLIT_RAW_BYTES=889
11/04/12 10:19:22 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
	at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
	at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200 etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:30:33 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:30:34 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:30:34 INFO mapred.JobClient: Running job: job_201104081805_0002
11/04/12 10:30:35 INFO mapred.JobClient:  map 0% reduce 0%
11/04/12 10:30:44 INFO mapred.JobClient:  map 40% reduce 0%
11/04/12 10:30:51 INFO mapred.JobClient:  map 60% reduce 0%
11/04/12 10:30:52 INFO mapred.JobClient:  map 80% reduce 0%
11/04/12 10:30:55 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:00 INFO mapred.JobClient:  map 100% reduce 33%
11/04/12 10:31:03 INFO mapred.JobClient: Task Id : attempt_201104081805_0002_r_000000_0, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
	at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
	at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
	at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
	at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
	at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:31:04 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:11 INFO mapred.JobClient:  map 100% reduce 33%
11/04/12 10:31:14 INFO mapred.JobClient: Task Id : attempt_201104081805_0002_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
	at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
	at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
	at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
	at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
	at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:31:16 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:26 INFO mapred.JobClient: Task Id : attempt_201104081805_0002_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
	at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
	at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
	at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
	at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
	at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:31:34 INFO mapred.JobClient:  map 100% reduce 13%
11/04/12 10:31:38 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:38 INFO mapred.JobClient: Job complete: job_201104081805_0002
11/04/12 10:31:38 INFO mapred.JobClient: Counters: 31
11/04/12 10:31:38 INFO mapred.JobClient:   com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:31:38 INFO mapred.JobClient:     PLUS_EVENT=249
11/04/12 10:31:38 INFO mapred.JobClient:     REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient:     PC_REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient:   Job Counters 
11/04/12 10:31:38 INFO mapred.JobClient:     Launched reduce tasks=4
11/04/12 10:31:38 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=32438
11/04/12 10:31:38 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient:     Launched map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient:     Data-local map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient:     Failed reduce tasks=1
11/04/12 10:31:38 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=52425
11/04/12 10:31:38 INFO mapred.JobClient:   com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:31:38 INFO mapred.JobClient:     PLUS_SERVER=222
11/04/12 10:31:38 INFO mapred.JobClient:     PLUS_CLIENT=28
11/04/12 10:31:38 INFO mapred.JobClient:   FileSystemCounters
11/04/12 10:31:38 INFO mapred.JobClient:     HDFS_BYTES_READ=472855
11/04/12 10:31:38 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1164803
11/04/12 10:31:38 INFO mapred.JobClient:   com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NO_AFAM=133
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NULL_VALUE=109
11/04/12 10:31:38 INFO mapred.JobClient:     DISCARDED_EVENTS=1058
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NO_PUBL=112
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NO_ASKU=225
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_EMPTY_MAP=182
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_OTHER=45
11/04/12 10:31:38 INFO mapred.JobClient:   Map-Reduce Framework
11/04/12 10:31:38 INFO mapred.JobClient:     Combine output records=0
11/04/12 10:31:38 INFO mapred.JobClient:     Map input records=1281
11/04/12 10:31:38 INFO mapred.JobClient:     Spilled Records=205
11/04/12 10:31:38 INFO mapred.JobClient:     Map output bytes=41281
11/04/12 10:31:38 INFO mapred.JobClient:     Map input bytes=468793
11/04/12 10:31:38 INFO mapred.JobClient:     Combine input records=0
11/04/12 10:31:38 INFO mapred.JobClient:     Map output records=205
11/04/12 10:31:38 INFO mapred.JobClient:     SPLIT_RAW_BYTES=889
11/04/12 10:31:38 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
	at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
	at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
===========================================================================================================================
 

> map reduce job for avro 1.5 generates ArrayIndexOutOfBoundsException
> --------------------------------------------------------------------
>
>                 Key: AVRO-792
>                 URL: https://issues.apache.org/jira/browse/AVRO-792
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.0
>         Environment: Mac with VMWare running Linux training-vm-Ubuntu
>            Reporter: ey-chih chow
>            Assignee: Thiruvalluvan M. G.
>            Priority: Blocker
>             Fix For: 1.5.1
>
>         Attachments: AVRO-792-2.patch, AVRO-792-3.patch, AVRO-792.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> We have an Avro map/reduce job that worked with Avro 1.4 but is broken with Avro 1.5.
> The M/R job with Avro 1.5 worked fine in our debugging environment, but broke when we
> moved to a real cluster.  In one test run, the job had 23 reducers.  Four of them
> succeeded and the rest failed with an ArrayIndexOutOfBoundsException.
> Here are two instances of the stack traces:
> =================================================================================
> java.lang.ArrayIndexOutOfBoundsException: -1576799025
> 	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
> 	at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
> 	at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> 	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> 	at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:232)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> 	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> 	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
> 	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
> 	at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
> 	at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
> 	at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
> 	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:46)
> 	at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
> 	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
> 	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
> 	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> java.lang.ArrayIndexOutOfBoundsException: 40
> 	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
> 	at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
> 	at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> 	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> 	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> 	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> 	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
> 	at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
> 	at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
> 	at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
> 	at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
> 	at com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:74)
> 	at com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:1)
> 	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
> 	at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
> 	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> The signature of our map() is:
> public void map(Utf8 input, AvroCollector<Pair<Utf8, GenericRecord>> collector, Reporter reporter) throws IOException;
> and reduce() is:
> public void reduce(Utf8 key, Iterable<GenericRecord> values, AvroCollector<GenericRecord> collector, Reporter reporter) throws IOException;
> All the GenericRecords are of the same schema.
> There are many changes in the area of serialization/deserialization between Avro 1.4
> and 1.5, but we could not figure out why the exceptions were generated.
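
For readers following the traces: every failure is thrown from Symbol$Alternative.getSymbol right after ResolvingDecoder.readIndex, which suggests that the branch index read from the stream is outside the range of the union's alternatives. The sketch below is not Avro's code and does not claim to identify the cause of this issue. It only illustrates, in plain Java, one way such values can arise: Avro's binary encoding writes a union's branch position as a zig-zag varint, so a reader positioned even one byte off decodes an unrelated number and uses it as an array index. The class name UnionIndexSketch, the method pickBranch, and the three branch names are hypothetical.

```java
/** Illustrative only; a hypothetical stand-in, not Avro's implementation. */
public class UnionIndexSketch {

    /** Decodes one zig-zag varint starting at pos (the wire form Avro uses for int/long). */
    static long readZigZagVarint(byte[] buf, int pos) {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = buf[pos++] & 0xff;
            raw |= (long) (b & 0x7f) << shift; // low 7 bits carry payload
            if ((b & 0x80) == 0) {
                break;                          // high bit clear: last byte
            }
            shift += 7;
        }
        return (raw >>> 1) ^ -(raw & 1);        // undo zig-zag; odd raw values become negative
    }

    /** Hypothetical stand-in for selecting a union branch by the decoded index. */
    static String pickBranch(String[] branches, byte[] buf, int pos) {
        int index = (int) readZigZagVarint(buf, pos);
        return branches[index];                 // throws if index is outside the array
    }

    public static void main(String[] args) {
        String[] branches = {"null", "string", "map"};
        byte[] data = {0x02, 0x06};             // zig-zag varints for 1 and 3

        // Reader aligned with the writer: index 1 selects a valid branch.
        System.out.println(pickBranch(branches, data, 0));

        // Reader one byte off: the next byte decodes to 3, which is out of range.
        try {
            pickBranch(branches, data, 1);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of range: " + e.getMessage());
        }
    }
}
```

Under this assumption, both small positive values and large negative ones are possible, since any byte sequence decodes to some number. Whether the reducer input here is actually misaligned, or the index is wrong for another reason, cannot be determined from the traces alone.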

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
