gobblin-user mailing list archives

From <amamanesarki....@orange.com>
Subject APACHE GOBBLIN RUNTIMEEXCEPTION CAUSED BY WRITABLENAME AND SEQUENCEFILE
Date Thu, 05 Apr 2018 10:18:38 GMT
I'm trying to run Gobblin in YARN mode.
I read data from a Kafka topic and write it to HDFS.


 - THE PROBLEM
    The job creates workunits but produces no data.


 - THE ERRORS

    The only error I'm seeing in the _applog is:

                               2018-04-05 11:34:50 CEST ERROR [Thread-9] org.apache.helix.task.JobRebalancer - No available instance found for job!
    And this one, which is not explicit:

                               2018-04-05 11:34:54 CEST ERROR [JobScheduler-0] org.apache.gobblin.cluster.GobblinHelixJobScheduler$RetriggeringJobCallable - Failed to run job gobblin_streaming
                               org.apache.gobblin.runtime.JobException: Failed to launch and run job gobblin_streaming
                                                               at org.apache.gobblin.scheduler.JobScheduler.runJob(JobScheduler.java:489)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobScheduler$RetriggeringJobCallable.call(GobblinHelixJobScheduler.java:303)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobScheduler.runJob(GobblinHelixJobScheduler.java:273)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobScheduler$NonScheduledJobRunner.run(GobblinHelixJobScheduler.java:439)
                                                               at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                                                               at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                                                               at java.lang.Thread.run(Thread.java:748)
                               Caused by: org.apache.gobblin.runtime.JobException: Job job_gobblin_streaming_1522920874045 failed
                                                               at org.apache.gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:508)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobLauncher.launchJob(GobblinHelixJobLauncher.java:305)
                                                               at org.apache.gobblin.scheduler.JobScheduler.runJob(JobScheduler.java:479)
                                                               ... 6 more
                               2018-04-05 11:34:54 CEST ERROR [JobScheduler-0] org.apache.gobblin.cluster.GobblinHelixJobScheduler$NonScheduledJobRunner - Failed to run job gobblin_streaming
                               org.apache.gobblin.runtime.JobException: Failed to run job gobblin_streaming
                                                               at org.apache.gobblin.cluster.GobblinHelixJobScheduler$RetriggeringJobCallable.call(GobblinHelixJobScheduler.java:314)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobScheduler.runJob(GobblinHelixJobScheduler.java:273)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobScheduler$NonScheduledJobRunner.run(GobblinHelixJobScheduler.java:439)
                                                               at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                                                               at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                                                               at java.lang.Thread.run(Thread.java:748)
                               Caused by: org.apache.gobblin.runtime.JobException: Failed to launch and run job gobblin_streaming
                                                               at org.apache.gobblin.scheduler.JobScheduler.runJob(JobScheduler.java:489)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobScheduler$RetriggeringJobCallable.call(GobblinHelixJobScheduler.java:303)
                                                               ... 5 more
                               Caused by: org.apache.gobblin.runtime.JobException: Job job_gobblin_streaming_1522920874045 failed
                                                               at org.apache.gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:508)
                                                               at org.apache.gobblin.cluster.GobblinHelixJobLauncher.launchJob(GobblinHelixJobLauncher.java:305)
                                                               at org.apache.gobblin.scheduler.JobScheduler.runJob(JobScheduler.java:479)
                                                               ... 6 more
                               2018-04-05 11:34:54 CEST INFO  [Thread-9] org.apache.helix.task.WorkflowRebalancer - Workflow is marked as deleted gobblin_streaming cleaning up the workflow context.
                               2018-04-05 11:34:54 CEST INFO  [Thread-9] org.apache.helix.task.WorkflowRebalancer - Cleaning up workflow: gobblin_streaming
                               2018-04-05 11:34:54 CEST INFO  [Thread-9] org.apache.helix.controller.rebalancer.util.RebalanceScheduler - Remove scheduled rebalance task at time 1522922676008 for resource: gobblin_streaming
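
The nested traces above can be reduced to their "Caused by" chain to show why they are not explicit: the innermost cause is only "Job ... failed", with no underlying exception. The following is a minimal sketch of that reduction, assuming each log entry sits on a single line (wrapped lines would need joining first). The helper name `cause_chain` and the trimmed sample are illustrative only and are not part of Gobblin.

```python
import re

# Matches the first line of each exception in a Java stack trace, e.g.
# "Caused by: java.io.IOException: message" or "pkg.SomeException: message".
CAUSE_RE = re.compile(
    r"^\s*(?:Caused by:\s*)?((?:[\w$]+\.)+[\w$]*(?:Exception|Error))(?::\s*(.*))?$"
)

def cause_chain(trace):
    """Return [(exception_class, message), ...] from outermost to innermost."""
    chain = []
    for line in trace.splitlines():
        if line.lstrip().startswith("at "):
            continue  # skip stack frames
        match = CAUSE_RE.match(line)
        if match:
            chain.append((match.group(1), (match.group(2) or "").strip()))
    return chain

# Trimmed copy of the second trace in this message (frames shortened).
SAMPLE_TRACE = """org.apache.gobblin.runtime.JobException: Failed to run job gobblin_streaming
    at org.apache.gobblin.cluster.GobblinHelixJobScheduler$RetriggeringJobCallable.call(GobblinHelixJobScheduler.java:314)
Caused by: org.apache.gobblin.runtime.JobException: Failed to launch and run job gobblin_streaming
    at org.apache.gobblin.scheduler.JobScheduler.runJob(JobScheduler.java:489)
    ... 5 more
Caused by: org.apache.gobblin.runtime.JobException: Job job_gobblin_streaming_1522920874045 failed
    at org.apache.gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:508)
"""

if __name__ == "__main__":
    for exception_class, message in cause_chain(SAMPLE_TRACE):
        print(exception_class, "|", message)
```

All three entries are `JobException`, so the real failure has to be looked for elsewhere, for example in the container logs.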

    But after searching the staging files (in the state.store.dir), I found this error:

                               java.lang.RuntimeException: java.io.IOException: WritableName can't load class: gobblin.source.workunit.WorkUnit
                                                               at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:2110)
                                                               at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:2040)
                                                               at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1881)
                                                               at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1830)
                                                               at org.apache.hadoop.fs.shell.Display$TextRecordInputStream.<init>(Display.java:222)
                                                               at org.apache.hadoop.fs.shell.Display$Text.getInputStream(Display.java:152)
                                                               at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:101)
                                                               at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
                                                               at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
                                                               at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
                                                               at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
                                                               at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
                                                               at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
                                                               at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
                                                               at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
                                                               at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
                                                               at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
                               Caused by: java.io.IOException: WritableName can't load class: gobblin.source.workunit.WorkUnit
                                                               at org.apache.hadoop.io.WritableName.getClass(WritableName.java:77)
                                                               at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:2108)
                                                               ... 16 more
                               Caused by: java.lang.ClassNotFoundException: Class gobblin.source.workunit.WorkUnit not found
                                                               at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2185)
                                                               at org.apache.hadoop.io.WritableName.getClass(WritableName.java:75)
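
The frames in this trace (`FsShell`, `Display$Text`) belong to the Hadoop shell, which has to load the value class named in the SequenceFile header before it can print any record. Note that the recorded name, `gobblin.source.workunit.WorkUnit`, lacks the `org.apache.` prefix used by the classes in the job logs. The header itself can be read without any Gobblin class on the classpath. The following is a minimal sketch, assuming the version 6 header layout (the `SEQ` magic, a version byte, then the key and value class names as Hadoop `Text` strings); the function names are illustrative, and the key class in the demo is an assumption.

```python
import io
import struct

def read_vint(stream):
    """Decode a Hadoop variable-length integer (same rules as WritableUtils.readVInt)."""
    first = struct.unpack("b", stream.read(1))[0]  # signed byte
    if first >= -112:
        return first  # small values are stored in the first byte itself
    negative = first < -120
    size = (-119 - first) if negative else (-111 - first)  # total bytes, first included
    value = 0
    for _ in range(size - 1):
        value = (value << 8) | stream.read(1)[0]  # big-endian payload
    return ~value if negative else value

def read_text(stream):
    """Read a string written by Text.writeString: vint length, then UTF-8 bytes."""
    length = read_vint(stream)
    return stream.read(length).decode("utf-8")

def sequence_file_classes(stream):
    """Return (version, key_class_name, value_class_name) from a SequenceFile header."""
    if stream.read(3) != b"SEQ":
        raise ValueError("not a SequenceFile")
    version = stream.read(1)[0]
    return version, read_text(stream), read_text(stream)

if __name__ == "__main__":
    # In-memory header for demonstration; the key class here is assumed.
    key = b"org.apache.hadoop.io.Text"
    value = b"gobblin.source.workunit.WorkUnit"
    header = b"SEQ\x06" + bytes([len(key)]) + key + bytes([len(value)]) + value
    print(sequence_file_classes(io.BytesIO(header)))
```

To inspect a real state-store file, it could first be copied locally (for example with `hadoop fs -get`) and then opened in binary mode and passed to `sequence_file_classes`.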



- MY GOBBLIN DIST IS:

                               I compiled it with -PhadoopVersion=2.6.0-CDH5.13.1 -PhiveVersion=1.1.0-CDH5.13.1 -Pkafka08Version=0.11.0.0 -Pkafka09Version=0.11.0.0 --PavroVersion=1.8.2

- MY CONFIGURATION IS:


                               gobblin.streaming.kafka.topic.key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
                               gobblin.streaming.kafka.topic.value.deserializer=org.apache.kafka.common.serialization.StringDeserializer


                               source.class=org.apache.gobblin.source.extractor.extract.kafka.KafkaSimpleStreamingSource

                               gobblin.streaming.kafka.topic.singleton=XXX

                               extract.namespace=org.apache.gobblin.extract.kafka



                               writer.destination.type=HDFS
                               writer.output.format="txt"
                               writer.builder.class=org.apache.gobblin.writer.SimpleDataWriterBuilder
                               writer.file.path.type=tablename
                               writer.partition.column.name=header.time


                               data.publisher.type=org.apache.gobblin.publisher.BaseDataPublisher
                               data.publisher.final.dir=${gobblin.yarn.work.dir}/job-output

                               mr.job.max.mappers=12
                               topic.whitelist=XXX
                               bootstrap.with.offset=latest

                               kafka.brokers="xxxxxx"
                               #partition.assignment.strategy=range
                               yarn.client.max-nodemanagers-proxies=12
                               yarn.client.max-cached-nodemanagers-proxies=12
                               task.data.root.dir=${gobblin.yarn.work.dir}/task-root-dir
                               state.store.dir=${gobblin.yarn.work.dir}/state-store
                               #job.commit.policy=partial
                               data.publisher.metadata.output.dir=xxxx
                               failure.log.dir=xxx
                               #state.store.enabled=false
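
Two values in this configuration are wrapped in double quotes (`writer.output.format` and `kafka.brokers`). Whether that matters depends on how the file is loaded, which the message does not say: `java.util.Properties` keeps the quotes as literal characters of the value, while a HOCON parser would strip them. The following is a minimal sketch for listing such values so they can be checked; the helper name `quoted_values` is illustrative, and the parsing is a simplification of the properties format (no line continuations or `:` separators).

```python
def quoted_values(properties_text):
    """Return {key: value} for entries whose value is wrapped in double quotes."""
    found = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # blank line or comment
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a key=value line
        value = value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            found[key.strip()] = value
    return found

# A few lines copied from the configuration in this message.
SAMPLE_CONFIG = """
writer.destination.type=HDFS
writer.output.format="txt"
kafka.brokers="xxxxxx"
#state.store.enabled=false
"""

if __name__ == "__main__":
    for key, value in quoted_values(SAMPLE_CONFIG).items():
        print(key, "=", value)
```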


COULD SOMEONE HELP ME RESOLVE THIS ISSUE?
