samza-dev mailing list archives

From Rick Mangi <r...@chartbeat.com>
Subject Re: Problems upgrading Job
Date Thu, 12 Nov 2015 18:10:54 GMT
Hi Yi,

I pulled from master and built this morning.

Yes, that’s the output from JobRunner. I also tried setting a job.id, in case this was
an issue with migrating from an old task checkpoint topic, but I got the same result.
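
For context, the coordinator stream in the log quoted below is named __samza_coordinator_metrics-reporter_1, which suggests the topic name is built from job.name and job.id. Here is a minimal sketch of that naming, assuming underscores are replaced with dashes and that job.id defaults to 1. The function name and the job name passed in are illustrative, not taken from the Samza source:

```python
def coordinator_stream_name(job_name: str, job_id: str = "1") -> str:
    """Build a coordinator stream topic name (assumed naming scheme)."""
    # Assumption: underscores in either part become dashes, which is
    # consistent with the topic name that appears in the JobRunner log.
    safe_name = job_name.replace("_", "-")
    safe_id = job_id.replace("_", "-")
    return f"__samza_coordinator_{safe_name}_{safe_id}"


# Hypothetical job name, guessed from the properties file name.
print(coordinator_stream_name("metrics_reporter"))
```

If the naming works this way, setting a different job.id should point the job at a different coordinator topic rather than the existing one.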

Would you like me to open a JIRA ticket?

Thanks,

Rick
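
Since the failure is a metadata timeout in the Kafka producer, one thing that may be worth ruling out is plain network reachability from the host that runs run-job.sh to the broker listed in bootstrap.servers. The sketch below only checks that a TCP connection can be opened. It does not speak the Kafka protocol, and the helper name is made up for illustration:

```python
import socket


def broker_reachable(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects; DNS
        # failures, refusals, and timeouts are all raised as OSError.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

For example, broker_reachable("devstream01.chartbeat.net", 9092) run on yarnmaster01 would show whether the broker from the ProducerConfig dump is reachable from there. A True result does not rule out other causes, such as topic metadata not being available.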



> On Nov 12, 2015, at 12:59 PM, Yi Pan <nickpan47@gmail.com> wrote:
> 
> Hi, Rick,
> 
> Did you get the fix in SAMZA-723 in your test? And could you confirm that
> the errors are from JobRunner log?
> 
> -Yi
> 
> On Thu, Nov 12, 2015 at 8:48 AM, Rick Mangi <rick@chartbeat.com> wrote:
> 
>> Hi,
>> 
>> I’m trying to migrate our Samza jobs to a 0.10.0 snapshot (built against the
>> latest master). Everything works fine running locally, although I had to make
>> some changes to the local grid’s Kafka, since the checkpointing seems to
>> require replication_factor > 1. But when I deploy it against my production
>> YARN cluster I get these errors.
>> 
>> [yarnmaster01] out: 2015-11-12 10:40:53 ZkClient [INFO] zookeeper state
>> changed (SyncConnected)
>> [yarnmaster01] out: 2015-11-12 10:40:53 ZkEventThread [INFO] Terminate
>> ZkClient event thread.
>> [yarnmaster01] out: 2015-11-12 10:40:53 ZooKeeper [INFO] Session:
>> 0x250233cdf57f2fa closed
>> [yarnmaster01] out: 2015-11-12 10:40:53 ClientCnxn [INFO] EventThread shut
>> down
>> [yarnmaster01] out: 2015-11-12 10:40:53 KafkaSystemAdmin [INFO]
>> Coordinator stream __samza_coordinator_metrics-reporter_1 already exists.
>> [yarnmaster01] out: 2015-11-12 10:40:53 JobRunner [INFO] Storing config in
>> coordinator stream.
>> [yarnmaster01] out: 2015-11-12 10:40:53 CoordinatorStreamSystemProducer
>> [INFO] Starting coordinator stream producer.
>> [yarnmaster01] out: 2015-11-12 10:40:53 KafkaSystemProducer [INFO]
>> Creating a new producer for system mykafka.
>> [yarnmaster01] out: 2015-11-12 10:40:53 ProducerConfig [INFO]
>> ProducerConfig values:
>> [yarnmaster01] out:     value.serializer = class
>> org.apache.kafka.common.serialization.ByteArraySerializer
>> [yarnmaster01] out:     key.serializer = class
>> org.apache.kafka.common.serialization.ByteArraySerializer
>> [yarnmaster01] out:     block.on.buffer.full = true
>> [yarnmaster01] out:     retry.backoff.ms = 100
>> [yarnmaster01] out:     buffer.memory = 33554432
>> [yarnmaster01] out:     batch.size = 16384
>> [yarnmaster01] out:     metrics.sample.window.ms = 30000
>> [yarnmaster01] out:     metadata.max.age.ms = 300000
>> [yarnmaster01] out:     receive.buffer.bytes = 32768
>> [yarnmaster01] out:     timeout.ms = 30000
>> [yarnmaster01] out:     max.in.flight.requests.per.connection = 1
>> [yarnmaster01] out:     bootstrap.servers = [
>> devstream01.chartbeat.net:9092]
>> [yarnmaster01] out:     metric.reporters = []
>> [yarnmaster01] out:     client.id =
>> samza_producer-metrics_reporter-1-1447342853273-4
>> [yarnmaster01] out:     compression.type = none
>> [yarnmaster01] out:     retries = 2147483647
>> [yarnmaster01] out:     max.request.size = 1048576
>> [yarnmaster01] out:     send.buffer.bytes = 131072
>> [yarnmaster01] out:     acks = 1
>> [yarnmaster01] out:     reconnect.backoff.ms = 10
>> [yarnmaster01] out:     linger.ms = 0
>> [yarnmaster01] out:     metrics.num.samples = 2
>> [yarnmaster01] out:     metadata.fetch.timeout.ms = 60000
>> [yarnmaster01] out:
>> [yarnmaster01] out: 2015-11-12 10:40:53 ProducerConfig [WARN] The
>> configuration batch.num.messages = null was supplied but isn't a known
>> config.
>> [yarnmaster01] out: 2015-11-12 10:40:53 ProducerConfig [WARN] The
>> configuration producer.type = null was supplied but isn't a known config.
>> [yarnmaster01] out: Exception in thread "main"
>> org.apache.samza.SamzaException:
>> org.apache.kafka.common.errors.TimeoutException: Failed to update metadata
>> after 60000 ms.
>> [yarnmaster01] out:     at
>> org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.send(CoordinatorStreamSystemProducer.java:115)
>> [yarnmaster01] out:     at
>> org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.writeConfig(CoordinatorStreamSystemProducer.java:132)
>> [yarnmaster01] out:     at
>> org.apache.samza.job.JobRunner.run(JobRunner.scala:85)
>> [yarnmaster01] out:     at
>> org.apache.samza.job.JobRunner$.main(JobRunner.scala:43)
>> [yarnmaster01] out:     at
>> org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> [yarnmaster01] out: Caused by:
>> org.apache.kafka.common.errors.TimeoutException: Failed to update metadata
>> after 60000 ms.
>> [yarnmaster01] out:
>> 
>> 
>> Warning: run() received nonzero return code 1 while executing
>> './bin/run-job.sh
>> -config-factory=org.apache.samza.config.factories.PropertiesConfigFactory
>> --config-path=file://$PWD/conf/metrics_reporter.properties'!
>> 
>> 
>> This looks similar to https://issues.apache.org/jira/browse/SAMZA-560 but
>> I’m not using a StreamAppender in log4j.
>> 
>> Any ideas? My first thought is that I might have to delete the existing
>> checkpoint topics, but that would mean we can’t migrate completely until the
>> 0.10.0 release unless we want to run snapshot code in production.
>> 
>> Thanks!
>> 
>> Rick
>> 
>> 
>> 

