incubator-flume-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Luban <michael.lu...@gmail.com>
Subject Heartbeats backing up when bootstrapping embedded flume node
Date Thu, 11 Aug 2011 16:25:41 GMT
Hello,

I have an issue trying to bootstrap an embedded flume node within a custom
service.
I'm using the same sort of start physical node, configure and spawn logical
node methodology that flumebase employs:

https://github.com/flumebase/flumebase/blob/master/src/main/java/com/odiago/flumebase/flume/EmbeddedFlumeConfig.java

When using the cdh3u1 release and configured to either use an embedded or an
external flume master, I get (upon first time execution) a "Hearbeats are
backing up" log message, followed by what looks to be the decommissioning of
my logical node ("Forcing driver to exit uncleanly")
Subsequent manual configuration of the embedded node through the master with
the same sink (custom plugin) and source (rpcSource(54010)) picks up the
configuration correctly and the embedded node works fine.
I'm trying to eliminate or understand the root cause of the "Forcing driver
to exit uncleanly" message when the logical node is configured and the
spawned into the physical node for the first time.

-----------Log snippet---------------

01:59:02.457 [Heartbeat] INFO  c.e.t.flume.TimeSeriesAggregatorSink -
EventSink builder() : building event sink
01:59:02.457 [Heartbeat] INFO  c.c.flume.agent.LogicalNodeManager - creating
new logical node timeseries-aggregator-host.local.domain
01:59:02.470 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (HELLO,Wed Dec 31
19:00:00 EST 1969)
01:59:02.472 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on
port 54010...
01:59:02.475 [logicalNode timeseries-aggregator-host.local.domain-32] DEBUG
c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource |
TimeSeriesAggregatorSink
01:59:02.479 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:02.475 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
01:59:02.480 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode -
Attempt to load config com.cloudera.flume.conf.FlumeConfigData@449afc
01:59:02.480 [Check config] INFO  c.e.t.flume.TimeSeriesAggregatorSink -
EventSink builder() : building event sink
01:59:07.482 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:07.484 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:12.487 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:12.489 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 1 heartbeats
01:59:12.490 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:17.493 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:17.494 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 2 heartbeats
01:59:17.495 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:22.501 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:22.502 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 3 heartbeats
01:59:22.502 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:27.505 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:27.506 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 4 heartbeats
01:59:27.506 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:32.509 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:32.511 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 5 heartbeats
01:59:32.511 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:37.514 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:37.516 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 6 heartbeats
01:59:37.516 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:42.519 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:42.521 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 7 heartbeats
01:59:42.521 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:47.524 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:47.526 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 8 heartbeats
01:59:47.526 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:52.529 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:52.532 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 9 heartbeats
01:59:52.532 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
01:59:57.535 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31
19:00:00 EST 1969)
01:59:57.537 [Heartbeat] WARN  c.c.flume.agent.LivenessManager - Heartbeats
are backing up, currently behind by 10 heartbeats
01:59:57.537 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:02.480 [Check config] ERROR com.cloudera.flume.agent.LogicalNode -
Forcing driver to exit uncleanly
02:00:02.481 [Check config] INFO  com.cloudera.flume.agent.LogicalNode -
Node config successfully set to
com.cloudera.flume.conf.FlumeConfigData@449afc
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.483 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking
another heartbeat
02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do
nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current
(Thu Aug 11 01:58:57 EDT 2011).
02:00:02.483 [Thread-4] DEBUG c.c.flume.reporter.ReportEvent - Ignoring null
string metric message
Aug 11, 2011 2:00:02 AM
com.twitter.common.application.modules.AppLauncherModule$1 uncaughtException
SEVERE: Uncaught exception from Thread[Thread-4,5,main]
java.lang.NullPointerException
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.getMetrics(ThriftEventSource.java:99)
    at
com.cloudera.flume.core.EventSource$Base.getReports(EventSource.java:186)
    at com.cloudera.flume.agent.LogicalNode.getReports(LogicalNode.java:309)
    at
com.cloudera.flume.reporter.MasterReportPusher$PusherThread.querySrcSinkReports(MasterReportPusher.java:102)
    at
com.cloudera.flume.reporter.MasterReportPusher$PusherThread.sendReports(MasterReportPusher.java:110)
    at
com.cloudera.flume.reporter.MasterReportPusher$PusherThread.run(MasterReportPusher.java:119)
02:00:02.487 [logicalNode timeseries-aggregator-host.local.domain-34] INFO
 c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on
port 54010...
02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] ERROR
o.a.t.server.TSaneThreadPoolServer - Error occurred during listening.
org.apache.thrift.transport.TTransportException: Could not create
ServerSocket on address 0.0.0.0/0.0.0.0:54010.
    at
org.apache.thrift.transport.TSaneServerSocket.bind(TSaneServerSocket.java:110)
~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
    at
org.apache.thrift.transport.TSaneServerSocket.listen(TSaneServerSocket.java:116)
~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
    at
org.apache.thrift.server.TSaneThreadPoolServer.start(TSaneThreadPoolServer.java:162)
~[flume-core-0.9.4-cdh3u1.jar:0.6.1]
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.open(ThriftEventSource.java:151)
[flume-core-0.9.4-cdh3u1.jar:na]
    at
com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:87)
[flume-core-0.9.4-cdh3u1.jar:na]
02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] DEBUG
c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource |
TimeSeriesAggregatorSink
02:00:02.539 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:02.539 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:02.555 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR
c.c.f.core.connector.DirectDriver - Closing down due to exception during
append calls
java.io.IOException: Waiting for queue element was interrupted! null
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222)
~[flume-core-0.9.4-cdh3u1.jar:na]
    at
com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105)
~[flume-core-0.9.4-cdh3u1.jar:na]
Caused by: java.lang.InterruptedException: null
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
~[na:1.6.0_22]
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094)
~[na:1.6.0_22]
    at
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423)
~[na:1.6.0_22]
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209)
~[flume-core-0.9.4-cdh3u1.jar:na]
    ... 1 common frames omitted
02:00:02.557 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.core.connector.DirectDriver - Connector logicalNode
timeseries-aggregator-host.local.domain-32 exited with error: Waiting for
queue element was interrupted! null
java.io.IOException: Waiting for queue element was interrupted! null
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222)
~[flume-core-0.9.4-cdh3u1.jar:na]
    at
com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105)
~[flume-core-0.9.4-cdh3u1.jar:na]
Caused by: java.lang.InterruptedException: null
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
~[na:1.6.0_22]
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094)
~[na:1.6.0_22]
    at
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423)
~[na:1.6.0_22]
    at
com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209)
~[flume-core-0.9.4-cdh3u1.jar:na]
    ... 1 common frames omitted
02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.h.thrift.ThriftEventSource - Closed server on port 54010...
02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO
 c.c.f.h.thrift.ThriftEventSource - Queue still has 0 elements ...
02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR
c.c.f.core.connector.DirectDriver - Exiting driver logicalNode
timeseries-aggregator-host.local.domain-32 in error state ThriftEventSource
| TimeSeriesAggregatorSink because Waiting for queue element was
interrupted! null
02:00:07.540 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:07.541 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:12.542 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:12.542 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:17.545 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:17.546 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:22.549 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:22.549 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
02:00:27.552 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager -
Heartbeat from host:host.local.domain
logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11
01:58:57 EDT 2011)
02:00:27.553 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent
acks waiting for master: {}
---------------------------

Mime
View raw message