flume-user mailing list archives

From Suresh V <verdi...@gmail.com>
Subject RabbitMQ heartbeat lost
Date Thu, 06 Apr 2017 20:40:09 GMT
We have 4 agents with RabbitMQ sources, and we occasionally see heartbeat-lost
errors in the Flume logs when the RMQ queues are under load.

We think this is probably because RMQ is under load and not responding to
Flume fast enough. Is there any way we can handle this from the Flume end,
for example by increasing the heartbeat timeout setting?

This happens only when RMQ is under load, and RMQ seems to stabilize after
some time. If we kill the Flume agent and restart it, it works fine and
consumes the messages.

Any input on handling such a scenario? Also, the Flume agent continues to
run despite this error, but doesn't consume any more messages. Is there a
way to have Flume abort when this happens?
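
For illustration, one JDK-only way to get the abort behavior asked about above is a JVM-wide uncaught-exception handler. The log line "Exception in thread ..." suggests the consumer thread dies from an uncaught exception, which such a handler would see. This is only a sketch under assumptions: `ConsumerWatchdog`, `install` and `simulateFailure` are hypothetical names, nothing here is part of Flume or the RabbitMQ source plugin, and getting such code loaded into a running agent is not covered. The demo flips a flag instead of exiting so that it can be run safely.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper; not part of Flume or the RabbitMQ source plugin.
final class ConsumerWatchdog {

    // Installs a JVM-wide handler. onFatal runs when a thread whose name
    // starts with threadNamePrefix dies from an uncaught exception.
    static void install(String threadNamePrefix, Runnable onFatal) {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            System.err.println("Thread '" + thread.getName() + "' died: " + error);
            if (thread.getName().startsWith(threadNamePrefix)) {
                onFatal.run();
            }
        });
    }

    // Simulates a consumer thread failing and reports whether the handler fired.
    static boolean simulateFailure() {
        AtomicBoolean fired = new AtomicBoolean(false);
        // Assumption: a real agent would pass something like () -> System.exit(1).
        install("RabbitMQ Consumer", () -> fired.set(true));
        Thread consumer = new Thread(() -> {
            throw new IllegalStateException("simulated connection error");
        }, "RabbitMQ Consumer #0");
        consumer.start();
        try {
            // The handler runs on the dying thread, so it has finished once join() returns.
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return fired.get();
    }

    public static void main(String[] args) {
        System.out.println("handler fired: " + simulateFailure());
    }
}
```

Matching on the thread-name prefix keeps unrelated thread failures from stopping the process. Exiting the JVM only helps if something external (a service manager, for example) restarts the agent afterwards.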

Thank you for any help with this. The error from the logs is below.

Suresh.

Exception in thread "RabbitMQ Consumer #0" com.rabbitmq.client.ShutdownSignalException: connection error
        at com.rabbitmq.client.QueueingConsumer.handle(QueueingConsumer.java:198)
        at com.rabbitmq.client.QueueingConsumer.nextDelivery(QueueingConsumer.java:215)
        at com.aweber.flume.source.rabbitmq.Consumer.run(Consumer.java:164)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.rabbitmq.client.MissedHeartbeatException: Heartbeat missing with heartbeat = 60 seconds
        at com.rabbitmq.client.impl.AMQConnection.handleSocketTimeout(AMQConnection.java:597)
        at com.rabbitmq.client.impl.AMQConnection.access$600(AMQConnection.java:65)
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:560)
        ... 1 more
