flume-user mailing list archives

From Denes Arvay <de...@cloudera.com>
Subject Re: Alerts when Flume agent fails
Date Mon, 27 Feb 2017 15:35:34 GMT
Hi Suresh,

Sink:
- BatchCompleteCount
Number of processed "complete" batches where the number of events in the
batch reached the configured batch size.

- BatchUnderflowCount
Number of batches processed where the number of events is less than the
configured maximum batch size. This can happen when the channel becomes
empty: the events already taken are flushed (and the transaction is
committed) even though the batch size hasn't been reached.
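
As a rough illustration of how the two sink counters relate, the sketch below computes the share of counted batches that reached the configured batch size. It assumes the counters arrive as strings inside one sink's entry of the JSON metrics output; the function name and the sample values are made up for the example.

```python
def batch_fill_ratio(sink_metrics):
    """Return the fraction of counted batches that reached the full batch size.

    `sink_metrics` is assumed to be one sink's dictionary from the JSON
    metrics output. Returns None when the sink does not expose both
    counters, or when no batch has been counted yet.
    """
    try:
        complete = int(sink_metrics["BatchCompleteCount"])
        underflow = int(sink_metrics["BatchUnderflowCount"])
    except (KeyError, ValueError):
        # Not every sink reports these counters.
        return None
    total = complete + underflow
    if total == 0:
        return None
    return complete / total


# Hypothetical sample: 30 full batches, 10 underflow batches.
sample = {"BatchCompleteCount": "30", "BatchUnderflowCount": "10"}
ratio = batch_fill_ratio(sample)
```

A ratio that stays low may simply mean the channel is usually drained before a batch fills up, so it is a hint about load rather than an error signal.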

Source:
- AppendBatchReceivedCount and AppendBatchAcceptedCount
If the source supports batching, it keeps track of the number of received
and processed batches respectively. (Processed means the events were
forwarded to the channel via the ChannelProcessor.)

- AppendReceivedCount and AppendAcceptedCount
These are the numbers of received and processed single (not batched)
events respectively.

Please keep in mind that not all of the components support all the counters.
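
Since a component may not expose every counter, a script reading them should tolerate missing keys. The following is a minimal sketch under that assumption: it compares the received and accepted counters of one source and skips any pair that is absent. The function name, the labels and the sample values are hypothetical.

```python
def unaccepted_counts(source_metrics):
    """Return received-minus-accepted differences for the counters present.

    `source_metrics` is assumed to be one source's dictionary from the JSON
    metrics output, with counter values given as strings.
    """
    pairs = {
        "events": ("AppendReceivedCount", "AppendAcceptedCount"),
        "batches": ("AppendBatchReceivedCount", "AppendBatchAcceptedCount"),
    }
    result = {}
    for label, (received_key, accepted_key) in pairs.items():
        # Skip pairs the source does not report instead of failing.
        if received_key in source_metrics and accepted_key in source_metrics:
            received = int(source_metrics[received_key])
            accepted = int(source_metrics[accepted_key])
            result[label] = received - accepted
    return result


# Hypothetical source that only reports the single-event counters.
sample = {"AppendReceivedCount": "100", "AppendAcceptedCount": "97"}
diff = unaccepted_counts(sample)
```

A difference that keeps growing between two polls suggests events are being received but not forwarded to the channel.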

Regards,
Denes



> -Suresh.
>
>
> On Sun, Feb 26, 2017 at 10:40 PM, iain wright <iainwrig@gmail.com> wrote:
>
> Polling the metrics endpoint every 60s is probably the best option: alert on
> no data for more than N minutes or on any non-HTTP-200 response.
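
The polling approach described here could look roughly like the sketch below, which uses only the Python standard library. The function names, the timeout and the failure messages are assumptions made for illustration; the host and the alerting action are left to the caller.

```python
import http.client
import json
import urllib.error
import urllib.request


def evaluate_response(status, body):
    """Return None if the response looks healthy, else a short reason string."""
    if status != 200:
        return f"unexpected HTTP status {status}"
    try:
        metrics = json.loads(body)
    except ValueError:
        return "response body is not valid JSON"
    if not isinstance(metrics, dict) or not metrics:
        return "metrics response contains no components"
    return None


def poll_once(url, timeout=10):
    """Fetch the metrics URL once; return None (healthy) or a failure reason."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return evaluate_response(response.status, response.read())
    except urllib.error.HTTPError as exc:
        # Non-2xx answers surface as HTTPError.
        return evaluate_response(exc.code, "")
    except (OSError, http.client.HTTPException) as exc:
        # Connection refused, timeouts, DNS failures and similar.
        return f"agent unreachable: {exc}"


def silent_too_long(last_success, now, max_silence_seconds):
    """True when no healthy poll has been seen for longer than allowed."""
    return (now - last_success) > max_silence_seconds
```

A caller would run `poll_once("http://<agent-host>:41414/metrics")` every 60 seconds, record the time of each healthy result, and raise an alert when a reason is returned or when `silent_too_long(...)` becomes true for the chosen N minutes.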
>
> Alternatively, you could use something like monit
> <https://mmonit.com/monit/> to check that the process is running, but this
> won't handle a Flume agent that has hit an OOM. For that case you'd need to
> add -XX:OnOutOfMemoryError="kill -9 %p" to make sure the monitored process
> dies when the JVM encounters an OOM.
>
> With metrics polling you get the added benefit of being able to detect
> pressure or problems before they bubble up into larger problems (e.g.
> Channelsize increasing over N minutes while successfulsinkcount is not
> changing). I don't remember the exact names of the metrics; it's been a while.
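
The "channel growing while the sink makes no progress" check can be sketched as follows. The key names `ChannelSize` and `EventDrainSuccessCount` are an assumption on the editor's part and should be verified against the agent's actual /metrics output; the function name and the sample data are hypothetical.

```python
def looks_stalled(samples,
                  channel_key="ChannelSize",
                  sink_key="EventDrainSuccessCount"):
    """Return True if the channel kept growing while the sink drained nothing.

    `samples` is a list of (channel_metrics, sink_metrics) dictionary pairs
    taken at successive polls, oldest first. The default key names are
    assumptions; pass the names your agent actually reports.
    """
    if len(samples) < 2:
        return False
    sizes = [int(channel[channel_key]) for channel, _ in samples]
    drained = [int(sink[sink_key]) for _, sink in samples]
    # Strictly increasing channel size across every consecutive pair of polls.
    channel_growing = all(b > a for a, b in zip(sizes, sizes[1:]))
    # No successful drain between the first and the last poll.
    sink_idle = drained[-1] == drained[0]
    return channel_growing and sink_idle


# Hypothetical samples from three consecutive polls.
history = [
    ({"ChannelSize": "10"}, {"EventDrainSuccessCount": "500"}),
    ({"ChannelSize": "50"}, {"EventDrainSuccessCount": "500"}),
    ({"ChannelSize": "90"}, {"EventDrainSuccessCount": "500"}),
]
stalled = looks_stalled(history)
```

With 60-second polling, keeping the last N samples gives the "over N minutes" window mentioned above.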
>
> The metric keys seemed to explain it well enough when I was using this in
> the past. Are there any specific keys in the response from /metrics you
> don't understand?
>
> --
> Iain Wright
>
> On Sun, Feb 26, 2017 at 7:37 PM, Suresh V <verditer@gmail.com> wrote:
>
> Thank you.
>
> Additionally, where can I find details about each metric in the JSON
> output on port 41414? I could not find a detailed description of each
> metric and what it means in the user guide.
>
> Thank you
> Suresh.
>
>
> On Sun, Feb 26, 2017 at 9:33 PM, Sharninder Khera <sharninder@gmail.com>
> wrote:
>
> Set up scripts to send alerts sooner? There isn't a built-in way in Flume,
> so you will have to set up monitoring separately.
>
>
>
>
>
> On Mon, Feb 27, 2017 at 8:57 AM +0530, "Suresh V" <verditer@gmail.com>
> wrote:
>
> Hello,
>
> Is there a way to set up an alert mechanism by email immediately when a
> flume agent fails due to any reason?
>
> At the moment, we have scripts sending the port 41414 JSON metrics by
> email every hour, but it would be good to know as soon as an agent fails.
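
For the email side of such a script, a minimal sketch using Python's standard `smtplib` and `email` modules is given below. The function names, the subject wording, the addresses and the assumption of an SMTP relay on localhost port 25 are all illustrative and not taken from the thread.

```python
import smtplib
from email.message import EmailMessage


def build_alert(agent_name, reason, sender, recipient):
    """Build an alert email describing why the agent check failed."""
    message = EmailMessage()
    message["Subject"] = f"Flume agent {agent_name} check failed"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(
        f"Health check for Flume agent {agent_name} failed: {reason}\n"
    )
    return message


def send_alert(message, smtp_host="localhost", smtp_port=25):
    """Send the message through an SMTP relay (assumed to be reachable)."""
    with smtplib.SMTP(smtp_host, smtp_port) as server:
        server.send_message(message)


# Hypothetical values; nothing is sent until send_alert() is called.
alert = build_alert(
    "agent1",
    "unexpected HTTP status 503",
    "flume-monitor@example.com",
    "ops@example.com",
)
```

Running the check from cron or a small loop every minute, and calling `send_alert(alert)` only on failure, would replace the hourly metrics email with a notification that arrives shortly after the agent stops responding.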
>
> Appreciate any help.
>
> Thank you
> Suresh.
>
>
>
>
>
