flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: Data loss by HDFS SINK metrics
Date Wed, 09 Dec 2015 10:18:25 GMT
Actually, no events are missing. EventDrainAttemptCount is simply the number
of times the sink tried to read data from the channel, and
EventDrainSuccessCount is the number of times it succeeded. Nothing to worry
about.


Thanks,
Hari

On Wed, Dec 9, 2015 at 1:00 PM, Rani Yaroshinski <rani.yaroshinski@gmail.com> wrote:

> From the data you sent, one can infer that the configured batch size is
> 5000. Therefore, the answer is quite clear. There are 12 events missing
> to complete the batch. The 4988 events are waiting in the channel's
> temporary storage for the batch to complete.
> On 9 Dec 2015 04:40, "Zhishan Li" <zhishanlee@gmail.com> wrote:
>
>> All,
>>
>> Does the HDFS sink lose data?
>>
>> Here are the metrics from the HTTP monitoring feature.
>>
>> "SINK.k1": {
>>   "BatchCompleteCount": "824",
>>   "ConnectionFailedCount": "0",
>>   "EventDrainAttemptCount": "4124988",
>>   "ConnectionCreatedCount": "28",
>>   "Type": "SINK",
>>   "BatchEmptyCount": "0",
>>   "ConnectionClosedCount": "20",
>>   "EventDrainSuccessCount": "4120000",
>>   "StopTime": "0",
>>   "StartTime": "1449490280602",
>>   "BatchUnderflowCount": "0"
>> },
>>
>>
>> *EventDrainAttemptCount: 4124988* is the number of events the HDFS sink
>> received, and *EventDrainSuccessCount: 4120000* is the number of events
>> flushed to HDFS. Right?
>>
>> Where are the other 4988 events (= 4124988 - 4120000)?
>>
>> Thanks
>>
>>
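
The arithmetic discussed in this thread can be checked with a short script.
This is only a sketch: it parses a copy of the SINK.k1 counters quoted above
(trimmed to the fields it needs) and computes the attempt/success gap and
the batch size implied by BatchCompleteCount. The function name
drain_summary and the METRICS_JSON constant are hypothetical, and the
batch-size inference assumes every successfully drained event came from a
complete batch, which is plausible here only because BatchUnderflowCount
is 0.

```python
import json

# Counters copied from the message above; the HTTP monitor reports them as strings.
METRICS_JSON = """
{
  "SINK.k1": {
    "BatchCompleteCount": "824",
    "EventDrainAttemptCount": "4124988",
    "EventDrainSuccessCount": "4120000",
    "BatchEmptyCount": "0",
    "BatchUnderflowCount": "0"
  }
}
"""


def drain_summary(sink_metrics):
    """Return the attempt/success gap and the batch size implied by complete batches."""
    attempts = int(sink_metrics["EventDrainAttemptCount"])
    successes = int(sink_metrics["EventDrainSuccessCount"])
    complete = int(sink_metrics["BatchCompleteCount"])
    # Assumption: all successful drains came from complete batches
    # (BatchUnderflowCount is 0 in the quoted metrics).
    implied_batch = successes // complete if complete else None
    return {"gap": attempts - successes, "implied_batch_size": implied_batch}


summary = drain_summary(json.loads(METRICS_JSON)["SINK.k1"])
print(summary)
```

For the quoted counters this gives a gap of 4988 and an implied batch size
of 5000. As Hari's reply points out, a gap between the two counters is not
by itself evidence of lost events.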
