flink-dev mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: InputFormat API and current scanned row count
Date Tue, 02 Dec 2014 21:08:40 GMT
In my specific use case I was interested in understanding why the scans
of the splits were taking a long time, so I wanted statistics on the
number of records contained in each split and the rate at which they are
read. Do you think this could be useful in general?
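
A minimal sketch of the kind of per-split counter and read rate described
above. It assumes a hypothetical RecordSource interface that stands in for
the reachedEnd()/nextRecord() part of InputFormat. CountingSource,
ListSource and getRecordsPerSecond() are illustrative names, not Flink API,
and the counter is a long rather than the int proposed in the original
message:

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the two read methods of an input format. */
interface RecordSource<T> {
    boolean reachedEnd();
    T nextRecord(T reuse);
}

/** Wraps a source and counts the records it returns for the current split. */
final class CountingSource<T> implements RecordSource<T> {
    private final RecordSource<T> delegate;
    private long processedRecords;
    private long startNanos = -1L; // set lazily on the first read

    CountingSource(RecordSource<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean reachedEnd() {
        return delegate.reachedEnd();
    }

    @Override
    public T nextRecord(T reuse) {
        if (startNanos < 0) {
            startNanos = System.nanoTime();
        }
        T record = delegate.nextRecord(reuse);
        // Assumption: a null return means no record was produced.
        if (record != null) {
            processedRecords++;
        }
        return record;
    }

    long getProcessedRecordsCount() {
        return processedRecords;
    }

    /** Average read rate since the first nextRecord() call. */
    double getRecordsPerSecond() {
        if (startNanos < 0) {
            return 0.0;
        }
        long elapsedNanos = System.nanoTime() - startNanos;
        return elapsedNanos <= 0 ? 0.0 : processedRecords * 1e9 / elapsedNanos;
    }
}

/** Toy source backed by a list, only for demonstration. */
final class ListSource implements RecordSource<String> {
    private final Iterator<String> it;

    ListSource(List<String> records) {
        this.it = records.iterator();
    }

    @Override
    public boolean reachedEnd() {
        return !it.hasNext();
    }

    @Override
    public String nextRecord(String reuse) {
        return it.next();
    }
}
```

Wrapping the format instead of changing the interface keeps the count on
the caller's side, so existing input formats would not need to change.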
On Dec 2, 2014 9:56 PM, "Fabian Hueske" <fhueske@apache.org> wrote:

> Hi Flavio,
>
> we have a few recently started efforts to implement the collection of
> monitoring and runtime/data statistics.
> Counting the number of elements emitted by an operator (or data source)
> will be included.
>
> Do you want to count the number of produced tuples for monitoring the
> progress or do you see a different use case?
>
> 2014-11-28 9:37 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
>
> > Hi guys,
> >
> > I was debugging an InputFormat and I discovered that there's no way to
> > find out how many records have been processed in a split.
> > So I added a counter to my input format that is incremented on every
> > nextRecord call. Do you think adding something like "public int
> > getProcessedRecordsCount()" to the InputFormat interface could be useful?
> > Or are you going to manage this count stat from the caller of nextRecord?
> >
> > Best,
> > Flavio
> >
>
