hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Date Wed, 30 Jul 2014 17:41:25 GMT
I was also confusing ageOfLastShippedEdit with ageOfLastAppliedEdit. I had fixed ageOfLastShippedEdit
on the source in 0.94.
Looks like 0.94 is doing the right thing with ageOfLastAppliedEdit, but 0.98+ is not.


-- Lars



________________________________
 From: lars hofhansl <larsh@apache.org>
To: Demai Ni <nidmgg@gmail.com>; "dev@hbase.apache.org" <dev@hbase.apache.org>

Sent: Wednesday, July 30, 2014 10:30 AM
Subject: Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
 

> When the 'the time an edit entered the system' doesn't change (in the case
> of no Sink Op enter for a period of time), the age will keep growing
> since current time moving forward, which gives a false impression that
> an edit sitting in the queue for very long time. isn't it?


I think until the item is shipped it should be counted as waiting. I.e., the time this reports
is the time between when an edit entered the system and when it finally gets shipped to the
replication sink. refreshAgeOfLastAppliedOp() should only be called when something is actually
being shipped, not periodically (I just fixed that in 0.94, HBASE-11143).


But I see you're looking at 0.98. There we are indeed calling refreshAgeOfLastAppliedOp() every
time we call getStats(), which would increase that metric even when there is nothing to ship.
That looks like a bug.
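
To illustrate the intended behaviour, here is a minimal sketch, assuming a simplified stand-in for the sink metrics. The class name SinkAgeSketch and the injected clock are hypothetical; this is not the actual MetricsSink class and not the HBASE-11143 patch. The age is computed only when an edit is applied, and reading the metric does not recompute it against the current time.

```java
import java.util.function.LongSupplier;

/**
 * Simplified, hypothetical stand-in for the sink metrics discussed in this
 * thread. It is NOT the real HBase MetricsSink.
 */
class SinkAgeSketch {
  // Injected time source (an assumption of this sketch) so the
  // behaviour can be exercised without waiting for wall-clock time.
  private final LongSupplier clock;

  // Stands in for the SINK_AGE_OF_LAST_APPLIED_OP gauge.
  private long ageOfLastAppliedOp;

  SinkAgeSketch(LongSupplier clock) {
    this.clock = clock;
  }

  /** Called only when an edit is actually applied. */
  long setAgeOfLastAppliedOp(long timestamp) {
    ageOfLastAppliedOp = clock.getAsLong() - timestamp;
    return ageOfLastAppliedOp;
  }

  /** Read path: returns the stored age without refreshing it against "now". */
  long getAgeOfLastAppliedOp() {
    return ageOfLastAppliedOp;
  }
}
```

With this shape, an edit applied with an age of 100 ms still reports 100 ms an hour later if nothing else has been applied in the meantime.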

-- Lars



________________________________

From: Demai Ni <nidmgg@gmail.com>
To: "dev@hbase.apache.org" <dev@hbase.apache.org>; lars hofhansl <larsh@apache.org>

Sent: Wednesday, July 30, 2014 8:49 AM
Subject: Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink



Lars, thanks for your input. 

 
> This metric indicates the time an edit sat in the "replication queue" before it got replicated.

yeah, I am with you on this.

> With that definition it is doing the right thing: Reporting current time - the time an edit
> entered the system (its WAL time)

When the 'the time an edit entered the system' doesn't change (in the case of no Sink
Op enter for a period of time), the age will keep growing since current time moving forward,
which gives a false impression that an edit sitting in the queue for very long time. isn't
it?




On Tue, Jul 29, 2014 at 10:54 PM, lars hofhansl <larsh@apache.org> wrote:

>This metric indicates the time an edit sat in the "replication queue" before it got replicated.
>With that definition it is doing the right thing: Reporting current time - the time an
>edit entered the system (its WAL time)
>
>
>-- Lars
>
>
>
>________________________________
> From: Demai Ni <nidmgg@gmail.com>
>To: "dev@hbase.apache.org" <dev@hbase.apache.org>
>Sent: Tuesday, July 29, 2014 3:48 PM
>Subject: meaning for AgeOfLastAppliedOp in Replication MetricsSink
>
>
>
>hi,
>
>A quick question to clarify AgeOfLastAppliedOp in MetricsSink.java. I
>assumed it is an indicator of how long it takes for a Sink OP to be
>applied; instead, it seems to show how long it has been since the last Sink
>OP was applied.
>
>  /**
>   * Set the age of the last applied operation
>   *
>   * @param timestamp The timestamp of the last operation applied.
>   * @return the age that was set
>   */
>  public long setAgeOfLastAppliedOp(long timestamp) {
>    lastTimestampForAge = timestamp;
>    long age = System.currentTimeMillis() - lastTimestampForAge;
>    rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
>    return age;
>  }
>
>In the following scenario:
>1) at 7:00am a sink op is applied, and SINK_AGE_OF_LAST_APPLIED_OP is
>set to, for example, 100ms;
>2) then NO new sink op occurs;
>3) refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of
>returning 100ms, AgeOfLastAppliedOp becomes 1 hour + 100ms, which
>doesn't make sense, right?
>
>should we put a check for (lastTimestampForAge != timestamp) before refresh
>the age?
>
>Thanks
>
>Demai