geode-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jacob Barrett <jbarr...@pivotal.io>
Subject Re: Should Geode stats conform to backwards compatibility constraints?
Date Thu, 14 Feb 2019 00:15:07 GMT
I don’t consider the stats API a public API. I think it is a leaked internal API. Do we document
the interactions with this API? Do we document the stat names? I consider documentation the
API. I can discover all sorts of exposed methods in a library but shouldn’t consider them
contracted API.

If however we come to a conclusion that they are API then we could work around the issues
below. We could changes the stat API to be a little forgiving on the getInt and related calls
if the internal storage is a long. We simply downcast the long to an int. The behavior to
the end user does not change but if they use the getLong they get the new improved stat. For
the JMX we just add a new method, getTotalNetSearchCompletedLong() to get the long version
of the stat.

-Jake

> On Feb 13, 2019, at 3:58 PM, Kirk Lund <klund@apache.org> wrote:
> 
> Quite a few Geode stats are currently defined as IntCounters which very
> easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
> wrap to negative MAX_VALUE, so my team defined the following two tickets
> and changed them to LongCounters on the develop branch:
> 
> GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
> https://issues.apache.org/jira/browse/GEODE-6345
> 
> GEODE-6334: CachePerfStats operation count stats may wrap to negative values
> https://issues.apache.org/jira/browse/GEODE-6334
> 
> Some people are now concerned that this may break backwards compatibility
> and API for existing users.
> 
> There are two potential ways for a user to *experience* this as an API
> change:
> 
> 1) MemberMXBean in JMX
> 
> *-  int getTotalNetSearchCompleted();*
> *+  long getTotalNetSearchCompleted();*
> 
> As you can see in GEODE-6334, we changed quite a few stats from integer to
> long in CachePerfStats. The only one directly exposed as an attribute on
> MemberMXBean is TotalNetSearchCompleted.
> 
> Users will find that this API method changed.
> 
> 2) Statistics API with undocumented strings
> 
> If we assume that users might know or discover that we have a statistics
> text id of "cachePerfStats" with an integer stat of name "puts" then they
> could use our Statistics API to get the value:
> 
> * 1:    CacheFactory cacheFactory = new CacheFactory();*
> * 2:    Cache cache = cacheFactory.create();*
> * 3:*
> * 4:    Region<String, String> region = cache.<String,
> String>createRegionFactory(PARTITION).create("region");*
> * 5:*
> * 6:    Statistics[] statistics =
> cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
> * 7:    int oldPuts = statistics[0].getInt("puts");*
> * 8:*
> * 9:    region.put("some", "value");*
> *10:*
> *11:    int newPuts = statistics[0].getInt("puts");*
> *12:*
> *13:    assertThat(newPuts).isEqualTo(oldPuts + 1);*
> 
> The above works in Geode 1.8 and earlier but will begin to throw the
> following in Geode 1.9 (when develop branch is released):
> 
> *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
> type long and it was expected to be an int.*
> * at
> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
> * at com.user.example.Example.example(Example.java7)*
> 
> In order to avoid the above exception, a user would have to change the code
> on lines 7 and 11 to use *getLong* instead of *getInt*.
> 
> Should Geode stats be considered a form of API contract and should they
> conform to backwards compatibility constraints?
> 
> Should we change these from LongCounters back to IntCounters?
> 
> Someone did suggest that we could change them back to IntCounters and then
> change our statistics internals to skip to zero anytime a stat attempts to
> increment above MAX_VALUE, however, I think that if we decide that stats
> cannot be changed from IntCounters to LongCounters then we should not make
> this change either.
> 
> Thanks for any input or opinions,
> Kirk


Mime
View raw message