hbase-issues mailing list archives

From "Jonathan Lawlor (JIRA)" <j...@apache.org>
Subject [jira] [Reopened] (HBASE-5980) Scanner responses from RS should include metrics on rows/KVs filtered
Date Tue, 14 Apr 2015 19:58:59 GMT

     [ https://issues.apache.org/jira/browse/HBASE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Jonathan Lawlor reopened HBASE-5980:

This issue was recently closed due to inactivity, but it caught my eye because it sounds like
a nice feature to have. Currently we track some client-side metrics during scans, such as the
count of regions scanned, the count of RPCs, etc. (the full list is available in the ScanMetrics
class). However, these client-side metrics do not include information about events that occur
on the server side (such as how many KVs have been filtered).

If we wanted to make these metrics available on the client side, I believe it could be
achieved in the following manner:
1. Define a new class to encapsulate the server-side metrics that we wish to access/track
   on the client side
2. Define a new protobuf message type for this new metrics class
3. Add the metrics as another field in the ScanResponse
4. Add new fields to ScanMetrics (the class that already exists on the client side)
   corresponding to the server-side metrics, and update these metrics after each RPC
   response in ScannerCallable
As for how to actually track these metrics during scan RPCs, we can add an instance of this
new server-side metrics class to the ScannerContext class that was added in HBASE-13421.
Then all metric tracking could be performed via ScannerContext#getMetrics()#update...
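
A minimal sketch of that tracking pattern, assuming a context object that owns the metrics
instance and is passed through the scan path. Everything here is hypothetical and
self-contained: ScannerContextSketch is not the real ScannerContext, FilterDecision is a
reduced stand-in for Filter.ReturnCode, and cells are modelled as plain strings.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, reduced stand-in for Filter.ReturnCode.
enum FilterDecision { INCLUDE, SKIP, NEXT_ROW }

// Hypothetical server-side metrics: one counter per filter decision.
final class FilterMetricsSketch {
    private final Map<FilterDecision, Long> counts = new EnumMap<>(FilterDecision.class);

    void update(FilterDecision decision) { counts.merge(decision, 1L, Long::sum); }
    long get(FilterDecision decision) { return counts.getOrDefault(decision, 0L); }
}

// Hypothetical context carried through one scan RPC; it owns the metrics.
final class ScannerContextSketch {
    private final FilterMetricsSketch metrics = new FilterMetricsSketch();

    FilterMetricsSketch getMetrics() { return metrics; }
}

final class RegionScanSketch {
    // Applies the filter to each cell, records every decision in the
    // context's metrics, and returns how many cells were included.
    static int scan(List<String> cells,
                    Function<String, FilterDecision> filter,
                    ScannerContextSketch context) {
        int included = 0;
        for (String cell : cells) {
            FilterDecision decision = filter.apply(cell);
            context.getMetrics().update(decision);
            if (decision == FilterDecision.INCLUDE) {
                included++;
            }
        }
        return included;
    }
}
```

Keeping the counters on the context means the scan code only needs the context it is
already handed, and the RPC layer can read the totals from the same object when it builds
the response.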

Any thoughts/comments?

> Scanner responses from RS should include metrics on rows/KVs filtered
> ---------------------------------------------------------------------
>                 Key: HBASE-5980
>                 URL: https://issues.apache.org/jira/browse/HBASE-5980
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, metrics, regionserver
>    Affects Versions: 0.95.2
>            Reporter: Todd Lipcon
>            Priority: Minor
>
> Currently it's difficult to know, when issuing a filter, what percentage of rows were
> skipped by that filter. We should expose some basic counters back to the client scanner object.
> For example:
> - number of rows filtered by row key alone (filterRowKey())
> - number of times each filter response was returned by filterKeyValue() - corresponding
>   to Filter.ReturnCode
> What would be slickest is if this could actually return a tree of counters for cases
> where FilterList or other combining filters are used. But a top-level is a good start.

This message was sent by Atlassian JIRA
