hbase-issues mailing list archives

From "Gunnar Tapper (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12364) API for query metrics
Date Wed, 12 Nov 2014 19:40:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208524#comment-14208524 ]

Gunnar Tapper commented on HBASE-12364:

What does CP mean?

Thank you,



> API for query metrics
> ---------------------
>                 Key: HBASE-12364
>                 URL: https://issues.apache.org/jira/browse/HBASE-12364
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions:
>         Environment: Any Hadoop distribution.
>            Reporter: Gunnar Tapper
> Request based on a discussion with Nick Dimiduk at Strata.
> Background: IT organizations operate on reports based on metrics. They look for comparative
statistics such as the number of queries per user per day per connection. Further, troubleshooting
is often based on questions such as "Which queries is each user running, which user is using the
most resources, and which of a specific user's queries need to be tuned?"
> Currently, the slow-query log does not provide the instrumentation needed for management
applications to obtain and analyze this level of information. By itself, the slow-query log doesn't
contain the information needed to map an entry to a user, connection,
application, and so on.
> Further, the slow-query log doesn't log each get and scan, which means that it's not
possible to see all queries that have been run against the HBase database.
> Preferably, a REST API would be provided to obtain the required information, and that information
should be extended so that each query can be mapped to environment information; for example:
> • Account
> • Account string
> • Client IP address
> • Query text
> • Session ID
> • Date and time
> • User name
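To make the list above concrete, here is a minimal sketch of what one per-query environment record might look like as a JSON payload. Nothing here is an existing HBase type or REST endpoint: the class name `QueryContext`, the field names, and the `to_json` helper are all hypothetical, chosen only to mirror the bullet items.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class QueryContext:
    """Hypothetical per-query environment record; not an existing HBase type."""
    account: str
    account_string: str
    client_ip: str
    query_text: str
    session_id: str
    timestamp: datetime
    user_name: str

    def to_json(self) -> str:
        # Serialize the timestamp as ISO 8601 so the payload is plain JSON.
        payload = asdict(self)
        payload["timestamp"] = self.timestamp.isoformat()
        return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    # Illustrative values only.
    ctx = QueryContext(
        account="acct-1",
        account_string="billing",
        client_ip="10.0.0.5",
        query_text="scan 'orders'",
        session_id="s-42",
        timestamp=datetime(2014, 11, 12, 19, 40, 34, tzinfo=timezone.utc),
        user_name="example-user",
    )
    print(ctx.to_json())
```

A record of this shape, attached to every get and scan, is what would let a management application group queries by user, session, or client address.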
> Start, Status, and Completion records should be provided so that it's possible to determine
the progress and outcome of any given query. Further, status and completion information should
contain information about resource usage; for example:
> • CPU time
> • Memory 
> • I/Os
> • Rows read/written
> • Objects accessed
> • Wait times
> Preferably, counters are provided in both cumulative and delta (since query start) formats.
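As an illustration of the cumulative and delta formats, the sketch below keeps cumulative counters in a snapshot and derives the delta by subtracting the snapshot taken at query start. This is one possible reading of the request, not a description of anything HBase provides: `ResourceCounters`, its field names, and `delta_since` are hypothetical. Memory and objects accessed are left out because they are not naturally monotonic counters.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ResourceCounters:
    """Hypothetical cumulative counters; names loosely mirror the list above."""
    cpu_time_ms: int = 0
    io_ops: int = 0
    rows_read: int = 0
    rows_written: int = 0
    wait_time_ms: int = 0

    def delta_since(self, baseline: "ResourceCounters") -> "ResourceCounters":
        # Subtract field by field. With the query-start snapshot as the
        # baseline, the result is the usage accrued since the query started.
        return ResourceCounters(**{
            f.name: getattr(self, f.name) - getattr(baseline, f.name)
            for f in fields(self)
        })


if __name__ == "__main__":
    # Illustrative values only.
    at_start = ResourceCounters(cpu_time_ms=100, io_ops=10, rows_read=1000,
                                rows_written=5, wait_time_ms=20)
    at_status = ResourceCounters(cpu_time_ms=160, io_ops=14, rows_read=1500,
                                 rows_written=5, wait_time_ms=25)
    print(at_status.delta_since(at_start))
```

Reporting both forms lets a consumer pick whichever it needs: cumulative values survive missed status records, while deltas can be read directly without keeping the earlier snapshot.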

This message was sent by Atlassian JIRA
