hbase-issues mailing list archives

From "Jesse Yates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-1811) Snapshot HFile and region statistics at compaction time and make info available to clients
Date Mon, 12 May 2014 17:46:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995307#comment-13995307 ]

Jesse Yates commented on HBASE-1811:
------------------------------------

Yeah, 7958 looks like a dup (timeline wise). Looks like good ideas always come back around :)

We could make 7958 the scan-time infrastructure and have this one add the comprehensive stats?

> Snapshot HFile and region statistics at compaction time and make info available to clients
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1811
>                 URL: https://issues.apache.org/jira/browse/HBASE-1811
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>            Priority: Minor
>
> Consider snapshotting HFile and region statistics at major and minor compaction time and making the info available to clients:
> * Key statistics
>  ** cardinality
>  ** length avg/min/max/stdev
>  ** information content measure (entropy, etc.)
>  ** histogram
> etc.
> * Value statistics
>  ** length avg/min/max/stdev
>  ** information content measure (entropy, etc.)
>  ** histogram
> etc.
> * Region statistics
>  ** density estimation
>  ** KV count
>  ** total storage size (on disk)
>  ** total storage size (uncompressed)
> etc. 
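
To make the key statistics listed above concrete, here is a minimal Java sketch of how cardinality, length avg/min/max/stdev, a byte-level entropy measure, and a length histogram could be gathered in a single pass over row keys. It is illustrative only: `KeyStatsSketch`, `summarize`, and `bucketWidth` are hypothetical names, not HBase APIs, and the issue does not prescribe an implementation. The sketch assumes keys fit in memory and counts distinct keys exactly with a `HashSet`; a compaction-time collector would more plausibly use an approximate counter.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of HBase. */
final class KeyStatsSketch {
    long count;                 // number of keys seen
    long cardinality;           // exact number of distinct keys
    int minLen;
    int maxLen;
    double avgLen;
    double stdevLen;            // population standard deviation of key length
    double entropyBitsPerByte;  // Shannon entropy of the byte-value distribution
    Map<Integer, Long> lengthHistogram; // bucket lower bound -> key count

    static KeyStatsSketch summarize(List<byte[]> keys, int bucketWidth) {
        if (keys.isEmpty() || bucketWidth <= 0) {
            throw new IllegalArgumentException("need at least one key and a positive bucket width");
        }
        Set<ByteBuffer> distinct = new HashSet<>(); // ByteBuffer compares by content
        long[] byteFreq = new long[256];
        long totalBytes = 0;
        Map<Integer, Long> hist = new TreeMap<>();
        int min = Integer.MAX_VALUE;
        int max = 0;
        double mean = 0.0;
        double m2 = 0.0; // running sum of squared deviations (Welford's method)
        long n = 0;

        for (byte[] key : keys) {
            n++;
            int len = key.length;
            min = Math.min(min, len);
            max = Math.max(max, len);
            double delta = len - mean;
            mean += delta / n;
            m2 += delta * (len - mean);
            distinct.add(ByteBuffer.wrap(key));
            for (byte b : key) {
                byteFreq[b & 0xFF]++;
            }
            totalBytes += len;
            hist.merge((len / bucketWidth) * bucketWidth, 1L, Long::sum);
        }

        double entropy = 0.0;
        for (long f : byteFreq) {
            if (f > 0) {
                double p = (double) f / totalBytes;
                entropy -= p * (Math.log(p) / Math.log(2));
            }
        }

        KeyStatsSketch s = new KeyStatsSketch();
        s.count = n;
        s.cardinality = distinct.size();
        s.minLen = min;
        s.maxLen = max;
        s.avgLen = mean;
        s.stdevLen = Math.sqrt(m2 / n);
        s.entropyBitsPerByte = entropy;
        s.lengthHistogram = hist;
        return s;
    }

    public static void main(String[] args) {
        List<byte[]> keys = Arrays.asList(
            "row-0001".getBytes(StandardCharsets.US_ASCII),
            "row-0002".getBytes(StandardCharsets.US_ASCII),
            "row-0001".getBytes(StandardCharsets.US_ASCII));
        KeyStatsSketch s = summarize(keys, 4);
        System.out.println("count=" + s.count + " cardinality=" + s.cardinality
            + " avgLen=" + s.avgLen + " entropy=" + s.entropyBitsPerByte
            + " histogram=" + s.lengthHistogram);
    }
}
```

The same pass would apply to values (length and entropy only), and the region-level figures (KV count, on-disk and uncompressed size) would come from the store files themselves rather than from a scan like this one.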



--
This message was sent by Atlassian JIRA
(v6.2#6252)
