hbase-issues mailing list archives

From "nicu marasoiu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4435) Add Group By functionality using Coprocessors
Date Fri, 07 Aug 2015 16:20:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662049#comment-14662049 ]

nicu marasoiu commented on HBASE-4435:
--------------------------------------

Hi,

Is this still ongoing? I looked on GitHub and it seems that only a single metric, such as
sum(column), is implemented, not multiple ones. The general case is of course
group by (d1,..,dn) sum(c1) hyperlogUniq(c2), i.e. multiple metrics.

Thank you,
Nicu

> Add Group By functionality using Coprocessors
> ---------------------------------------------
>
>                 Key: HBASE-4435
>                 URL: https://issues.apache.org/jira/browse/HBASE-4435
>             Project: HBase
>          Issue Type: Improvement
>          Components: Coprocessors
>            Reporter: Nichole Treadway
>            Priority: Minor
>              Labels: by, coprocessors, group, hbase
>         Attachments: HBASE-4435-v2.patch, HBase-4435.patch
>
>
> Adds Group By-like functionality to HBase, using the Coprocessor framework.
> It provides the ability to group the result set on one or more columns (groupBy families).
> It computes statistics (max, min, sum, count, sum of squares, number missing) for a second
> column, called the stats column.
> To use it, I've provided two implementations.
> 1. In the first, you specify a single group-by column and a stats field:
>       statsMap = gbc.getStats(tableName, scan, groupByFamily, groupByQualifier, statsFamily,
>                               statsQualifier, statsFieldColumnInterpreter);
> The result is a map from the Group By column value (as a String) to a GroupByStatsValues
> object. The GroupByStatsValues object has max, min, sum, etc. of the stats column for that group.
> 2. The second implementation allows you to specify a list of group-by columns and a stats
> field. The list of group-by columns is expected to contain lists of {column family, qualifier}
> pairs.
>       statsMap = gbc.getStats(tableName, scan, listOfGroupByColumns, statsFamily, statsQualifier,
>                               statsFieldColumnInterpreter);
> The GroupByStatsValues code is adapted from the Solr Stats component.
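
For illustration only, here is a minimal sketch of the aggregation the description above outlines: per-group max, min, sum, count, sum of squares and number missing, computed per region and then merged on the client. It does not use HBase and is not the code from the attached patches. The names Stats, Row, aggregateRegion and mergeRegions are hypothetical, as is the "|" separator used to build a String key from several group-by column values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupByStatsSketch {

    /** Hypothetical stand-in for GroupByStatsValues: running statistics for one group. */
    static final class Stats {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        double sumOfSquares = 0.0;
        long count = 0;
        long missing = 0;

        /** Fold one row's stats-column value into the group; null means the column is absent. */
        void accumulate(Double value) {
            if (value == null) {
                missing++;
                return;
            }
            min = Math.min(min, value);
            max = Math.max(max, value);
            sum += value;
            sumOfSquares += value * value;
            count++;
        }

        /** Combine a partial result (for example, from another region) into this one. */
        void merge(Stats other) {
            min = Math.min(min, other.min);
            max = Math.max(max, other.max);
            sum += other.sum;
            sumOfSquares += other.sumOfSquares;
            count += other.count;
            missing += other.missing;
        }
    }

    /** Hypothetical row: the group-by column values plus an optional stats-column value. */
    static final class Row {
        final List<String> groupValues;
        final Double statsValue;

        Row(Double statsValue, String... groupValues) {
            this.statsValue = statsValue;
            this.groupValues = Arrays.asList(groupValues);
        }
    }

    /** What one region would compute: a map from group key (as a String) to partial stats. */
    static Map<String, Stats> aggregateRegion(List<Row> rows) {
        Map<String, Stats> out = new TreeMap<>();
        for (Row r : rows) {
            // Assumption: multiple group-by values are joined with "|" to form the key.
            String key = String.join("|", r.groupValues);
            out.computeIfAbsent(key, k -> new Stats()).accumulate(r.statsValue);
        }
        return out;
    }

    /** What the client would do: merge the partial maps returned by each region. */
    static Map<String, Stats> mergeRegions(List<Map<String, Stats>> partials) {
        Map<String, Stats> merged = new TreeMap<>();
        for (Map<String, Stats> partial : partials) {
            for (Map.Entry<String, Stats> e : partial.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new Stats()).merge(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Row> region1 = Arrays.asList(
                new Row(10.0, "US", "web"), new Row(null, "US", "web"), new Row(4.0, "DE", "app"));
        List<Row> region2 = Arrays.asList(
                new Row(6.0, "US", "web"), new Row(1.0, "DE", "app"));

        Map<String, Stats> result = mergeRegions(
                Arrays.asList(aggregateRegion(region1), aggregateRegion(region2)));
        for (Map.Entry<String, Stats> e : result.entrySet()) {
            Stats s = e.getValue();
            System.out.println(e.getKey() + " sum=" + s.sum + " count=" + s.count
                    + " missing=" + s.missing);
        }
    }
}
```

Every field in Stats can be merged by a simple min, max or addition, which is why partial results from separate regions can be combined without rescanning any rows. Supporting several metrics per group, as asked in the comment above, would presumably mean keeping one such accumulator per stats column under each group key.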



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
