hadoop-yarn-issues mailing list archives

From "Joep Rottinghuis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader
Date Fri, 28 Aug 2015 22:05:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720667#comment-14720667 ]

Joep Rottinghuis commented on YARN-3862:

If we need to retrieve exactly known columns (and in addition we know whether each one is a
metric, a config value, etc.), then we can add these to the scan (or get) directly through
addColumn(byte [] family, byte [] qualifier).

ColumnPrefixFilter is also clear. It just restricts which columns are returned by matching
a prefix of the column qualifier (it filters the keys).
The confusion starts with org.apache.hadoop.hbase.filter.QualifierFilter. That can be used
to retrieve only some columns, specifically when combined with a WhileMatchFilter.

In addition we have the consideration whether we want to push these limits down to HBase (which
is preferable) or whether we want to just pull back everything from HBase and restrict what
we serialize in the result.
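
To make the trade-off concrete, here is a minimal sketch of the two options. It uses an
in-memory map as a stand-in for one stored row and no HBase classes at all; the class name,
method names, and sample qualifiers are hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for one stored row: column qualifier -> value.
// Every name here is hypothetical; no HBase classes are involved.
final class PushdownSketch {
    // Number of cells "shipped" from the store by the most recent read.
    static int cellsTransferred = 0;

    static final TreeMap<String, String> ROW = new TreeMap<>();
    static {
        ROW.put("mapreduce.job.name", "wordcount");
        ROW.put("mapreduce.map.memory.mb", "1024");
        ROW.put("yarn.app.queue", "default");
        ROW.put("cpu", "42");
    }

    // Option 1: the store applies the prefix, so only matching cells are shipped.
    static Map<String, String> readWithPushdown(String prefix) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : ROW.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        cellsTransferred = out.size();
        return out;
    }

    // Option 2: every cell is shipped, then trimmed before serialization.
    static Map<String, String> readAllThenRestrict(String prefix) {
        Map<String, String> all = new TreeMap<>(ROW);
        cellsTransferred = all.size();
        all.keySet().removeIf(k -> !k.startsWith(prefix));
        return all;
    }
}
```

Both methods return the same two entries for the prefix "mapreduce.", but the first ships
2 cells and the second ships all 4, which is the cost that pushing the limit down avoids.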

I think it would be cleaner to have a direct, separate API (a method argument) for
specifying which columns to retrieve. Whether we then add specific columns to the scan or
prefix patterns to a filter is up to the implementation.
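
As a rough sketch of what such an argument could look like, the class below collects exact
qualifiers and prefixes and leaves the scan-versus-filter decision to the implementation.
ColumnSelection, its methods, and the step strings are hypothetical names invented for this
illustration; they are not taken from any patch on this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical argument type for a reader method; not part of any patch.
final class ColumnSelection {
    private final List<String> exact = new ArrayList<>();
    private final List<String> prefixes = new ArrayList<>();

    // Request one exactly known column qualifier.
    ColumnSelection column(String qualifier) {
        exact.add(qualifier);
        return this;
    }

    // Request every column whose qualifier starts with the given prefix.
    ColumnSelection prefix(String p) {
        prefixes.add(p);
        return this;
    }

    // One possible implementation choice, described as plain strings:
    // exact-only requests become explicit scan columns; as soon as a prefix
    // is present, everything is expressed as filters, because a scan limited
    // to explicit columns would not return the prefix matches as well.
    List<String> plan() {
        List<String> steps = new ArrayList<>();
        if (exact.isEmpty() && prefixes.isEmpty()) {
            steps.add("readAll");
        } else if (prefixes.isEmpty()) {
            for (String q : exact) {
                steps.add("addColumn:" + q);
            }
        } else {
            for (String q : exact) {
                steps.add("filter:qualifier=" + q);
            }
            for (String p : prefixes) {
                steps.add("filter:prefix=" + p);
            }
        }
        return steps;
    }
}
```

With this sketch, new ColumnSelection().column("cpu").plan() yields [addColumn:cpu], while
adding .prefix("mapreduce.") switches the whole request to filters. Callers only state what
they want; the storage layer stays free to change how it is fetched.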

> Decide which contents to retrieve and send back in response in TimelineReader
> -----------------------------------------------------------------------------
>                 Key: YARN-3862
>                 URL: https://issues.apache.org/jira/browse/YARN-3862
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>         Attachments: YARN-3862-YARN-2928.wip.01.patch
> Currently, we will retrieve all the contents of the field if that field is specified
in the query API. In case of configs and metrics, this can become a lot of data even though
the user doesn't need it. So we need to provide a way to query only a set of configs or metrics.
> As a comma separated list of configs/metrics to be returned will be quite cumbersome
to specify, we have to support either of the following options:
> # Prefix match
> # Regex
> # Group the configs/metrics and query that group.
> We also need a facility to specify a metric time window to return metrics in that window.
This may be useful in plotting graphs.

