hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3863) Support complex filters in TimelineReader
Date Wed, 24 Feb 2016 12:42:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15162920#comment-15162920 ]

Varun Saxena commented on YARN-3863:

Thanks [~sjlee0] for the comments.

bq. If I'm reading this right, the key changes seem to be in TimelineStorageUtils
The changes in TimelineStorageUtils will primarily be used by the FS implementation, because
in the FS implementation filters are applied locally.
The major change from an HBase implementation perspective is in the xxxEntityReader classes,
where we create an HBase filter list based on the filters.
However, for relation filters and event filters we cannot create an HBase filter to filter
out rows, because of the way relations and events are stored. So the logic for relations and
events is to fetch only the columns required by the filters if those fields are not to be
retrieved.
I am basically trying to trim down the data brought over from the backend.
For relations and events, filters are then applied locally (even for the HBase storage implementation).
For other filters, in the HBase implementation, we no longer apply filters locally; it is all
handled through HBase filters.
Sorry for not adding detailed comments in TimelineStorageUtils. I agree the code there can
be refactored to make it more readable.
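
To illustrate the relation/event handling described above, here is a minimal sketch. It
is not the code from the patch: the class name, the ALL_COLUMNS marker and both method
names are hypothetical, and events are reduced to plain string ids. It only shows the idea
of fetching just the columns the filters need when the field itself was not requested, and
then applying the filter locally.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

final class ColumnTrimSketch {
  /** Hypothetical marker meaning "fetch every column of this field". */
  static final String ALL_COLUMNS = "*";

  /**
   * Decides which columns of a field (for example events) to read from the backend.
   * If the caller asked for the field, everything is fetched; otherwise only the
   * columns that the filters need in order to be evaluated locally.
   */
  static Set<String> columnsToFetch(boolean fieldRequested,
      Set<String> columnsNeededByFilters) {
    if (fieldRequested) {
      return Collections.singleton(ALL_COLUMNS);
    }
    return new TreeSet<>(columnsNeededByFilters);
  }

  /**
   * Local check applied after the fetch (assumed AND semantics):
   * the entity must contain every required event id.
   */
  static boolean matchesEventFilters(Set<String> fetchedEventIds,
      Set<String> requiredEventIds) {
    return fetchedEventIds.containsAll(requiredEventIds);
  }
}
```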

bq. Also, these methods seem to have similar code. Any possibility of refactoring the common
Yes, the code is similar. We loop over a filter list and check the operator while
processing each individual filter.
I thought about it, but the issue with moving it into a common area is that the data structures
which hold events, configs, metrics, etc. are not the same.
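
The shared loop mentioned above looks roughly like the following sketch. It is written
against java.util.function.Predicate rather than the actual timeline filter classes, and
the names FilterListSketch, Operator, matches and keyEquals are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class FilterListSketch {
  /** Hypothetical operator attached to a filter list. */
  enum Operator { AND, OR }

  /** Loops over the filters and short-circuits based on the list operator. */
  static <T> boolean matches(Operator op, List<Predicate<T>> filters, T data) {
    for (Predicate<T> filter : filters) {
      boolean passed = filter.test(data);
      if (op == Operator.AND && !passed) {
        return false; // one failure is enough to reject under AND
      }
      if (op == Operator.OR && passed) {
        return true; // one success is enough to accept under OR
      }
    }
    // AND: nothing failed. OR: nothing passed. An empty list matches only under AND.
    return op == Operator.AND;
  }

  /** Simple equality filter on a key/value map such as configs. */
  static Predicate<Map<String, String>> keyEquals(String key, String value) {
    return data -> value.equals(data.get(key));
  }

  public static void main(String[] args) {
    Map<String, String> configs = new HashMap<>();
    configs.put("a", "1");
    List<Predicate<Map<String, String>>> filters =
        Arrays.asList(keyEquals("a", "1"), keyEquals("b", "2"));
    System.out.println(matches(Operator.AND, filters, configs));
    System.out.println(matches(Operator.OR, filters, configs));
  }
}
```

The loop itself is generic; what differs per filter kind is the type T of the data being
tested, which is the obstacle described above.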

One thing we can do, however, is pass the TimelineEntity object itself into a common
function (for all filters) along with something, say an enum, indicating what kind of filter
we intend to match (named something like TimelineEntityFiltersType). Then, based
on this enum value, we get the appropriate item (configs, metrics, etc.) from the passed entity.
This way we can move the common logic into a single method, which can in turn call the appropriate
method based on filter type (say equality filter, multivalue equality filter, etc.).
Does this sound fine?
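
A rough sketch of that proposal, to make the discussion concrete. Only the enum name
TimelineEntityFiltersType comes from the suggestion above; the Entity class is a simplified,
hypothetical stand-in for TimelineEntity in which configs, info and metrics are all
flattened to plain maps, which the real data structures are not.

```java
import java.util.HashMap;
import java.util.Map;

final class EntityFilterDispatchSketch {
  /** Indicates which part of the entity a filter should be matched against. */
  enum TimelineEntityFiltersType { CONFIG, INFO, METRIC }

  /** Simplified, hypothetical stand-in for TimelineEntity. */
  static final class Entity {
    final Map<String, Object> configs = new HashMap<>();
    final Map<String, Object> info = new HashMap<>();
    final Map<String, Object> metrics = new HashMap<>();
  }

  /** Picks the item to filter on, based on the enum value. */
  static Map<String, Object> select(Entity entity, TimelineEntityFiltersType type) {
    switch (type) {
      case CONFIG:
        return entity.configs;
      case INFO:
        return entity.info;
      case METRIC:
        return entity.metrics;
      default:
        throw new IllegalArgumentException("Unsupported filter type: " + type);
    }
  }

  /** Common equality check shared by all filter types. */
  static boolean matchesEquality(Entity entity, TimelineEntityFiltersType type,
      String key, Object expected) {
    return expected.equals(select(entity, type).get(key));
  }
}
```

With this shape, the per-type code shrinks to the select() step, and the operator loop
and the per-filter matching logic can live in one place.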

> Support complex filters in TimelineReader
> -----------------------------------------
>                 Key: YARN-3863
>                 URL: https://issues.apache.org/jira/browse/YARN-3863
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-3863-YARN-2928.v2.01.patch, YARN-3863-YARN-2928.v2.02.patch,
YARN-3863-feature-YARN-2928.wip.003.patch, YARN-3863-feature-YARN-2928.wip.01.patch, YARN-3863-feature-YARN-2928.wip.02.patch,
YARN-3863-feature-YARN-2928.wip.04.patch, YARN-3863-feature-YARN-2928.wip.05.patch
> Currently, filters in the timeline reader return an entity only if all the filter conditions
hold true, i.e. only the AND operation is supported. We can support the OR operation for filters
as well. Additionally, as the primary backend implementation is HBase, we can design our filters
so that they closely resemble HBase Filters.

This message was sent by Atlassian JIRA