hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3863) Support complex filters in TimelineReader
Date Thu, 03 Mar 2016 09:00:26 GMT

    [ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177509#comment-15177509 ]

Varun Saxena commented on YARN-3863:
------------------------------------

Thanks [~sjlee0] for the review.

bq. One high level question: am I correct in understanding that if a relations filter is specified
for example but relation was not specified as part of fields to retrieve, we would try to
fetch the relation?
Yes, we would try to fetch only those relations which are required to match the relation filters.
Same goes for event filters. We will try to fetch only those events which are required to
match event filters if fields to retrieve does not specify EVENTS.

bq. What if we simply reject or ignore the filters if they do not match the fields to retrieve?
Would it make the implementation simpler or harder?
It will remove the need for some of the code in GenericEntityReader and ApplicationEntity,
primarily the code in methods {{fetchPartialColsFromInfoFamily}} and {{createFilterListForColsOfInfoFamily}}.

bq. To me, supporting more contents even if the filters and the fields to retrieve are not
consistent seems very much optional, and I'm not sure if it is worth it especially if it adds
a lot more complexity. What do you think?
Personally I think fields to retrieve and filters should be treated separately. Filters decide
which entities to carry back in response and fields/configs/metrics to retrieve decide what
should be carried in each entity.
Treating filters and fields to retrieve separately is consistent with code written previously in the
branch, but as this is new code we can change the behavior too. I am not very sure that we
should do so, though.
For instance, if I want to get the IDs of all the FINISHED apps, I can make a query with eventfilters
as APPLICATION_FINISHED and not specify anything in fields to retrieve, as I am only interested
in the application ID. If I link it to fields to retrieve, I will have to unnecessarily fetch
other events as well, which I have no interest in. This also increases the number of bytes transferred
across the wire. Moreover, info also has associated info as well.
Maybe along the lines of confs/metrics to retrieve we can have something like events to retrieve
as well, but in all these cases one query param depends on another, which doesn't sound right
to me.
Thoughts?
We can discuss further on this in today's meeting.
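
To make the distinction above concrete, here is a minimal in-memory sketch of filters and fields to retrieve being handled independently. It is not the reader code from the patch: the class {{FilterVsFieldsSketch}}, its {{Entity}} and {{Field}} types and the {{query}} method are all hypothetical names, and the event IDs are plain strings taken from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model for illustration; these names are not from the YARN code base.
final class FilterVsFieldsSketch {
  enum Field { EVENTS }

  static final class Entity {
    final String id;
    final Set<String> events;

    Entity(String id, Set<String> events) {
      this.id = id;
      this.events = events;
    }
  }

  // eventFilter decides WHICH entities are returned.
  // fieldsToRetrieve decides WHAT each returned entity carries.
  static List<Entity> query(List<Entity> stored, String eventFilter,
      Set<Field> fieldsToRetrieve) {
    List<Entity> result = new ArrayList<>();
    for (Entity e : stored) {
      if (!e.events.contains(eventFilter)) {
        continue; // filter not matched: entity is left out of the response
      }
      // Events are read to evaluate the filter, but returned only
      // when EVENTS was explicitly requested.
      Set<String> events = fieldsToRetrieve.contains(Field.EVENTS)
          ? e.events : Collections.<String>emptySet();
      result.add(new Entity(e.id, events));
    }
    return result;
  }

  public static void main(String[] args) {
    List<Entity> stored = Arrays.asList(
        new Entity("app_1", new HashSet<>(
            Arrays.asList("APPLICATION_CREATED", "APPLICATION_FINISHED"))),
        new Entity("app_2", new HashSet<>(
            Arrays.asList("APPLICATION_CREATED"))));
    // Only IDs are wanted, so no fields are requested.
    for (Entity e : query(stored, "APPLICATION_FINISHED", EnumSet.noneOf(Field.class))) {
      System.out.println(e.id + " events=" + e.events);
    }
  }
}
```

In this sketch the filter still selects the finished app when no fields are requested, and the response carries only the ID. Tying the filter to fields to retrieve would force the caller to request EVENTS and receive every event of each matching app.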

bq. I know Vrushali C had some thoughts on how to split this monolithic TestHBaseTimelineStorage.
It might be good to come to a consensus on how to split it...
Ok. I had split it across apps and entities. We can seek her opinion too on this in today's
meeting.

I will check other comments when I start coding for next version of patch. Most sound like
they would be valid and fixable.

> Support complex filters in TimelineReader
> -----------------------------------------
>
>                 Key: YARN-3863
>                 URL: https://issues.apache.org/jira/browse/YARN-3863
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-3863-YARN-2928.v2.01.patch, YARN-3863-YARN-2928.v2.02.patch,
>                      YARN-3863-YARN-2928.v2.03.patch, YARN-3863-feature-YARN-2928.wip.003.patch,
>                      YARN-3863-feature-YARN-2928.wip.01.patch, YARN-3863-feature-YARN-2928.wip.02.patch,
>                      YARN-3863-feature-YARN-2928.wip.04.patch, YARN-3863-feature-YARN-2928.wip.05.patch
>
>
> Currently filters in timeline reader will return an entity only if all the filter conditions
> hold true i.e. only AND operation is supported. We can support OR operation for the filters
> as well. Additionally as primary backend implementation is HBase, we can design our filters
> in a manner, where they closely resemble HBase Filters.
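
As a rough illustration of the AND/OR combination described in the issue, the sketch below composes filters into a list with an operator, loosely mirroring how HBase's {{FilterList}} takes {{MUST_PASS_ALL}} or {{MUST_PASS_ONE}}. It does not use the HBase API or the classes from the patch; {{FilterListSketch}}, {{Operator}}, {{filterList}} and {{infoEquals}} are hypothetical names, and an entity is reduced to a map of info key/value pairs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; not the TimelineReader filter classes and not the HBase API.
final class FilterListSketch {
  enum Operator { AND, OR }

  // Combines filters so that AND needs every filter to pass
  // and OR needs at least one filter to pass.
  static <T> Predicate<T> filterList(final Operator op, final List<Predicate<T>> filters) {
    return item -> {
      for (Predicate<T> f : filters) {
        boolean pass = f.test(item);
        if (op == Operator.AND && !pass) {
          return false; // one failure rejects the entity
        }
        if (op == Operator.OR && pass) {
          return true; // one success accepts the entity
        }
      }
      // AND: nothing failed. OR: nothing passed.
      return op == Operator.AND;
    };
  }

  // Simple equality filter on an entity's info map.
  static Predicate<Map<String, String>> infoEquals(final String key, final String value) {
    return info -> value.equals(info.get(key));
  }

  public static void main(String[] args) {
    Predicate<Map<String, String>> finished = infoEquals("state", "FINISHED");
    Predicate<Map<String, String>> alice = infoEquals("user", "alice");

    Map<String, String> app = new HashMap<>();
    app.put("state", "RUNNING");
    app.put("user", "alice");

    Predicate<Map<String, String>> both =
        filterList(Operator.AND, Arrays.asList(finished, alice));
    Predicate<Map<String, String>> either =
        filterList(Operator.OR, Arrays.asList(finished, alice));

    System.out.println("AND matches: " + both.test(app));
    System.out.println("OR matches: " + either.test(app));
  }
}
```

Because {{filterList}} returns a filter itself, lists can be nested, for example an OR list whose members are AND lists.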



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
