hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3551) Consolidate data model change according to the backend implementation
Date Wed, 29 Apr 2015 03:26:06 GMT

    [ https://issues.apache.org/jira/browse/YARN-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518654#comment-14518654 ]

Zhijie Shen commented on YARN-3551:

bq. Also, think along with singleData, we also need a timestamp member variable, else we would
not know which timestamp this singleData value belongs to.

If we want to know the timestamp of the value, what's the difference between single data and a time
series? Or can I rephrase your proposal in another way: there exist two modes of a metric:

1. We remember all history values of one metric.
2. We only remember the last value of one metric.

So at the client, for the first one we use the data model API to buffer multiple data points
of the time series, while for the second one we keep updating the single point with the latest
value. Once it is handed over to the server, for the first one the server will persist
all the points in HBase by using versions, while for the second one the server will only track
the latest version of the point.
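
To make the two modes concrete, here is a minimal client-side sketch. It is not the TimelineMetric class from the patch: the class name MetricSketch, the Mode enum, and the addValue/getValues methods are all hypothetical names chosen for illustration. It assumes both modes keep a timestamp with each value, and that the single-value mode simply discards everything but the newest point.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a metric with two modes; not the real data model API. */
final class MetricSketch {
    enum Mode { TIME_SERIES, SINGLE_VALUE }

    private final String id;
    private final Mode mode;
    // timestamp -> value, kept sorted by timestamp
    private final TreeMap<Long, Number> values = new TreeMap<>();

    MetricSketch(String id, Mode mode) {
        this.id = id;
        this.mode = mode;
    }

    String getId() {
        return id;
    }

    void addValue(long timestamp, Number value) {
        if (mode == Mode.SINGLE_VALUE) {
            // Ignore a point that is older than the one already held.
            if (!values.isEmpty() && timestamp < values.lastKey()) {
                return;
            }
            // Otherwise the new point replaces whatever was there.
            values.clear();
        }
        // TIME_SERIES mode falls through and buffers every point.
        values.put(timestamp, value);
    }

    Map<Long, Number> getValues() {
        return Collections.unmodifiableMap(values);
    }

    public static void main(String[] args) {
        MetricSketch cpu = new MetricSketch("cpu", Mode.TIME_SERIES);
        cpu.addValue(1L, 10);
        cpu.addValue(2L, 20);

        MetricSketch mem = new MetricSketch("mem", Mode.SINGLE_VALUE);
        mem.addValue(1L, 100);
        mem.addValue(2L, 200);

        System.out.println(cpu.getValues()); // prints {1=10, 2=20}
        System.out.println(mem.getValues()); // prints {2=200}
    }
}
```

With this shape, a single value is just a time series of length one, so the timestamp question from the quoted comment is answered by the same structure in both modes.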

bq. How about making the timeline metric generic?

I'm +0 on using a generic type:

\+ It's good to restrict one metric to contain just one type of value.
\- The caller will only know the object is TimelineMetric<?> but won't know what the
"?" is before checking the object itself. It's no easier than inspecting the data directly.
\- It may not be completely consistent with the way Jackson marshals and unmarshals the JSON object.
For example, if we send a TimelineMetric<Long>(1L, 2L, 3L, 4L), we will actually get
TimelineMetric<Integer>(1, 2, 3, 4) according to Jackson's rules of ser/des.
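
The second point can be illustrated with plain Java. This sketch uses a hypothetical TypedMetric<T> class (not the class proposed in the patch) and a hypothetical helper, describeValueType. Because of type erasure, code that receives a TypedMetric<?> has no way to ask for the type argument and has to look at a stored value instead. The Jackson round trip itself is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; TypedMetric stands in for a generic TimelineMetric<T>. */
final class GenericMetricSketch {

    static final class TypedMetric<T extends Number> {
        private final List<T> values;

        @SafeVarargs
        TypedMetric(T... values) {
            this.values = new ArrayList<>(Arrays.asList(values));
        }

        List<T> getValues() {
            return values;
        }
    }

    /**
     * The receiver only sees TypedMetric<?>. The type argument is erased at
     * runtime, so the only way to learn it is to inspect a contained value.
     */
    static String describeValueType(TypedMetric<?> metric) {
        if (metric.getValues().isEmpty()) {
            return "unknown"; // nothing to inspect, so the type cannot be recovered
        }
        Object first = metric.getValues().get(0);
        return first.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        TypedMetric<?> longs = new TypedMetric<>(1L, 2L, 3L, 4L);
        TypedMetric<?> ints = new TypedMetric<>(1, 2, 3, 4);

        System.out.println(describeValueType(longs)); // prints Long
        System.out.println(describeValueType(ints));  // prints Integer
    }
}
```

Both variables have the same static type, and an empty metric reveals nothing, which is why the generic parameter buys the caller little over inspecting the data directly.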

bq. Hmm, yes, we do need a datatype per metric to be stored, this will have to be considered
in the writer impls.

Please take a look at my comment on [YARN-3411|https://issues.apache.org/jira/browse/YARN-3411?focusedCommentId=14517838&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14517838].
Once we receive the data at the server side, we know what kind of number the metric is. To
ser/des with the backend, we can make use of GenericObjectMapper. 

> Consolidate data model change according to the backend implementation
> ---------------------------------------------------------------------
>                 Key: YARN-3551
>                 URL: https://issues.apache.org/jira/browse/YARN-3551
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: YARN-3551.1.patch, YARN-3551.2.patch, YARN-3551.3.patch
> Based on the comments on [YARN-3134|https://issues.apache.org/jira/browse/YARN-3134?focusedCommentId=14512080&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14512080]
and [YARN-3411|https://issues.apache.org/jira/browse/YARN-3411?focusedCommentId=14512098&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14512098],
we need to change the data model to restrict the data type of info/config/metric section.
> 1. Info: the value can be any kind of object that can be serialized/deserialized
by Jackson.
> 2. Config: the value will always be assumed to be a String.
> 3. Metric: single data or time series values have to be numbers for aggregation.
> Other than that, the info/start time/finish time of a metric seem unnecessary for storage.
They should be removed.

This message was sent by Atlassian JIRA
