hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3949) ensure timely flush of timeline writes
Date Fri, 24 Jul 2015 17:48:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640799#comment-14640799 ]

Junping Du commented on YARN-3949:

Thanks for your input, [~jrottinghuis]! 
I agree that the current API (write() + flush()) is simple and flexible to use. However, I was
thinking that an easier-to-use API could offer something like write_through() or write_back()
(or write_sync() / write_async()). The writer would then provide different semantics to the
caller, and the caller would not have to know the details of when and how to flush(). So this
is a classic trade-off between flexibility and simplicity. To be clear, I am not against
the current design; I just want to propose another option in case we have other callers (which
may not inherit from TimelineCollectorManager) in the future. I am fine with delaying the
refactoring work until then.
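To make the proposal concrete, here is a minimal sketch of what a sync/async facade over a write() + flush() style writer could look like. All names here (FlushableWriter, InMemoryWriter, ModalWriter, writeSync, writeAsync) are hypothetical and are not part of the patch or of the existing timeline writer API.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteModeSketch {
    /** Hypothetical stand-in for a writer that exposes write() + flush(). */
    interface FlushableWriter {
        void write(String entity);
        void flush();
    }

    /** In-memory writer: write() only buffers, flush() moves entries to "storage". */
    static class InMemoryWriter implements FlushableWriter {
        final List<String> buffer = new ArrayList<>();
        final List<String> flushed = new ArrayList<>();

        @Override
        public void write(String entity) {
            buffer.add(entity);
        }

        @Override
        public void flush() {
            flushed.addAll(buffer);
            buffer.clear();
        }
    }

    /** Hypothetical facade: the caller picks the semantics and never calls flush(). */
    static class ModalWriter {
        private final FlushableWriter delegate;

        ModalWriter(FlushableWriter delegate) {
            this.delegate = delegate;
        }

        /** Buffered write; it becomes durable whenever the delegate is next flushed. */
        void writeAsync(String entity) {
            delegate.write(entity);
        }

        /** Write followed by an immediate flush, e.g. for key lifecycle events. */
        void writeSync(String entity) {
            delegate.write(entity);
            delegate.flush();
        }
    }

    public static void main(String[] args) {
        InMemoryWriter backend = new InMemoryWriter();
        ModalWriter writer = new ModalWriter(backend);

        writer.writeAsync("metric-1");
        System.out.println("flushed after async write: " + backend.flushed.size());

        writer.writeSync("app-finished");
        System.out.println("flushed after sync write: " + backend.flushed.size());
    }
}
```

Note that in this sketch writeSync() also pushes out anything buffered earlier, because it simply flushes the shared buffer.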
The latest patch mostly looks good, except that we should document the new configuration "YarnConfiguration.TIMELINE_SERVICE_WRITER_FLUSH_INTERVAL_SECONDS"
in yarn-default.xml with a description. Also, the default value for this new configuration
should be defined in YarnConfiguration to conform with the current code conventions.
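As an illustration of that convention, the sketch below pairs the key constant with a DEFAULT_ constant and reads the value with a fallback. The key string and the default of 60 seconds are assumptions made for the example, and java.util.Properties stands in for Hadoop's Configuration so the snippet runs on its own.

```java
import java.util.Properties;

public class FlushIntervalConfigSketch {
    // Assumed key string, for illustration only; the patch defines the real one.
    static final String TIMELINE_SERVICE_WRITER_FLUSH_INTERVAL_SECONDS =
        "yarn.timeline-service.writer.flush-interval-seconds";

    // Assumed default value, kept next to the key as the convention suggests.
    static final int DEFAULT_TIMELINE_SERVICE_WRITER_FLUSH_INTERVAL_SECONDS = 60;

    /** Returns the configured flush interval, or the default when the key is unset. */
    static int getFlushIntervalSeconds(Properties conf) {
        String raw = conf.getProperty(TIMELINE_SERVICE_WRITER_FLUSH_INTERVAL_SECONDS);
        if (raw == null) {
            return DEFAULT_TIMELINE_SERVICE_WRITER_FLUSH_INTERVAL_SECONDS;
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("unset: " + getFlushIntervalSeconds(conf));

        conf.setProperty(TIMELINE_SERVICE_WRITER_FLUSH_INTERVAL_SECONDS, "5");
        System.out.println("overridden: " + getFlushIntervalSeconds(conf));
    }
}
```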

> ensure timely flush of timeline writes
> --------------------------------------
>                 Key: YARN-3949
>                 URL: https://issues.apache.org/jira/browse/YARN-3949
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-3949-YARN-2928.001.patch, YARN-3949-YARN-2928.002.patch, YARN-3949-YARN-2928.002.patch
> Currently flushing of timeline writes is not really handled. For example, {{HBaseTimelineWriterImpl}}
> relies on HBase's {{BufferedMutator}} to batch and write puts asynchronously. However, {{BufferedMutator}}
> may not flush them to HBase unless the internal buffer fills up.
> We do need a flush functionality first to ensure that data are written in a reasonably
> timely manner, and to be able to ensure some critical writes are done synchronously (e.g.
> key lifecycle events).
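
The two needs described above (timely background flushing, plus synchronous writes for critical events) can be sketched roughly as follows. This is not the patch itself: BufferingWriter and startFlusher are hypothetical names, and a plain in-memory buffer stands in for HBase's BufferedMutator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicFlushSketch {
    /** Hypothetical writer that holds entries in memory until flush() is called. */
    static class BufferingWriter {
        private final List<String> buffer = new ArrayList<>();
        private final List<String> stored = new ArrayList<>();

        synchronized void write(String entity) {
            buffer.add(entity);
        }

        synchronized void flush() {
            stored.addAll(buffer);
            buffer.clear();
        }

        synchronized int storedCount() {
            return stored.size();
        }
    }

    /** Starts a background task that flushes the writer at a fixed interval. */
    static ScheduledExecutorService startFlusher(BufferingWriter writer, long intervalMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(
            writer::flush, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return executor;
    }

    public static void main(String[] args) throws InterruptedException {
        BufferingWriter writer = new BufferingWriter();
        ScheduledExecutorService flusher = startFlusher(writer, 50);

        // Ordinary write: it is picked up by the next periodic flush.
        writer.write("cpu-metric");
        Thread.sleep(300);
        System.out.println("stored after waiting: " + writer.storedCount());

        // Critical write: flush explicitly instead of waiting for the timer.
        writer.write("app-finished");
        writer.flush();
        System.out.println("stored after explicit flush: " + writer.storedCount());

        flusher.shutdownNow();
    }
}
```

The periodic task bounds how long an entry can sit in the buffer, while the explicit flush() covers writes that must not wait for the timer.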

This message was sent by Atlassian JIRA
