hadoop-yarn-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
Date Wed, 22 Mar 2017 21:01:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937110#comment-15937110
] 

Haibo Chen commented on YARN-6357:
----------------------------------

Thanks for the pointer [~varun_impala_149e].  Reading through the discussion there, it now
makes sense to me why a TimelineWriter.writesync() was never added (https://issues.apache.org/jira/browse/YARN-3949?focusedCommentId=14640959&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14640959).
However, I think there is still some confusion in the current TimelineWriter API. Given the
way TimelineCollector uses TimelineWriter, i.e. calling flush() right after write() for synchronous
putEntities requests, TimelineCollector expects TimelineWriter.write() to be asynchronous.
At the same time, it also expects a TimelineResponse from this asynchronous method, which I
think is awkward and can cause confusion for alternative TimelineWriter implementations.

I propose we replace TimelineResponse with void as the return type, effectively making the
TimelineWriter API very similar to that of BufferedMutator. This way, truly asynchronous
implementations of TimelineWriter.write(), such as HBaseTimelineWriter, no longer have to create
a bogus TimelineResponse. For synchronous implementations, the outcome can be reported through
IOException: if the underlying synchronous store succeeds, no exception is thrown; otherwise,
the failure is wrapped in an IOException and passed back to TimelineCollector.
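
To make the proposal concrete, here is a minimal sketch of what a void-returning writer could look like. All names below (SketchTimelineWriter, SyncSketchWriter, BufferingSketchWriter) are hypothetical stand-ins invented for illustration, entities are reduced to plain Strings, and the in-memory lists only imitate a backing store; this is not the actual TimelineWriter signature.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the proposed writer API.
// Assumption: entities are plain Strings here, not real timeline entities.
interface SketchTimelineWriter {
  // Returns void: success is silent, failure surfaces as an IOException.
  void write(String entity) throws IOException;

  // Pushes any buffered writes to the backing store.
  void flush() throws IOException;
}

// Synchronous store: write() persists immediately, so flush() has nothing to do.
class SyncSketchWriter implements SketchTimelineWriter {
  final List<String> store = new ArrayList<>();
  boolean failNextWrite = false; // test hook to simulate a store failure

  @Override
  public void write(String entity) throws IOException {
    if (failNextWrite) {
      // The failure is reported through the exception, not a response object.
      throw new IOException("store rejected entity: " + entity);
    }
    store.add(entity);
  }

  @Override
  public void flush() {
    // Nothing is buffered in the synchronous case.
  }
}

// Asynchronous (buffering) store: write() only queues the entity,
// and flush() moves everything queued so far into the store.
class BufferingSketchWriter implements SketchTimelineWriter {
  final List<String> buffer = new ArrayList<>();
  final List<String> store = new ArrayList<>();

  @Override
  public void write(String entity) {
    buffer.add(entity);
  }

  @Override
  public void flush() {
    store.addAll(buffer);
    buffer.clear();
  }
}
```

With this shape, the buffering writer never has to fabricate a response for work it has not done yet, and a caller of the synchronous writer learns about a failure only through the thrown IOException.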

> Implement TimelineCollector#putEntitiesAsync
> --------------------------------------------
>
>                 Key: YARN-6357
>                 URL: https://issues.apache.org/jira/browse/YARN-6357
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: ATSv2, timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Joep Rottinghuis
>            Assignee: Haibo Chen
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6357.01.patch, YARN-6357.02.patch
>
>
> As discovered and discussed in YARN-5269 the TimelineCollector#putEntitiesAsync method
> is currently not implemented and TimelineCollector#putEntities is asynchronous.
> TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync correctly call
> TimelineEntityDispatcher#dispatchEntities(boolean sync,... with the correct argument. This
> argument does seem to make it into the params, and on the server side TimelineCollectorWebService#putEntities
> correctly pulls the async parameter from the rest call. See line 156:
> {code}
>     boolean isAsync = async != null && async.trim().equalsIgnoreCase("true");
> {code}
> However, this is where the problem starts. It simply calls TimelineCollector#putEntities
> and ignores the value of isAsync. It should instead have called TimelineCollector#putEntitiesAsync,
> which is currently not implemented.
> putEntities should call putEntitiesAsync and then after that call writer.flush()
> The fact that we flush on close and we flush periodically should be more of a concern
> of avoiding data loss; close in case sync is never called and the periodic flush to guard
> against having data from slow writers get buffered for a long time and expose us to risk of
> loss in case the collector crashes with data in its buffers. Size-based flush is a different
> concern to avoid blowing up memory footprint.
> The spooling behavior is also somewhat separate.
> We have two separate methods on our API putEntities and putEntitiesAsync and they should
> have different behavior beyond waiting for the request to be sent.
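
The behavior requested in the quoted description (putEntities calling putEntitiesAsync and then writer.flush(), with the web service honoring the async parameter) can be sketched as follows. This is only an illustration under stated assumptions: SketchBufferedWriter, SketchCollector and SketchWebService are hypothetical classes, entities are plain Strings, and only the isAsync expression is taken from the description.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical buffering writer: write() queues, flush() persists.
class SketchBufferedWriter {
  final List<String> buffer = new ArrayList<>();
  final List<String> store = new ArrayList<>();

  void write(String entity) throws IOException {
    buffer.add(entity);
  }

  void flush() throws IOException {
    store.addAll(buffer);
    buffer.clear();
  }
}

// Hypothetical collector with distinct sync and async paths.
class SketchCollector {
  private final SketchBufferedWriter writer;

  SketchCollector(SketchBufferedWriter writer) {
    this.writer = writer;
  }

  // Async path: hand the entities to the writer and return without flushing.
  void putEntitiesAsync(List<String> entities) throws IOException {
    for (String entity : entities) {
      writer.write(entity);
    }
  }

  // Sync path: same write, followed by an explicit flush.
  void putEntities(List<String> entities) throws IOException {
    putEntitiesAsync(entities);
    writer.flush();
  }
}

// Hypothetical server-side dispatch that actually uses the async flag.
class SketchWebService {
  static void handlePut(SketchCollector collector, String async,
      List<String> entities) throws IOException {
    // Same expression as quoted in the issue description.
    boolean isAsync = async != null && async.trim().equalsIgnoreCase("true");
    if (isAsync) {
      collector.putEntitiesAsync(entities);
    } else {
      collector.putEntities(entities);
    }
  }
}
```

Periodic, size-based and on-close flushing are deliberately left out of this sketch, since the description treats them as separate concerns from the sync/async distinction.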



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
