hadoop-yarn-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-6382) Address race condition on TimelineWriter.flush() caused by buffer-sized flush
Date Mon, 03 Apr 2017 23:55:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954358#comment-15954358
] 

Haibo Chen edited comment on YARN-6382 at 4/3/17 11:55 PM:
-----------------------------------------------------------

Thanks for the nice summary [~jrottinghuis]! 
bq. This write causes the buffer to be full, or perhaps thread B calls flush, or a timer calls
flush.
The latter two cases have been fixed by YARN-6357, so we only need to concern ourselves with
the case where the buffer becomes full.
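The remaining case can be sketched with a toy stand-in (not the real HBase BufferedMutator API; class and method names here are illustrative): when one thread's write happens to fill the buffer, that thread's mutate() call runs the implicit flush and absorbs any failure, even though the dropped data may belong to other callers.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a buffered writer with a size-based flush. */
class ToyBufferedMutator {
    private final int capacity;
    private final List<String> buffer = new ArrayList<>();
    private final List<String> persisted = new ArrayList<>();
    private boolean failNextFlush = false;

    ToyBufferedMutator(int capacity) { this.capacity = capacity; }

    void setFailNextFlush(boolean b) { failNextFlush = b; }

    /** Buffers the record; may trigger an implicit size-based flush. */
    void mutate(String record) {
        buffer.add(record);
        if (buffer.size() >= capacity) {
            flush();   // size-based flush: this caller absorbs any flush error
        }
    }

    void flush() {
        if (failNextFlush) {
            failNextFlush = false;
            buffer.clear();          // buffered records are dropped on failure
            throw new RuntimeException("connection to HBase lost");
        }
        persisted.addAll(buffer);
        buffer.clear();
    }

    List<String> persisted() { return persisted; }
}

public class SizeBasedFlushRace {
    public static void main(String[] args) {
        ToyBufferedMutator m = new ToyBufferedMutator(2);
        m.mutate("entity-from-caller-A");   // buffered, no flush yet
        m.setFailNextFlush(true);           // simulate a transient outage
        try {
            // Caller B's write fills the buffer, so B's mutate() triggers
            // the flush and sees the failure for A's data as well.
            m.mutate("entity-from-caller-B");
        } catch (RuntimeException e) {
            System.out.println("writer of entity-B saw: " + e.getMessage());
        }
        System.out.println("persisted=" + m.persisted());
    }
}
```

Note that caller A never learns its entity was lost; only caller B observes the error.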

I believe my main concern, losing data due to intermittent connection issues combined with
this race condition, is only an issue if there is no spooling support.
Assuming most data/entities are not problematic, that is, a flush will not fail because of
the data itself and subsequent retries will eventually write the data successfully to HBase,
we can guarantee that all good entities will eventually be persisted in HBase.
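The retry assumption above can be sketched as a minimal retry loop (illustrative only; the names Flusher and flushWithRetry are hypothetical, not part of the timeline service): as long as failures are transient connection issues rather than bad data, retrying eventually persists every good entity.

```java
/** Minimal sketch of retry-on-transient-failure, assuming failures are
 *  connection issues rather than malformed data. */
public class RetryingFlush {
    interface Flusher { void flush() throws Exception; }

    static void flushWithRetry(Flusher f, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                f.flush();
                return;                  // good entities eventually persist
            } catch (Exception e) {
                last = e;                // assume transient: back off, retry
                Thread.sleep(10L * attempt);
            }
        }
        throw last;  // only persistently failing (malformed) data ends here
    }

    public static void main(String[] args) throws Exception {
        int[] failuresLeft = {2};        // simulate two transient failures
        flushWithRetry(() -> {
            if (failuresLeft[0]-- > 0) throw new Exception("transient");
            System.out.println("flushed");
        }, 5);
    }
}
```

Malformed entities break this guarantee, which is why they are deferred to a later revisit.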
Given that most of what b) solves will go away when we have the spooling writer, I agree that
we could just document the issue for now. Once we get the spooling writer, we can come back
and revisit this to address what we want to do with malformed/problematic entities if they
failed to be persisted.
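The spooling idea can be sketched roughly as follows (a hypothetical illustration, not the actual ATSv2 spooling writer design; all names here are made up): entities that cannot be flushed are appended to a local spool file and replayed once the backend is reachable again, so transient outages no longer lose data.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical sketch of a spool-to-local-disk fallback. */
public class SpoolingSketch {
    private final Path spool;

    SpoolingSketch(Path spool) { this.spool = spool; }

    /** Writes to the backend if it is up, otherwise spools locally. */
    void writeOrSpool(String entity, boolean backendUp) throws IOException {
        if (backendUp) {
            // normal path: hand off to the real writer
            System.out.println("persisted " + entity);
        } else {
            Files.write(spool, List.of(entity),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    /** Returns and clears any entities spooled while the backend was down. */
    List<String> drainSpool() throws IOException {
        if (!Files.exists(spool)) return List.of();
        List<String> pending = Files.readAllLines(spool);
        Files.delete(spool);
        return pending;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("spool", ".log");
        Files.delete(tmp);
        SpoolingSketch s = new SpoolingSketch(tmp);
        s.writeOrSpool("entity-1", false);   // backend down: spooled to disk
        for (String e : s.drainSpool()) {    // backend back up: replay spool
            s.writeOrSpool(e, true);
        }
    }
}
```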



> Address race condition on TimelineWriter.flush() caused by buffer-sized flush
> -----------------------------------------------------------------------------
>
>                 Key: YARN-6382
>                 URL: https://issues.apache.org/jira/browse/YARN-6382
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>              Labels: yarn-5355-merge-blocker
>
> YARN-6376 fixes the race condition between putEntities() and periodical flush() by
> WriterFlushThread in TimelineCollectorManager, or between putEntities() in different threads.
> However, BufferedMutator can have internal size-based flush as well. We need to address
> the resulting race condition.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


