Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 3 Apr 2017 23:55:41 +0000 (UTC)
From: "Haibo Chen (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.13058712.1490297950000.196217.1491263741881@Atlassian.JIRA>
In-Reply-To: <JIRA.13058712.1490297950000@Atlassian.JIRA>
References: <JIRA.13058712.1490297950000@Atlassian.JIRA> <JIRA.13058712.1490297950458@jira-lw-us.apache.org>
Subject: [jira] [Comment Edited] (YARN-6382) Address race condition on
 TimelineWriter.flush() caused by buffer-sized flush
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Mon, 03 Apr 2017 23:55:47 -0000


    [ https://issues.apache.org/jira/browse/YARN-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954358#comment-15954358 ] 

Haibo Chen edited comment on YARN-6382 at 4/3/17 11:55 PM:
-----------------------------------------------------------

Thanks for the nice summary [~jrottinghuis]! 
bq. This write causes the buffer to be full, or perhaps thread B calls flush, or a timer calls flush.
The latter two cases have been fixed by YARN-6357, so we only need to concern ourselves with the case where the buffer becomes full.
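
To make the remaining case concrete, here is a minimal, single-threaded sketch of a size-triggered flush. Everything in it is hypothetical (the class, the `caller` tags, the flush log); it is a toy stand-in and not HBase's BufferedMutator or the YARN writer code. It only illustrates that the write which fills the buffer also flushes data buffered earlier by other callers.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy shared write buffer with a size-based flush (hypothetical, for illustration only). */
class SizeTriggeredFlushSketch {
  private final int limit;
  private final List<String> buffer = new ArrayList<>();
  private final List<String> flushLog = new ArrayList<>();

  SizeTriggeredFlushSketch(int limit) {
    this.limit = limit;
  }

  /** Buffers one entity; flushes the whole buffer if this write fills it. */
  public synchronized void write(String caller, String entity) {
    buffer.add(entity);
    if (buffer.size() >= limit) {
      // The flush happens inside write(), not via an explicit flush() call.
      drain(caller + " (buffer full)");
    }
  }

  /** Explicit flush, e.g. from a timer or another thread. */
  public synchronized void flush(String caller) {
    drain(caller + " (explicit)");
  }

  private void drain(String trigger) {
    // Record who triggered the flush and which entities it carried.
    flushLog.add(trigger + " flushed " + buffer);
    buffer.clear();
  }

  public synchronized List<String> flushLog() {
    return new ArrayList<>(flushLog);
  }

  public synchronized int pending() {
    return buffer.size();
  }

  public static void main(String[] args) {
    SizeTriggeredFlushSketch writer = new SizeTriggeredFlushSketch(2);
    writer.write("A", "a1"); // buffered only
    writer.write("B", "b1"); // fills the buffer, so B's write flushes a1 as well
    writer.flushLog().forEach(System.out::println);
  }
}
```

In this sketch, caller B's write is the one that flushes caller A's entity, so a failure in that flush would surface to B even though the data came from A.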

I believe my main concern, losing data through the combination of intermittent connection issues and this race condition, is only a problem if there is no spooling support. 
Assuming most entities are not problematic (that is, a flush does not fail because of the data itself, and subsequent retries eventually write the data to HBase), we can give a strong enough guarantee that all good entities will eventually be persisted in HBase. 
Given that most of what b) solves will go away once we have the spooling writer, I agree that we could just document the issue for now. Once the spooling writer is in place, we can come back and decide what to do with malformed/problematic entities that fail to be persisted.
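
As a rough illustration of the spooling idea, the sketch below keeps every batch in an in-memory spool until the backend accepts it, and replays spooled batches in order on later attempts. The `Sink` interface, the class name, and the in-memory queue are all assumptions made for this example; the actual spooling writer had not been designed at the time of this comment and may look quite different.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy spooling writer (hypothetical): failed batches are kept and retried later. */
class SpoolingWriterSketch {
  /** Hypothetical backend; throwing simulates an intermittent connection issue. */
  public interface Sink {
    void persist(List<String> batch) throws Exception;
  }

  private final Sink sink;
  private final Deque<List<String>> spool = new ArrayDeque<>();

  SpoolingWriterSketch(Sink sink) {
    this.sink = sink;
  }

  /** Spools the batch first, then tries to drain the spool in order. */
  public synchronized void write(List<String> batch) {
    spool.addLast(new ArrayList<>(batch));
    retrySpooled();
  }

  /** Replays spooled batches oldest first; stops at the first failure. Returns what is left. */
  public synchronized int retrySpooled() {
    while (!spool.isEmpty()) {
      try {
        sink.persist(spool.peekFirst());
        spool.removeFirst(); // only dropped once the backend accepted it
      } catch (Exception e) {
        break; // keep the batch for a later retry
      }
    }
    return spool.size();
  }

  public synchronized int spooled() {
    return spool.size();
  }
}
```

Note that this sketch retries a failing batch forever, so a batch that fails because of its own contents would block everything behind it. That is the open question about malformed/problematic entities mentioned above.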


> Address race condition on TimelineWriter.flush() caused by buffer-sized flush
> -----------------------------------------------------------------------------
>
>                 Key: YARN-6382
>                 URL: https://issues.apache.org/jira/browse/YARN-6382
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>              Labels: yarn-5355-merge-blocker
>
> YARN-6376 fixes the race condition between putEntities() and periodical flush() by WriterFlushThread in TimelineCollectorManager, or between putEntities() in different threads.
> However, BufferedMutator can have internal size-based flush as well. We need to address the resulting race condition.

