Date: Mon, 3 Apr 2017 23:55:41 +0000 (UTC)
From: "Haibo Chen (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (YARN-6382) Address race condition on TimelineWriter.flush() caused by buffer-sized flush

    [ https://issues.apache.org/jira/browse/YARN-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954358#comment-15954358 ]

Haibo Chen edited comment on YARN-6382 at 4/3/17 11:55 PM:
-----------------------------------------------------------

Thanks for the nice summary [~jrottinghuis]!

bq. This write causes the buffer to be full, or perhaps thread B calls flush, or a timer calls flush.

The latter two cases have been fixed by YARN-6357, so we only need to concern ourselves with the case where the buffer becomes full.

I believe that what I was mostly concerned about, losing data due to intermittent connection issues and this race condition, is only an issue if there is no spooling support. Assuming most data/entities are not problematic, that is, a flush will not fail because of the data itself and subsequent retries will eventually write the data successfully to HBase, we can provide enough of a guarantee that all good entities will eventually be persisted in HBase.

Given that most of what b) solves will go away when we have the spooling writer, I agree that we could just document the issue for now. Once we get the spooling writer, we can come back and revisit this to decide what we want to do with malformed/problematic entities that fail to be persisted.

was (Author: haibochen):

Thanks for the nice summary [~jrottinghuis]!

bq. This write causes the buffer to be full, or perhaps thread B calls flush, or a timer calls flush.

The latter two cases have been fixed by YARN-6357, so we only need to concern ourselves with the case where the buffer becomes full.

I believe that what I was mostly concerned about, losing data due to intermittent connection issues and this race condition, is only an issue if there is no spooling support. Assuming most data/entities are not problematic, that is, a flush will not fail because of the data itself and subsequent retries will eventually write the data successfully to HBase, we can provide enough of a guarantee that all good entities will eventually be persisted in HBase.

Given that most of what b) solves will go away when we have the spooling writer, I agree that we could just document the issue for now. Once we get the spooling writer, we can come back and revisit this to decide what we want to do with malformed/problematic entities.

> Address race condition on TimelineWriter.flush() caused by buffer-sized flush
> -----------------------------------------------------------------------------
>
>                 Key: YARN-6382
>                 URL: https://issues.apache.org/jira/browse/YARN-6382
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>              Labels: yarn-5355-merge-blocker
>
> YARN-6376 fixes the race condition between putEntities() and the periodic flush() by WriterFlushThread in TimelineCollectorManager, or between putEntities() calls in different threads.
> However, BufferedMutator can also perform an internal size-based flush.
> We need to address the resulting race condition.
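
For illustration only, here is a minimal single-threaded Java sketch of how a size-based flush might attribute a failure to the wrong caller. ToyBufferedWriter, its capacity parameter, the replay() helper, and the "bad-" naming convention are all hypothetical stand-ins; this is not the HBase BufferedMutator API or the TimelineWriter code. It assumes that the flush triggered by a full buffer runs inside the put() call that filled it.

```java
import java.util.ArrayList;
import java.util.List;

public class SizeBasedFlushSketch {

    /** Hypothetical toy writer; NOT the HBase BufferedMutator API. */
    static class ToyBufferedWriter {
        private final int capacity;
        private final List<String> buffer = new ArrayList<>();
        final List<String> persisted = new ArrayList<>();

        ToyBufferedWriter(int capacity) {
            this.capacity = capacity;
        }

        /** Buffers an entity; flushes inside this call once the buffer is full. */
        synchronized void put(String entity) {
            buffer.add(entity);
            if (buffer.size() >= capacity) {
                flush(); // size-based flush, runs on behalf of whoever filled the buffer
            }
        }

        /** Writes out everything buffered; names starting with "bad-" simulate a failed write. */
        synchronized void flush() {
            List<String> batch = new ArrayList<>(buffer);
            buffer.clear();
            for (String entity : batch) {
                if (entity.startsWith("bad-")) {
                    throw new IllegalStateException("could not persist " + entity);
                }
                persisted.add(entity);
            }
        }
    }

    /** Replays one interleaving: caller A buffers a bad entity, then caller B fills the buffer. */
    public static String replay() {
        ToyBufferedWriter writer = new ToyBufferedWriter(2);
        StringBuilder log = new StringBuilder();
        try {
            writer.put("bad-from-A"); // buffer not full yet, so no flush and no error
            log.append("A: ok; ");
        } catch (IllegalStateException e) {
            log.append("A: ").append(e.getMessage()).append("; ");
        }
        try {
            writer.put("good-from-B"); // fills the buffer and triggers the flush
            log.append("B: ok");
        } catch (IllegalStateException e) {
            log.append("B: ").append(e.getMessage());
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints "A: ok; B: could not persist bad-from-A"
    }
}
```

In this replay, caller A's put() reports success while caller B's put() receives the error for A's entity. The toy also drops the whole batch on failure; that is a simplification of this sketch, not a statement about how the real writer behaves.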