From mapreduce-issues-return-91190-archive-asf-public=cust-asf.ponee.io@hadoop.apache.org Wed Feb 21 01:50:04 2018
Date: Wed, 21 Feb 2018 00:50:00 +0000 (UTC)
From: "BELUGA BEHR (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Updated] (MAPREDUCE-7057) MergeThread Review
Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm

     [ https://issues.apache.org/jira/browse/MAPREDUCE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated MAPREDUCE-7057:
-----------------------------------
    Status: Patch Available  (was: Open)

> MergeThread Review
> ------------------
>
>                 Key: MAPREDUCE-7057
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7057
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 3.0.0
>            Reporter: BELUGA BEHR
>            Priority: Minor
>         Attachments: MAPREDUCE-7057.1.patch, MAPREDUCE-7057.2.patch
>
>
> Source:
> [MergeThread.java|https://github.com/apache/hadoop/blob/178751ed8c9d47038acf8616c226f1f52e884feb/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java]
> Update this class to use the Java 1.8 concurrent package. There are also some corner cases that the current implementation does not address:
> {code:java|title=MergeThread.java}
> // There is a scenario here where N threads have submitted inputs and are all waiting for the 'pendingToBeMerged' object. At this point, imagine the 'close' method is called. The close method will run, see nothing in the queue, interrupt the processing thread, and cause it to exit.
> // Afterwards, the 'startMerge' threads will all be triggered and add the inputs to a queue for which there is no consumer. At this point, the T items have been removed from the inputs with no way to recover them. In practice, this may never happen, but it can be tightened up.
> public void startMerge(Set<T> inputs) {
>   if (!closed) {
>     numPending.incrementAndGet();
>     List<T> toMergeInputs = new ArrayList<T>();
>     Iterator<T> iter=inputs.iterator();
>     for (int ctr = 0; iter.hasNext() && ctr < mergeFactor; ++ctr) {
>       toMergeInputs.add(iter.next());
>       iter.remove();
>     }
>     LOG.info(getName() + ": Starting merge with " + toMergeInputs.size() +
>              " segments, while ignoring " + inputs.size() + " segments");
>     synchronized(pendingToBeMerged) {
>       pendingToBeMerged.addLast(toMergeInputs);
>       pendingToBeMerged.notifyAll();
>     }
>   }
> }
> public synchronized void close() throws InterruptedException {
>   closed = true;
>   waitForMerge();
>   interrupt();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
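
The race described in the comment above comes from checking 'closed' and enqueueing the batch as two separate steps. The following is a minimal sketch of one way to tighten that up with java.util.concurrent; it is not the content of the attached patches. The class name MergeQueueSketch, the placeholder merge() method, and the mergedBatches() accessor are hypothetical and exist only for illustration. The closed check and the hand-off to the worker happen under the same lock that close() takes, so a batch is either accepted before shutdown (and still processed) or rejected with the inputs left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for MergeThread; only the queueing logic is modeled.
class MergeQueueSketch<T> {
    // A single worker thread replaces the hand-rolled wait/notify consumer loop.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicInteger mergedBatches = new AtomicInteger();
    private final int mergeFactor;
    private boolean closed; // guarded by 'this'

    MergeQueueSketch(int mergeFactor) {
        this.mergeFactor = mergeFactor;
    }

    // Returns true if a batch was removed from 'inputs' and queued;
    // returns false and leaves 'inputs' untouched once close() has run.
    synchronized boolean startMerge(Set<T> inputs) {
        if (closed) {
            return false;
        }
        List<T> batch = new ArrayList<>();
        Iterator<T> iter = inputs.iterator();
        for (int ctr = 0; iter.hasNext() && ctr < mergeFactor; ++ctr) {
            batch.add(iter.next());
            iter.remove();
        }
        // Submitted while holding the lock, so shutdown() cannot run in between.
        worker.execute(() -> merge(batch));
        return true;
    }

    // Placeholder for the real merge work (an assumption of this sketch).
    private void merge(List<T> batch) {
        mergedBatches.incrementAndGet();
    }

    void close() throws InterruptedException {
        synchronized (this) {
            closed = true;
            worker.shutdown(); // batches queued earlier are still executed
        }
        // Wait outside the lock so late startMerge() callers fail fast.
        worker.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }

    int mergedBatches() {
        return mergedBatches.get();
    }

    public static void main(String[] args) throws InterruptedException {
        MergeQueueSketch<String> queue = new MergeQueueSketch<>(2);
        Set<String> inputs = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
        System.out.println("accepted before close: " + queue.startMerge(inputs));
        queue.close();
        System.out.println("accepted after close: " + queue.startMerge(inputs));
        System.out.println("inputs left: " + inputs + ", batches merged: " + queue.mergedBatches());
    }
}
```

With a merge factor of 2 and three inputs, the first call takes two items and leaves one behind. After close() returns, the queued batch has been processed, and a later startMerge() call returns false without removing anything from the set, which is the recovery property the comment asks for.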