From dev-return-1871-archive-asf-public=cust-asf.ponee.io@mxnet.incubator.apache.org Fri Jan 12 04:36:09 2018 Return-Path: X-Original-To: archive-asf-public@eu.ponee.io Delivered-To: archive-asf-public@eu.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by mx-eu-01.ponee.io (Postfix) with ESMTP id 95EBF180656 for ; Fri, 12 Jan 2018 04:36:09 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 8566C160C41; Fri, 12 Jan 2018 03:36:09 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A60A0160C13 for ; Fri, 12 Jan 2018 04:36:08 +0100 (CET) Received: (qmail 38057 invoked by uid 500); 12 Jan 2018 03:36:06 -0000 Mailing-List: contact dev-help@mxnet.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@mxnet.incubator.apache.org Delivered-To: mailing list dev@mxnet.incubator.apache.org Received: (qmail 38045 invoked by uid 99); 12 Jan 2018 03:36:06 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Jan 2018 03:36:06 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 0A73EC3041 for ; Fri, 12 Jan 2018 03:36:06 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.129 X-Spam-Level: ** X-Spam-Status: No, score=2.129 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id B00sQROrGhA3 for ; Fri, 12 Jan 2018 03:36:04 +0000 (UTC) Received: from mail-it0-f46.google.com (mail-it0-f46.google.com [209.85.214.46]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 7C8AD5F3CC for ; Fri, 12 Jan 2018 03:36:04 +0000 (UTC) Received: by mail-it0-f46.google.com with SMTP id m11so444663iti.1 for ; Thu, 11 Jan 2018 19:36:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=qoYUJ3llIVm006zK2CiOerQ8t12mzyZvvbXJtzCL0FM=; b=uoUpbSGNxONd9s2N6b5MbugfMhVRlJaCLZsgWquNkozwTNlO28KbOJoJh32f7zIk8F prHC8+9s0uN+SahRBvRClWS+SV6a+3T64oFGIcMfVT0msvIEHNQR6LrjId1HblFZyWdM uvRZTpCIMeTlnf7cHBV3fNgyxEd2d1ejOWGuxC/L/Bvtkr7lPkgA28Rbrcc8aGVHdzlI 0owvVESZwunEKPkjAeQBm2gkWaYB3ZtBlNN9WUUQ1QGghR6cC+l6o5h04N8hju9y+E56 LmAySh/0fbkV3UsmQRX0kMyt2+womT8T4Haa9wgxKcGmuU61asY5yx15YTVJdPi3i+SX 3oAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=qoYUJ3llIVm006zK2CiOerQ8t12mzyZvvbXJtzCL0FM=; b=ORG5KZrtmpzrnSL1u4JBAWU6zpP2UHPe2U6q1P/O19I9DnhEq9XVVUFPIUed+Xjs3k 8+BC6DziSxlarR0tPpquK7q2LX9f6HHCNNbnuAfXj/J6Juo5UBYT04tqMjYuw44x4xhe p56UySo1lqJolVBMeVJu7Y9qLaDI6KrU7AICeeSxa+nCKWbJav6emJ5uI/fvsVZJIXS2 KBKNq7opQRjNf8lhws52uuSDx8jLqC7EgHg8jEw0flYSyHoFT3QJCWr1aChLmSVy+qsi pSWJ4NTArlp4jxSjmrIGpYOCS3gvWK+kH4wymfCImofsHVhdewWHhbHF/ZwN2MZ3EKPJ 6Vkg== X-Gm-Message-State: AKwxyteyvlrBVt7nIrlrD8iElWkxpktj2Azj30yVyRTn4Gnxp8B9iUzc UlQoi1U9iBJeIdtnEAKN9W7RfFrIKqfAG2WMFrg= X-Google-Smtp-Source: ACJfBoudj6UaQTkhddT5lI6zZc6gFASkE9h/SVOFbmPBsaXC/Ee/3lqRgBPiPPgExRhse44VKi77hE6Jjx4IhlwHz44= X-Received: by 10.36.11.85 with SMTP id 82mr3945797itd.143.1515728157623; Thu, 11 Jan 2018 19:35:57 -0800 (PST) MIME-Version: 1.0 Received: by 10.107.164.79 with HTTP; Thu, 11 Jan 2018 19:35:57 -0800 (PST) In-Reply-To: References: From: Chris Olivier Date: Thu, 11 Jan 2018 19:35:57 -0800 Message-ID: Subject: Re: Reduce 99% of your memory leaks with this simple trick! To: dev@mxnet.incubator.apache.org Content-Type: multipart/alternative; boundary="001a1140c3ee694a0005628bf64b" --001a1140c3ee694a0005628bf64b Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable I think there's tons of books on "best practices" already, so I wouldn't want to trouble you :) Are we running coverity static analysis? It catches those kinds of things. On Thu, Jan 11, 2018 at 7:04 PM, Bhavin Thaker wrote: > Would it make sense to have a developer best practices section on the > Apache wiki where such guidance can be documented for future reference? > > Bhavin Thaker. > > On Thu, Jan 11, 2018 at 9:56 AM Anirudh wrote: > > > Hi, > > > > > > I have been thinking about exception handling specifically inside spawn= ed > > threads. > > > > As Tianqi mentioned, there is already a mechanism with LOG(FATAL) or > CHECK > > for exception handling inside main > > > > Thread. For exception handling inside spawned threads I see two places: > > iterators and operators. > > > > > > > > For iterators, we can use exception_ptr to transport the exceptions fro= m > > child thread to the main thread. > > > > This can be implemented in the threadediter class template. Since > > PrefetchingIter is used by most iterators in MXNet, > > > > and this uses threadediter, we should be able to cover most of our use > > cases. > > > > > > > > For operators, I was thinking that we can transport the exception down > the > > dependency path. > > > > For example, when an exception is caught inside ExecuteOprBlock for a > > single operator, > > > > We store the exception_ptr in the operator. We then propagate the > > exception_ptr down to all the vars that the > > > > Operator writes to. Similarly, if an operator=E2=80=99s read vars has > exception_ptr > > attached to it, we propagate it down to the operator itself. > > > > > > > > We can then check if the var has an associated exception_ptr in > > wait_to_read. > > > > One problem I see with the approach is that even if an operator fails w= e > > may need to run subsequent operators. One way to avoid this > > > > Would be an onstart callback, which would mark the operator to not > execute > > if any of the read vars have an exception_ptr attached to it. > > > > > > > > Anirudh > > > > On Thu, Jan 11, 2018 at 9:02 AM, Tianqi Chen > > wrote: > > > > > I am all for RAII when possible in most of the code. The only reason > some > > > of the raw ptr occur in dmlc codebase was legacy-issue, and that can = be > > > resolved by wrapping returning ptr via unique_ptr or shared_ptr. One > > > notable property of RAII is exception safe, which makes the code hand= le > > > resource correctly when it throws in the middle. There are cases wher= e > > > memory allocation needs to be explicitly handled(e.g. GPU memory > > > management) and reused where we need to do explicit management when > > needed. > > > > > > > > > As for exception handling, we do have a mechanism for handling > > exceptions. > > > when you do LOG(FATAL) or CHECK is caught at the C API boundary, whic= h > > > translates to return code -1 and an error is thrown on the python > side. > > > Throwing exception from another thread is a more tricky thing, which > > > involves catching them in the engine, and usually, the state is not > > correct > > > in such case. But most of the cases when we need exception handling a= re > > the > > > simple case of opening a file and use CHECK should suffice. > > > > > > A better approach might be defining a new macro for errors intended t= o > > > throw to a user and handled correctly, something like DMLC_EXPECT. Bu= t > I > > > find it might be a burden to put developer to distinguish what should > be > > a > > > user error and a normal check, so we just use CHECK for now > > > > > > Tianqi > > > > > > On Thu, Jan 11, 2018 at 3:09 AM, Pedro Larroy < > > > pedro.larroy.lists@gmail.com> > > > wrote: > > > > > > > Hi > > > > > > > > I would like to encourage contributors to use RAII idioms in C++ > > > > whenever possible to avoid resource leaks. > > > > > > > > RAII is an ugly acronym that stands for Resource Acquisition Is > > > > Initialization, which basically means that you should almost never > use > > > > explicit new and delete operators and instead use std::make_shared, > > > > std::make_unique and std::vector and .data() for raw > > > > buffers. Also always allocating OS resources in constructors > releasing > > > > them in destructors such as file descriptors. > > > > > > > > Asides from forgetting to call delete on an allocation, explicit > > > > deletes are bad because an exception thrown in the middle prevents > > > > delete from running entirely. > > > > > > > > This helps a lot writing correct, secure and exception safe code > > > > without memory leaks. > > > > > > > > Another problem that I think is worth a discussion, is how to handl= e > > > > exceptions and errors. Right now, I don't think there's a good way = to > > > > throw an exception in some functions without crashing the python > > > > interpreter. I think we should come with a smart way to propagate > > > > exceptions from the library up to the user runtime (python, scala..= .) > > > > > > > > As an example of what I'm talking about is this suspicious code tha= t > I > > > > saw in a PR, which has several bugs in a few lines of code related = to > > > > what I'm discussing in this thread, crashing Python when trying to > > > > open a file that doesn't exist. (How to propagate an exception in > this > > > > case?) > > > > > > > > https://github.com/apache/incubator-mxnet/pull/9370/files > > > > > > > > Please excuse the clickbait subject, just trying to grab your > > > > attention in a humorous way now that the weekend is approaching. > > > > > > > > Pedro. > > > > > > > > > > --001a1140c3ee694a0005628bf64b--